Driver for Mellanox Connect-X 4/5/6 cards #1461
This function can be useful for resetting a device that has persistent state, for example the firmware state on a Mellanox ConnectX-4 device.
This is useful because some differences that are subtle when comparing source code are obvious when comparing hexdumps. If the card does not respond to a command the way we expect then we can check what we are doing differently to the Linux driver.
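To illustrate why a hexdump comparison makes subtle differences obvious, here is a minimal Python sketch. It is not part of the driver (which is written in Lua), and the function names and sample bytes are hypothetical. It renders two command buffers as hexdump rows and reports only the rows that differ.

```python
def hexdump(data, width=16):
    """Render bytes as 'offset: hex bytes' rows, `width` bytes per row."""
    rows = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        rows.append(f"{offset:04x}: " + " ".join(f"{b:02x}" for b in chunk))
    return rows


def diff_hexdumps(ours, reference, width=16):
    """Return (ours, reference) row pairs for every row that differs."""
    a, b = hexdump(ours, width), hexdump(reference, width)
    # Pad the shorter dump so that trailing rows also show up as differences.
    length = max(len(a), len(b))
    a += ["<missing>"] * (length - len(a))
    b += ["<missing>"] * (length - len(b))
    return [(x, y) for x, y in zip(a, b) if x != y]


# Hypothetical example: two 32-byte command entries that differ in one byte.
ours = bytes(32)
reference = bytes(17) + b"\x01" + bytes(14)
for mine, theirs in diff_hexdumps(ours, reference):
    print("ours:     ", mine)
    print("reference:", theirs)
```

A single differing byte shows up as one mismatched row, which is much easier to spot than the corresponding difference between two driver code bases.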
Commands are now executed successfully on the card. The full initialization procedure is not in place yet: support for commands that span multiple input/output pages still needs to be implemented. The currently expected behavior when running the selftest is to successfully execute the commands ENABLE_HCA, QUERY_ISSI, QUERY_PAGES, MANAGE_PAGES, and then to fail in QUERY_HCA_CAP (likely because it has multipage output).
The physical address of DMA memory can be determined at runtime (cheaply and reliably) using memory.virtual_to_physical(). Now we do this whenever we need a physical address rather than caching the value returned by memory.dma_alloc(). This means there is less state to keep track of in our data structures.
Now it is possible to request specific alignment for DMA memory. This is practical. For example, Mellanox ConnectX-4 requires specific alignments (e.g. 4KB).
Alignment was already checked with an assertion but this would not necessarily succeed.
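The arithmetic behind an alignment request can be sketched as follows. This is a generic Python illustration of rounding an address up to a power-of-two boundary, not the code in memory.dma_alloc(); the helper names are hypothetical.

```python
def align_up(address, alignment):
    """Round `address` up to the next multiple of `alignment` (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (address + alignment - 1) & ~(alignment - 1)


def is_aligned(address, alignment):
    """True if `address` is a multiple of `alignment` (a power of two)."""
    return address & (alignment - 1) == 0


# Example with the 4KB alignment mentioned above.
print(hex(align_up(0x1f000001, 4096)))
print(is_aligned(0x1f000000, 4096))
```

Allocating with the alignment built in means a later assertion such as is_aligned() holds by construction, instead of depending on where the allocator happened to place the buffer.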
Command inputs and outputs are now split into multiple chained mailbox records that each hold up to 512 bytes of data. This is mandatory for large messages.
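The chaining can be pictured with a small Python sketch. Only the 512-byte payload size comes from the text above; the record layout (a block number, zero-padded data, and a `next` link stored as an index) is a simplifying assumption, since the real mailbox format is defined by the PRM and links records by physical address.

```python
MAILBOX_DATA_SIZE = 512  # bytes of payload per mailbox record


def split_into_mailboxes(payload):
    """Split `payload` into a chain of records of up to 512 data bytes each."""
    records = []
    for offset in range(0, len(payload), MAILBOX_DATA_SIZE):
        chunk = payload[offset:offset + MAILBOX_DATA_SIZE]
        records.append({
            "block_number": len(records),
            # Zero-pad the final chunk so every record has the same size.
            "data": chunk.ljust(MAILBOX_DATA_SIZE, b"\x00"),
            "next": None,  # hypothetical link; filled in below
        })
    # Link each record to its successor; the last record keeps next = None.
    for current, following in zip(records, records[1:]):
        current["next"] = following["block_number"]
    return records


chain = split_into_mailboxes(bytes(range(256)) * 5)  # 1280 bytes
print(len(chain), [record["next"] for record in chain])
```

Reassembling a multi-record output is the reverse walk: follow the links and concatenate the data, trimming to the expected length.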
More work may be needed to interpret the result correctly.
Partial implementation of the initialization procedure.
Complete debug messages have become a little overwhelming now that we are allocating thousands of pages of memory for the adapter. Just for the moment disabling the hexdumps is the more sensible default. More fine-grained debug logging is likely needed.
Refactored the error checking to always be done when posting a command to the command queue. Previously this was a manual step for each command and that seems more error prone.
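The idea of checking errors in one place can be sketched like this. It is a hypothetical Python illustration, not the driver's code: `post`, `fake_device`, and the status and syndrome values are invented for the example.

```python
class CommandError(Exception):
    """Raised when a command completes with a non-zero status."""


def post(execute, opcode, payload=b""):
    """Run one command and check its status before returning the output.

    `execute` is a hypothetical callable standing in for the command queue;
    it returns a (status, syndrome, output) tuple.
    """
    status, syndrome, output = execute(opcode, payload)
    if status != 0:
        raise CommandError(
            f"{opcode} failed: status={status:#x} syndrome={syndrome:#x}")
    return output


def fake_device(opcode, payload):
    """Stand-in device: every command succeeds except QUERY_HCA_CAP."""
    if opcode == "QUERY_HCA_CAP":
        return 0x3, 0x1234, b""  # made-up failure values
    return 0x0, 0x0, b"ok"


print(post(fake_device, "ENABLE_HCA"))
try:
    post(fake_device, "QUERY_HCA_CAP")
except CommandError as error:
    print(error)
```

Because every command goes through the same function, a caller cannot forget the check, which is the failure mode of doing it manually per command.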
Now successfully:
- Providing boot memory to the adapter (6 pages)
- Querying adapter capabilities (current and maximum)
- Setting adapter capabilities (keep current)
- Providing init memory to the adapter (4232 pages !)

The output from the init sequence looks like this:

    TRACE Read the initialization segment
    TRACE Write the physical location of the command queues to the init segment.
    TRACE Wait for the 'initializing' field to clear
    fw_rev 14 12 1220
    cmd_interface_rev 5
    cmdq_phy_addr cdata<void *>: 0x1f000000
    log_cmdq_size 5
    log_cmdq_stride 6
    ready true
    nic_interface_supported true
    internal_timer 2.0108995831647e+14
    health_syndrome 0
    Command: ENABLE_HCA
    Command: QUERY_ISSI
    cur_issi = 0
    sup_issi = 01
    Command: QUERY_PAGES
    query_pages'boot' 6
    Command: MANAGE_PAGES
    Command: QUERY_HCA_CAP
    Command: QUERY_HCA_CAP
    Capabilities - current and (maximum):
    eth_net_offloads = 0 (0)
    end_pad = 1 (1)
    cq_eq_remap = 1 (1)
    device_frequency_mhz = 275 (275)
    log_max_vlan_list = 12 (12)
    log_min_stride_sz_rq = 0 (0)
    log_max_klm_list_size = 16 (16)
    log_max_rqt = 0 (0)
    log_max_l2_table = 16 (16)
    log_max_current_uc_list = 10 (10)
    log_min_stride_sz_sq = 0 (0)
    log_uar_page_sz = 0 (8)
    log_max_wq_sz = 0 (0)
    log_max_current_mc_list = 14 (14)
    log_max_msg = 30 (30)
    log_max_stride_sz_rq = 0 (0)
    max_flow_counter = 0 (0)
    log_max_eq_sz = 22 (22)
    log_max_rqt_size = 0 (0)
    basic_cyclic_rcv_wqe = 0 (0)
    cache_line_128byte = 0 (0)
    max_tc = 0 (0)
    cmdif_checksum = 0 (3)
    driver_version = 0 (0)
    log_max_tis = 0 (0)
    port_type = 1 (1)
    wq_signature = 1 (1)
    log_max_tir = 0 (0)
    max_indirection = 4 (4)
    log_max_rq = 0 (0)
    cq_resize = 1 (1)
    cq_oi = 1 (1)
    cq_moderation = 1 (1)
    log_max_pd = 24 (24)
    log_max_mkey = 24 (24)
    log_max_transport_domain = 0 (0)
    rc = 1 (1)
    num_ports = 1 (1)
    bf = 1 (1)
    vport_counters = 1 (1)
    log_max_eq = 8 (8)
    pad_tx_eth_packet = 0 (0)
    log_pg_sz = 12 (12)
    uar_sz = 5 (5)
    cq_period_start_from_cqe = 1 (1)
    uc = 1 (1)
    log_max_mrw_sz = 64 (64)
    log_max_cq = 24 (24)
    vport_group_manager = 1 (1)
    log_max_tis_per_sq = 0 (0)
    start_pad = 0 (0)
    log_max_cq_sz = 22 (22)
    nic_flow_table = 0 (0)
    scqe_break_moderation = 1 (1)
    ud = 1 (1)
    log_max_sq = 0 (0)
    cqe_version = 0 (0)
    log_bf_reg_size = 9 (9)
    sctr_data_cqe = 1 (1)
    log_max_rmp = 0 (0)
    cqe_version = 0 (0)
    log_bf_reg_size = 9 (9)
    sctr_data_cqe = 1 (1)
    log_max_rmp = 0 (0)
    log_max_stride_sz_sq = 0 (0)
    imaicl = 0 (0)
    xrc = 1 (1)
    Command: SET_HCA_CAP
    Command: QUERY_PAGES
    query_pages'init' 4232
    Command: MANAGE_PAGES
    Command: INIT_HCA
Pulls in more of the initialization procedure. See especially commit 4575dc7.
Added QUERY_VPORT_STATE, MODIFY_VPORT_STATE, QUERY_NIC_VPORT_CONTEXT. Note: I am not sure that these commands are actually needed since we are not using SR-IOV. The PRM mandates using some VPORT commands but I don't see them in the trace from the Linux mlx5 driver. So we may be able to remove this code.
Completed Mellanox initialization sequence
Required argument for new code merged from master in v2016.06. Request exclusive lock on the device.
This commit introduces a clean and working version of the device initialization.
This reverts commit 6a357a7.
…nsmit This fixes a bug where physical addresses of payloads wider than 53 bits are truncated when they are inserted into descriptors for DMA. The fix here is to truncate after masking. It would probably be better to use lib.htonl instead of bswap(tonumber(...)) throughout the driver.
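The 53-bit limit can be reproduced in a few lines of Python, assuming the truncation comes from converting a 64-bit integer to a double-precision number (as tonumber() does for 64-bit values in LuaJIT) before splitting it. The function names and the sample address are hypothetical, and this only illustrates the ordering issue, not the driver's actual fix.

```python
def split_address_lossy(address):
    """Convert to a double first, then split: low bits are lost above 2**53."""
    as_int = int(float(address))  # a double holds only 53 bits of integer precision
    return (as_int >> 32) & 0xFFFFFFFF, as_int & 0xFFFFFFFF


def split_address_exact(address):
    """Mask in integer arithmetic first, so each half fits in 32 bits exactly."""
    return (address >> 32) & 0xFFFFFFFF, address & 0xFFFFFFFF


# Hypothetical 61-bit address whose lowest bit does not survive a double.
address = (1 << 60) | 0x1f000001
print([hex(half) for half in split_address_lossy(address)])
print([hex(half) for half in split_address_exact(address)])
```

Once each half has been masked down to 32 bits, converting it to a double is harmless, because every 32-bit integer is exactly representable.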
This reverts commit d20c2fa.
…ckets) Merge remote-tracking branch 'alexandergall/mellanox' into mellanox-lwaftr # Conflicts: # src/apps/mellanox/connectx.lua
# Conflicts: # src/apps/mellanox/connectx.lua
apps.connectx: use lib.macaddress instead of ptoi
Great going Max :)
Seems that the driver also works for at least some Connect-X 6 cards, which is cool. We should extend the tests a bit more before merging, and I think a blocker that remains is proper support for
Added support for I noticed that local-loopback between queues is currently not implemented. I tend towards leaving that as a TBD. (Couldn’t quite figure out how to enable it from a quick survey of the PRM.)
Finally a PR to include all the hard work by @capr @lukego @alexandergall and myself in the next release (TBA)!
This work was made possible by Mellanox releasing a public Programmer’s Reference Manual (PRM) for their Connect-X 4 cards!
This driver can operate Mellanox Connect-X 4 and 5 cards, and has been tested at SWITCH and in lwAFTR.
It supports RSS and MAC+VLAN switching (and combinations thereof). VLAN insertion and stripping are currently not supported (use apps.vlan). There is a branch eugeneia/snabb@mellanox-2021...eugeneia:mellanox-2021-vlan-strip-insert that implements VLAN stripping and insertion in the driver, but I’m not sure if the extra code is worth its weight or if it’s cleaner to delegate to apps.vlan instead. Opinions welcome!