
SR-IOV broken for ConnectX-4 in ethernet mode since UEK7.2

Open sdjerdj opened this issue 2 years ago • 8 comments

Description: Since UEK7u2, SR-IOV is broken for Mellanox ConnectX-4 if one or both ports are configured for ethernet mode. The same setup works fine if the ports are configured for IB mode. Additionally, everything works on UEK7u1 regardless of the configuration.

Diagnostic info: Port 0 is in IB mode; Port 1 is in ETH mode.

Output of lspci:

lspci |grep Mellanox

0b:00.0 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4]
0b:00.1 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4]
0b:00.2 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0b:00.3 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0b:00.4 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0b:00.5 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0b:00.6 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0b:00.7 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]

Output from dmesg:

dmesg |grep mlx

[ 1.349024] mlx5_core 0000:0b:00.0: firmware version: 12.28.2006
[ 1.349050] mlx5_core 0000:0b:00.0: 63.008 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x8 link)
[ 4.071873] mlx5_core 0000:0b:00.0: Port module event: module 0, Cable plugged
[ 4.265254] mlx5_core 0000:0b:00.1: firmware version: 12.28.2006
[ 4.265304] mlx5_core 0000:0b:00.1: 63.008 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x8 link)
[ 4.683438] mlx5_core 0000:0b:00.1: E-Switch: Total vports 10, per vport: max uc(1024) max mc(16384)
[ 4.687324] mlx5_core 0000:0b:00.1: Port module event: module 1, Cable plugged
[ 4.932149] mlx5_core 0000:0b:00.1: Supported tc offload range - chains: 4294967294, prios: 4294967295
[ 4.941115] mlx5_core 0000:0b:00.1: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0 basic)
[ 4.943224] mlx5_core 0000:0b:00.1 enp11s0f1np1: renamed from eth0
[ 5.655608] mlx5_core 0000:0b:00.1: Successfully registered panic handler for port 1
[ 5.773746] mlx5_core 0000:0b:00.1: mlx5_cmd_out_err:803:(pid 814): QUERY_HCA_CAP(0x100) op_mod(0x40) failed, status bad parameter(0x3), syndrome (0x5add95), err(-22)
[ 5.787044] mlx5_core 0000:0b:00.1: mlx5_device_enable_sriov:82:(pid 814): failed to enable eswitch SRIOV (-22)
[ 5.787047] mlx5_core 0000:0b:00.1: mlx5_sriov_enable:168:(pid 814): mlx5_device_enable_sriov failed : -22
[ 5.787222] mlx5_core 0000:0b:00.1 mlxen0: renamed from enp11s0f1np1
[ 6.213460] mlx5_core 0000:0b:00.0 mlxib0: renamed from ib0
[ 7.054459] mlx5_core 0000:0b:00.1 mlxen0: Link up
[ 7.057046] 8021q: adding VLAN 0 to HW filter on device mlxen0
[ 8.055388] IPv6: ADDRCONF(NETDEV_CHANGE): mlxen0: link becomes ready
[ 8.284101] IPv6: ADDRCONF(NETDEV_CHANGE): mlxib0: link becomes ready
[ 8.557689] IPv6: ADDRCONF(NETDEV_CHANGE): mlxib0: link becomes ready
[ 8.580025] br1: port 1(mlxen0) entered blocking state
[ 8.580028] br1: port 1(mlxen0) entered disabled state
[ 8.580067] device mlxen0 entered promiscuous mode
[ 8.580957] br1: port 1(mlxen0) entered blocking state
[ 8.580960] br1: port 1(mlxen0) entered listening state
[ 8.595823] mlx5_core 0000:0b:00.1: mlx5e_fs_set_rx_mode_work:843:(pid 156): S-tagged traffic will be dropped while C-tag vlan stripping is enabled
[ 10.635172] br1: port 1(mlxen0) entered learning state
[ 25.931048] br1: port 1(mlxen0) entered forwarding state

Manually trying to add VFs to the ETH port results in the following error:

echo 0 > /sys/class/net/mlxen0/device/sriov_numvfs
echo 7 > /sys/class/net/mlxen0/device/sriov_numvfs
-bash: echo: write error: Invalid argument

The same works just fine for the IB port:

echo 0 > /sys/class/net/mlxib0/device/sriov_numvfs
echo 7 > /sys/class/net/mlxib0/device/sriov_numvfs
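The manual steps above can be wrapped in a small helper. This is only a minimal sketch, assuming the standard PCI sysfs attributes `sriov_numvfs` and `sriov_totalvfs`; the function name `set_vfs` and the `SYSFS_NET` override (there only so the function can be tried against a scratch directory) are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: reset and then set the number of VFs on a network interface.
# SYSFS_NET is a hypothetical override so the function can be tried
# against a scratch directory instead of the real /sys/class/net.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

set_vfs() {
    local ifname="$1" want="$2"
    local dev="$SYSFS_NET/$ifname/device"
    local total

    # sriov_totalvfs reports the maximum number of VFs the PF supports.
    if ! total=$(cat "$dev/sriov_totalvfs" 2>/dev/null); then
        echo "$ifname: no SR-IOV support found under $dev" >&2
        return 1
    fi
    if [ "$want" -gt "$total" ]; then
        echo "$ifname: requested $want VFs, device supports at most $total" >&2
        return 1
    fi

    # Write 0 first, as in the manual steps, before setting the new count.
    echo 0 > "$dev/sriov_numvfs" || return 1
    if ! echo "$want" > "$dev/sriov_numvfs"; then
        echo "$ifname: writing sriov_numvfs failed, check dmesg for mlx5_core errors" >&2
        return 1
    fi
    echo "$ifname: $(cat "$dev/sriov_numvfs") of $total VFs enabled"
}
```

Usage would be `set_vfs mlxen0 7` as root. On an affected kernel the second write is expected to fail with the same "Invalid argument" error shown above.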

sdjerdj avatar Nov 21 '23 03:11 sdjerdj

Could this be related? https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c?id=6496357aa5f710eec96f91345b9da1b37c3231f6

sdjerdj avatar Nov 21 '23 03:11 sdjerdj

I have confirmed that the same setup works properly with the latest RHCK kernel that comes with OL9.3 (5.14.0-362.8.1.el9_3.x86_64).

sdjerdj avatar Nov 23 '23 05:11 sdjerdj

I have re-tested this with 5.15.0-202.135.2.el9uek.x86_64 and the issue is still present. lspci shows VFs for the IB port but none for the ETH port:

lspci

0e:00.0 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4]
0e:00.1 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4]
0e:00.2 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:00.3 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:00.4 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:00.5 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:00.6 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:00.7 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:01.0 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:01.1 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:01.2 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:01.3 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:01.4 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:01.5 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:01.6 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:01.7 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
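Counting the VFs per port type by eye gets tedious with listings like the one above. A rough sketch of how the count could be scripted follows; the function name `count_vfs` is made up for illustration, and it assumes the lspci line layout shown in this thread (controller type in the second field).

```shell
# Sketch: count Mellanox Virtual Functions per controller type.
# Reads lspci output on stdin, e.g.:  lspci | count_vfs
count_vfs() {
    awk '
        /Mellanox/ && /Virtual Function/ {
            # The second field is the controller type in default lspci output.
            if ($2 == "Infiniband") ib++
            else if ($2 == "Ethernet") eth++
        }
        END { printf "IB VFs: %d, ETH VFs: %d\n", ib, eth }
    '
}
```

A zero ETH count with a non-zero IB count would correspond to the broken state described here.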

dmesg shows the following:

dmesg |grep mlx

[ 1.364318] mlx5_core 0000:0e:00.0: firmware version: 12.28.2006
[ 1.364344] mlx5_core 0000:0e:00.0: 63.008 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x8 link)
[ 4.090382] mlx5_core 0000:0e:00.0: Port module event: module 0, Cable plugged
[ 4.284359] mlx5_core 0000:0e:00.1: firmware version: 12.28.2006
[ 4.284398] mlx5_core 0000:0e:00.1: 63.008 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x8 link)
[ 4.704293] mlx5_core 0000:0e:00.1: E-Switch: Total vports 17, per vport: max uc(1024) max mc(16384)
[ 4.708315] mlx5_core 0000:0e:00.1: Port module event: module 1, Cable plugged
[ 4.958517] mlx5_core 0000:0e:00.1: Supported tc offload range - chains: 4294967294, prios: 4294967295
[ 4.967739] mlx5_core 0000:0e:00.1: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0 basic)
[ 4.969721] mlx5_core 0000:0e:00.1 enp14s0f1np1: renamed from eth0
[ 5.677001] mlx5_core 0000:0e:00.1: Successfully registered panic handler for port 1
[ 5.825806] mlx5_core 0000:0e:00.1: mlx5_cmd_out_err:803:(pid 805): QUERY_HCA_CAP(0x100) op_mod(0x40) failed, status bad parameter(0x3), syndrome (0x5add95), err(-22)
[ 5.844531] mlx5_core 0000:0e:00.1: mlx5_device_enable_sriov:82:(pid 805): failed to enable eswitch SRIOV (-22)
[ 5.844535] mlx5_core 0000:0e:00.1: mlx5_sriov_enable:168:(pid 805): mlx5_device_enable_sriov failed : -22
[ 5.844723] mlx5_core 0000:0e:00.1 mlxen0: renamed from enp14s0f1np1
[ 6.185318] mlx5_core 0000:0e:00.0 mlxib0: renamed from ib0
[ 7.042738] mlx5_core 0000:0e:00.1 mlxen0: Link up
[ 7.045067] 8021q: adding VLAN 0 to HW filter on device mlxen0
[ 7.076354] IPv6: ADDRCONF(NETDEV_CHANGE): mlxen0: link becomes ready
[ 8.282229] IPv6: ADDRCONF(NETDEV_CHANGE): mlxib0: link becomes ready
[ 8.567892] br1: port 1(mlxen0) entered blocking state
[ 8.567894] br1: port 1(mlxen0) entered disabled state
[ 8.567923] device mlxen0 entered promiscuous mode
[ 8.568824] br1: port 1(mlxen0) entered blocking state
[ 8.568826] br1: port 1(mlxen0) entered listening state
[ 8.584119] mlx5_core 0000:0e:00.1: mlx5e_fs_set_rx_mode_work:843:(pid 137): S-tagged traffic will be dropped while C-tag vlan stripping is enabled
[ 10.572198] br1: port 1(mlxen0) entered learning state
[ 25.931080] br1: port 1(mlxen0) entered forwarding state

sdjerdj avatar Jan 14 '24 16:01 sdjerdj

This issue appears to be caused by the combination of older firmware (which doesn't support querying the hca_cap_2 bit) and a newer upstream mlx5 driver. That is why the firmware command QUERY_HCA_CAP(0x100) fails in the log.

If you can update the firmware, that would be best. Alternatively, wait until we have a kernel with 6496357aa5f7 ("net/mlx5: Query hca_cap_2 only when supported").
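For anyone checking whether another machine hits the same firmware/driver mismatch, the failure signature from the logs in this thread can be searched for directly. The following is a rough sketch only; the function name `find_hca_cap_failures` is hypothetical, and the pattern is copied from the dmesg output quoted earlier rather than from any driver documentation.

```shell
# Sketch: list PCI addresses whose mlx5_core log shows the failing
# QUERY_HCA_CAP command seen in this issue.
# Reads kernel log text on stdin, e.g.:  dmesg | find_hca_cap_failures
find_hca_cap_failures() {
    # Parentheses are literal in basic regular expressions, so the
    # pattern matches the log text as printed by the driver.
    grep 'QUERY_HCA_CAP(0x100) op_mod(0x40) failed' |
        sed -n 's/.*mlx5_core \([0-9a-f:.]*\): .*/\1/p' |
        sort -u
}
```

Empty output means the signature was not found in the supplied log. It does not by itself prove that SR-IOV is working.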

Thx!

konradwilk avatar Jan 18 '24 20:01 konradwilk

Thank you for the update! No worries: until the UEK kernel gets the patch, I can use the RHCK kernel, which works as expected. Regarding the firmware, the card already has the latest version available for it.

sdjerdj avatar Jan 20 '24 02:01 sdjerdj

Here is a bit of good news: I managed to apply the above patch on top of the 5.15.0-203.146.3 UEK kernel, and I'm happy to report that the patch indeed resolves the issue:

[root@ol9 ~]# uname -r
5.15.0-203.146.888.el9uek.x86_64 <<== Patched test kernel
[root@ol9 ~]#

[root@ol9 ~]# dmesg |grep -e mlx -e eswitch
[ 1.352332] mlx5_core 0000:0e:00.0: firmware version: 12.28.2006
[ 1.352359] mlx5_core 0000:0e:00.0: 63.008 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x8 link)
[ 4.075247] mlx5_core 0000:0e:00.0: Port module event: module 0, Cable plugged
[ 4.268187] mlx5_core 0000:0e:00.1: firmware version: 12.28.2006
[ 4.268227] mlx5_core 0000:0e:00.1: 63.008 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x8 link)
[ 4.689720] mlx5_core 0000:0e:00.1: E-Switch: Total vports 17, per vport: max uc(1024) max mc(16384)
[ 4.693997] mlx5_core 0000:0e:00.1: Port module event: module 1, Cable plugged
[ 4.934593] mlx5_core 0000:0e:00.1: Supported tc offload range - chains: 4294967294, prios: 4294967295
[ 4.943461] mlx5_core 0000:0e:00.1: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0 basic)
[ 4.945410] mlx5_core 0000:0e:00.1 enp14s0f1np1: renamed from eth0
[ 5.647576] mlx5_core 0000:0e:00.1: Successfully registered panic handler for port 1
[ 6.032963] mlx5_core 0000:0e:00.1: E-Switch: Enable: mode(LEGACY), nvfs(14), active vports(15)
[ 6.091724] mlx5_core 0000:0e:00.0 mlxib0: renamed from ib0
[ 6.172684] mlx5_core 0000:0e:00.1 mlxen0: renamed from enp14s0f1np1
[ 7.012632] mlx5_core 0000:0e:00.1 mlxen0: Link up
[ 7.014728] 8021q: adding VLAN 0 to HW filter on device mlxen0
[ 7.048631] IPv6: ADDRCONF(NETDEV_CHANGE): mlxen0: link becomes ready
[ 8.234249] IPv6: ADDRCONF(NETDEV_CHANGE): mlxib0: link becomes ready
[ 8.496572] IPv6: ADDRCONF(NETDEV_CHANGE): mlxib0: link becomes ready
[ 8.517091] br1: port 1(mlxen0) entered blocking state
[ 8.517094] br1: port 1(mlxen0) entered disabled state
[ 8.517130] device mlxen0 entered promiscuous mode
[ 8.518280] br1: port 1(mlxen0) entered blocking state
[ 8.518281] br1: port 1(mlxen0) entered listening state
[ 8.535401] mlx5_core 0000:0e:00.1: mlx5e_fs_set_rx_mode_work:843:(pid 146): S-tagged traffic will be dropped while C-tag vlan stripping is enabled
[ 10.572228] br1: port 1(mlxen0) entered learning state
[ 25.932102] br1: port 1(mlxen0) entered forwarding state
[root@ol9 ~]#

[root@ol9 ~]# lspci |grep Mellanox
0e:00.0 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4]
0e:00.1 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4]
0e:00.2 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:00.3 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:00.4 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:00.5 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:00.6 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:00.7 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:01.0 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:01.1 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:01.2 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:01.3 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:01.4 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:01.5 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:01.6 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:01.7 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:02.1 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:02.2 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:02.3 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:02.4 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:02.5 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:02.6 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:02.7 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:03.0 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:03.1 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:03.2 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:03.3 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:03.4 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:03.5 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:03.6 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
[root@ol9 ~]#

Hopefully this info will make it somewhat easier to include the patch above in upcoming versions of the UEK kernel.

sdjerdj avatar Jan 26 '24 03:01 sdjerdj

Perfect. Will close this ticket once a new UEK kernel comes out with the backport.

konradwilk avatar Jan 26 '24 18:01 konradwilk

Just curious whether there is any progress on this issue? The upstream kernel has had this patch since July 2023.

sdjerdj avatar Apr 01 '24 00:04 sdjerdj

We'll get this into the next possible UEK7 errata release. Sorry for the delay.

aron-silverton avatar Apr 01 '24 18:04 aron-silverton

This is currently scheduled for the May monthly errata release.

aron-silverton avatar Apr 17 '24 18:04 aron-silverton

Hello,

Just a quick update: It looks like the 5.15.0-206.153.7.el9uek.x86_64 kernel has resolved the issue:

$ lspci |grep Mellanox
0e:00.0 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4]
0e:00.1 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4]
0e:00.2 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:00.3 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:00.4 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:00.5 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:00.6 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:00.7 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:01.0 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:01.1 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:01.2 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:01.3 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:01.4 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:01.5 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:01.6 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:01.7 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:02.1 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:02.2 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:02.3 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:02.4 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:02.5 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:02.6 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:02.7 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:03.0 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:03.1 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:03.2 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:03.3 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:03.4 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:03.5 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0e:03.6 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
$

$ dmesg |grep mlx5_core
[ 1.568472] mlx5_core 0000:0e:00.0: firmware version: 12.28.2006
[ 1.568500] mlx5_core 0000:0e:00.0: 63.008 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x8 link)
[ 4.297432] mlx5_core 0000:0e:00.0: Port module event: module 0, Cable plugged
[ 4.491212] mlx5_core 0000:0e:00.1: firmware version: 12.28.2006
[ 4.491257] mlx5_core 0000:0e:00.1: 63.008 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x8 link)
[ 4.912725] mlx5_core 0000:0e:00.1: E-Switch: Total vports 17, per vport: max uc(1024) max mc(16384)
[ 4.916861] mlx5_core 0000:0e:00.1: Port module event: module 1, Cable plugged
[ 5.160445] mlx5_core 0000:0e:00.1: Supported tc offload range - chains: 4294967294, prios: 4294967295
[ 5.169426] mlx5_core 0000:0e:00.1: MLX5E: StrdRq(0) RqSz(1024) StrdSz(256) RxCqeCmprss(0 basic)
[ 5.171692] mlx5_core 0000:0e:00.1 enp14s0f1np1: renamed from eth0
[ 6.024422] mlx5_core 0000:0e:00.1: Successfully registered panic handler for port 1
[ 6.406512] mlx5_core 0000:0e:00.1: E-Switch: Enable: mode(LEGACY), nvfs(14), active vports(15)
$

sdjerdj avatar May 15 '24 01:05 sdjerdj

Hi, it was actually fixed back in https://github.com/oracle/linux-uek/commits/v5.15.0-206.149.3, but I had been waiting to update this issue until we published the RPMs. They also appear to be available now. Thank you for your patience.
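For scripted checks, a running UEK7 kernel can be compared against that tag with a version sort. This is just a sketch under a few assumptions: `kernel_has_fix` and `FIXED_IN` are hypothetical names, the comparison relies on `sort -V` from GNU coreutils, and it is only meaningful for stock UEK7 5.15 releases. A locally patched build such as the 5.15.0-203.146.888 test kernel above would be reported as lacking the fix, and RHCK kernels are out of scope.

```shell
# Sketch: succeed if a UEK kernel release is at or above the tagged fix.
# FIXED_IN is taken from the tag mentioned above.
FIXED_IN="5.15.0-206.149.3"

kernel_has_fix() {
    # Default to the running kernel when no release string is given.
    local release="${1:-$(uname -r)}"
    # Drop the distro/arch suffix, e.g. ".el9uek.x86_64".
    local version="${release%%.el*}"
    local lowest
    # The release has the fix if FIXED_IN sorts first (or is equal).
    lowest=$(printf '%s\n%s\n' "$FIXED_IN" "$version" | sort -V | head -n 1)
    [ "$lowest" = "$FIXED_IN" ]
}
```

For example, `kernel_has_fix && echo "fix present"` checks the running kernel, while `kernel_has_fix 5.15.0-202.135.2.el9uek.x86_64` checks a specific release string.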

aron-silverton avatar May 15 '24 21:05 aron-silverton