Mellanox 100G interface segmentation fault at the receiver side

Open edgar-costa opened this issue 4 years ago • 0 comments

Hi all,

By following the existing issues and docs, I managed to get some 100G Mellanox NICs (MT28800) apparently working. I installed the OFED LINUX 4.9.2.2.4 drivers and built MoonGen with the --mlx5 flag; as far as I can tell, the build produced no errors.

I tested the sending side with the libmoon/examples/pktgen.lua script and managed to send at 100 Gbps with 2 ports at the same time and fairly small packet sizes.

However, the problem appears when I add a receiver. It always segfaults, no matter what I put in the receiving code. I noticed that it does receive some packets, so I wrote a program to count them and to check whether something was wrong with the packet contents:

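local mg     = require "moongen"
local memory = require "memory"

-- Receive task: counts packets and prints the Ethernet header of each one.
function recv(queue, id)
    print("Listening....")
    local bufs = memory.bufArray()
    local count = 0

    while mg.running() do
        -- rx is the number of packets received in this batch
        local rx = queue:recv(bufs)
        for i = 1, rx do
            count = count + 1
            local pkt = bufs[i]:getUdp4Packet()
            print(pkt.eth:getString() .. " " .. count)
        end
        -- return the buffers of this batch to the mempool
        bufs:freeAll()
    end
    print(id .. " Total Packets Received: " .. count)
end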

I observed that it crashes with a segmentation fault after receiving roughly rxDescs packets. For example, with rxDescs set to 512, the program above prints:

...
ETH 98:03:9b:4d:d7:9c > 98:03:9b:4d:d7:9d type 0x0800 (IP4) 503
ETH 98:03:9b:4d:d7:9c > 98:03:9b:4d:d7:9d type 0x0800 (IP4) 504
ETH 98:03:9b:4d:d7:9c > 98:03:9b:4d:d7:9d type 0x0800 (IP4) 505
ETH 98:03:9b:4d:d7:9c > 98:03:9b:4d:d7:9d type 0x0800 (IP4) 506
ETH 98:03:9b:4d:d7:9c > 98:03:9b:4d:d7:9d type 0x0800 (IP4) 507
ETH 98:03:9b:4d:d7:9c > 98:03:9b:4d:d7:9d type 0x0800 (IP4) 508
ETH 98:03:9b:4d:d7:9c > 98:03:9b:4d:d7:9d type 0x0800 (IP4) 509
ETH 98:03:9b:4d:d7:9c > 98:03:9b:4d:d7:9d type 0x0800 (IP4) 510
ETH 98:03:9b:4d:d7:9c > 98:03:9b:4d:d7:9d type 0x0800 (IP4) 511
ETH 98:03:9b:4d:d7:9c > 98:03:9b:4d:d7:9d type 0x0800 (IP4) 512
Segmentation fault

Or, if I change it to, say, 4096:

...
ETH 98:03:9b:4d:d7:9c > 98:03:9b:4d:d7:9d type 0x0800 (IP4) 4088
ETH 98:03:9b:4d:d7:9c > 98:03:9b:4d:d7:9d type 0x0800 (IP4) 4089
ETH 98:03:9b:4d:d7:9c > 98:03:9b:4d:d7:9d type 0x0800 (IP4) 4090
ETH 98:03:9b:4d:d7:9c > 98:03:9b:4d:d7:9d type 0x0800 (IP4) 4091
ETH 98:03:9b:4d:d7:9c > 98:03:9b:4d:d7:9d type 0x0800 (IP4) 4092
ETH 98:03:9b:4d:d7:9c > 98:03:9b:4d:d7:9d type 0x0800 (IP4) 4093
ETH 98:03:9b:4d:d7:9c > 98:03:9b:4d:d7:9d type 0x0800 (IP4) 4094
ETH 98:03:9b:4d:d7:9c > 98:03:9b:4d:d7:9d type 0x0800 (IP4) 4095
Segmentation fault
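
For reference, rxDescs is the size of the receive descriptor ring passed to device.config in libmoon. Below is a minimal sketch of how a receiver like this is typically set up. The port number, queue counts, and task arguments are placeholders chosen for illustration, not the exact values from the setup described above.

```lua
local mg     = require "moongen"
local device = require "device"
local memory = require "memory" -- used by the recv task (memory.bufArray)

-- Placeholder values (assumptions, not taken from the original script)
local RX_PORT  = 0
local RX_DESCS = 512 -- RX descriptor ring size; the crash point tracks this value

function master()
    local dev = device.config{
        port     = RX_PORT,
        rxQueues = 1,
        txQueues = 1,
        rxDescs  = RX_DESCS,
    }
    device.waitForLinks()
    -- "recv" is the receive task shown earlier in this post
    mg.startTask("recv", dev:getRxQueue(0), 0)
    mg.waitForTasks()
end
```

Changing RX_DESCS between 512 and 4096 in a setup like this is what moves the point at which the segmentation fault occurs.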

Any idea what the problem could be?

edgar-costa, Jun 14 '21 17:06