
Segmentation fault in OpenSM during ibping when using ibsim

Open ax75 opened this issue 1 year ago • 0 comments

I set up ibsim using the MOFED drivers and started ibsim with the net-example sample topology. When I run ibping against any LID, OpenSM crashes with a segmentation fault.

OpenSM and ibsim versions:

OpenSM 5.17.2.MLNX20240610.dc7c2998

ibsim 0.12

[CRASHD]
[CRASHD] =================================
[CRASHD] = OpenSM crash daemon =
[CRASHD] =================================
[CRASHD]
[CRASHD] = OpenSM binary: /usr/sbin/opensm
[CRASHD] = OpenSM process ID: 5219
[CRASHD] = OpenSM thread ID: 5234
[CRASHD] = Exception: Segmentation fault
[CRASHD] = Reason: address not mapped to object
[CRASHD] = Fault address: 0x40
[CRASHD] = Signal error: Success
[CRASHD] = Last error: Success
[CRASHD] = Stack trace addresses:
[CRASHD] = [#00] 0x555731e2b7a6
[CRASHD] = ??:0 - ??()
[CRASHD] = [#01] 0x7f4420c9f520
[CRASHD] = ??:0 - ??()
[CRASHD] = [#02] 0x7f4420cf4ef4
[CRASHD] = ??:0 - ??()
[CRASHD] = [#03] 0x7f4421217fc3
[CRASHD] = ??:0 - ??()
[CRASHD] = [#04] 0x7f4420cf1ac3
[CRASHD] = ??:0 - ??()
[CRASHD] = [#05] 0x7f4420d83850
[CRASHD] = ??:0 - ??()
[CRASHD] = Backtrace symbols:
==== [CRASHD] ====
opensm(fault_handler+0x9c)[0x555731e2b7a6]
/lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7f4420c9f520]
/lib/x86_64-linux-gnu/libc.so.6(pthread_mutex_lock+0x4)[0x7f4420cf4ef4]
/usr/lib/umad2sim/libumad2sim.so(+0x2fc3)[0x7f4421217fc3]
/lib/x86_64-linux-gnu/libc.so.6(+0x94ac3)[0x7f4420cf1ac3]
/lib/x86_64-linux-gnu/libc.so.6(+0x126850)[0x7f4420d83850]
==== [CRASHD] ====

Aborted
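
In case it helps with triage: in the backtrace symbols, the frame that calls pthread_mutex_lock is in libumad2sim.so at offset +0x2fc3, and it has no symbol name. Below is a minimal Python sketch that pulls the module and offset out of each backtrace line and prints an addr2line command for every unresolved frame. The helper names (`parse_frames`, `addr2line_commands`) are made up for illustration, and the addr2line output will only be useful if debug symbols for the library are installed, which I have not verified.

```python
import re

# Frames copied from the "Backtrace symbols" part of the crash log.
BACKTRACE = """\
opensm(fault_handler+0x9c)[0x555731e2b7a6]
/lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7f4420c9f520]
/lib/x86_64-linux-gnu/libc.so.6(pthread_mutex_lock+0x4)[0x7f4420cf4ef4]
/usr/lib/umad2sim/libumad2sim.so(+0x2fc3)[0x7f4421217fc3]
/lib/x86_64-linux-gnu/libc.so.6(+0x94ac3)[0x7f4420cf1ac3]
/lib/x86_64-linux-gnu/libc.so.6(+0x126850)[0x7f4420d83850]
"""

# Line shape: module(symbol+offset)[address]; symbol is empty when unresolved.
FRAME_RE = re.compile(
    r"^(?P<module>[^(]+)\((?P<symbol>[^+)]*)\+(?P<offset>0x[0-9a-f]+)\)"
    r"\[(?P<addr>0x[0-9a-f]+)\]$"
)


def parse_frames(text):
    """Return one dict (module, symbol, offset, addr) per recognised line."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append(match.groupdict())
    return frames


def addr2line_commands(frames):
    """Build an addr2line command for each frame that has no symbol name."""
    return [
        f"addr2line -f -e {frame['module']} {frame['offset']}"
        for frame in frames
        if not frame["symbol"]
    ]


frames = parse_frames(BACKTRACE)
for command in addr2line_commands(frames):
    print(command)
```

The commands are only printed, not executed, so the script can be run on any machine and the relevant line pasted onto the host that has the matching libumad2sim.so.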


ibnetdiscover, on the other hand, does return a response:

ax75@infiniband-01:~$ sudo ibsim-run ibnetdiscover
ibwarn: [5359] sim_connect: attached as client 5 at node "Switch1"

Topology file: generated on Wed Aug 28 22:53:58 2024

Initiated from node 0000000000200000 port 0000000000200000

vendid=0x0
devid=0x0
sysimgguid=0x200001
switchguid=0x200001(200001)
Switch 8 "S-0000000000200001" # "Switch2" base port 0 lid 163 lmc 0
[3] "S-0000000000200000"[3] # "Switch1" lid 160 4xSDR
[4] "S-0000000000200000"[4] # "Switch1" lid 160 4xSDR

vendid=0x0
devid=0x0
sysimgguid=0x200000
switchguid=0x200000(200000)
Switch 8 "S-0000000000200000" # "Switch1" base port 0 lid 160 lmc 0
[1] "H-0000000000100000"1 # "Hca1" lid 161 4xSDR
[2] "H-0000000000100003"2 # "Hca2" lid 164 4xSDR
[3] "S-0000000000200001"[3] # "Switch2" lid 163 4xSDR
[4] "S-0000000000200001"[4] # "Switch2" lid 163 4xSDR

vendid=0x0
devid=0x0
sysimgguid=0x100003
caguid=0x100003
Ca 2 "H-0000000000100003" # "Hca2"
2 "S-0000000000200000"[2] # lid 164 lmc 0 "Switch1" lid 160 4xSDR

vendid=0x0
devid=0x0
sysimgguid=0x100000
caguid=0x100000
Ca 2 "H-0000000000100000" # "Hca1"
1 "S-0000000000200000"[1] # lid 161 lmc 0 "Switch1" lid 160 4xSDR
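
For reference, these are the LIDs I mean by "any LID". The sketch below is only an illustration: it pulls the node name and LID pairs out of the port lines of the ibnetdiscover output with a regular expression. The `extract_lids` helper is hypothetical, and the sample text is an excerpt of the output, not the full listing.

```python
import re

# Port lines excerpted from the ibnetdiscover output in this report.
SAMPLE = '''
[3] "S-0000000000200000"[3] # "Switch1" lid 160 4xSDR
[1] "H-0000000000100000"1 # "Hca1" lid 161 4xSDR
[2] "H-0000000000100003"2 # "Hca2" lid 164 4xSDR
[3] "S-0000000000200001"[3] # "Switch2" lid 163 4xSDR
'''

# Matches a quoted node description immediately followed by its LID.
NAME_LID_RE = re.compile(r'"(\w+)" lid (\d+)')


def extract_lids(text):
    """Map each node description to the LID reported next to it."""
    return {name: int(lid) for name, lid in NAME_LID_RE.findall(text)}


lids = extract_lids(SAMPLE)
for name, lid in sorted(lids.items(), key=lambda item: item[1]):
    print(f"{name}: lid {lid}")
```

The crash happens for every one of these LIDs, so it does not look specific to one destination.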

ax75, Aug 28 '24 22:08