infinite loop inside `vmp_walk_and_record_stack`
I have a PyPy process which is being profiled with vmprof.enable(native=False).
I noticed it was consuming 100% of CPU, so I gdbed into it. This is what I found:
(gdb) where
#0 0x00007f5d7164dc10 in vmp_walk_and_record_stack () from /opt/pypy-c-jit-92954-e4158aeecc04-linux64/bin/libpypy-c.so
#1 0x00007f5d7164fd05 in sigprof_handler () from /opt/pypy-c-jit-92954-e4158aeecc04-linux64/bin/libpypy-c.so
#2 <signal handler called>
#3 0x00007f5d6e8932a3 in _int_free (av=0x7f5d6ebd7b20 <main_arena>, p=0x3596ecc0, have_lock=0) at malloc.c:3962
#4 0x00007f5d6e89753c in __GI___libc_free (mem=<optimized out>) at malloc.c:2968
#5 0x00007f5d71644fea in g_restore_state () from /opt/pypy-c-jit-92954-e4158aeecc04-linux64/bin/libpypy-c.so
#6 0x00007f5d71645048 in _stacklet_switchstack.constprop.3 () from /opt/pypy-c-jit-92954-e4158aeecc04-linux64/bin/libpypy-c.so
#7 0x00007fffcef58270 in ?? ()
#8 0x00007f5d6b9a7a60 in ?? ()
#9 0x0000000035a1a320 in ?? ()
#10 0x000000003583c7a8 in ?? ()
#11 0x00007f5d73375010 in ?? ()
#12 0x00000000026aa640 in ?? ()
#13 0x0000000000000260 in ?? ()
#14 0x00007f5d716453c7 in stacklet_switch () from /opt/pypy-c-jit-92954-e4158aeecc04-linux64/bin/libpypy-c.so
#15 0x00007f5d7144b388 in pypy_g.switch () from /opt/pypy-c-jit-92954-e4158aeecc04-linux64/bin/libpypy-c.so
#16 0x00007f5d714276bd in pypy_g_StackletThread_switch () from /opt/pypy-c-jit-92954-e4158aeecc04-linux64/bin/libpypy-c.so
#17 0x00007f5d5e0cbfe7 in ?? ()
#18 0x00007f5d6b2db740 in ?? ()
#19 0x00007f5d70f99783 in pypy_g_crash_in_jit_4 () from /opt/pypy-c-jit-92954-e4158aeecc04-linux64/bin/libpypy-c.so
so, from this it seems that the vmprof signal triggered in the middle of a greenlet switch. Then, I looked at the disassembly:
(gdb)
Dump of assembler code for function vmp_walk_and_record_stack:
0x00007f5d7164dbc0 <+0>: push %r15
0x00007f5d7164dbc2 <+2>: push %r14
0x00007f5d7164dbc4 <+4>: mov %rdi,%r14
0x00007f5d7164dbc7 <+7>: push %r13
0x00007f5d7164dbc9 <+9>: push %r12
0x00007f5d7164dbcb <+11>: push %rbp
0x00007f5d7164dbcc <+12>: push %rbx
0x00007f5d7164dbcd <+13>: mov %rsi,%rbp
0x00007f5d7164dbd0 <+16>: mov %edx,%ebx
0x00007f5d7164dbd2 <+18>: sub $0x588,%rsp
0x00007f5d7164dbd9 <+25>: mov 0x1c481fc(%rip),%r15d # 0x7f5d73295ddc <vmp_native_traces_enabled>
0x00007f5d7164dbe0 <+32>: mov %r8,0x28(%rsp)
0x00007f5d7164dbe5 <+37>: test %r15d,%r15d
0x00007f5d7164dbe8 <+40>: jne 0x7f5d7164dc78 <vmp_walk_and_record_stack+184>
0x00007f5d7164dbee <+46>: test %rdi,%rdi
0x00007f5d7164dbf1 <+49>: je 0x7f5d7164dddf <vmp_walk_and_record_stack+543>
0x00007f5d7164dbf7 <+55>: cmp $0x1,%ebx
0x00007f5d7164dbfa <+58>: mov $0x2,%edx
0x00007f5d7164dbff <+63>: jg 0x7f5d7164dc30 <vmp_walk_and_record_stack+112>
0x00007f5d7164dc01 <+65>: jmpq 0x7f5d7164dddf <vmp_walk_and_record_stack+543>
0x00007f5d7164dc06 <+70>: nopw %cs:0x0(%rax,%rax,1)
0x00007f5d7164dc10 <+80>: cmp $0x3,%rax
0x00007f5d7164dc14 <+84>: je 0x7f5d7164dc58 <vmp_walk_and_record_stack+152>
0x00007f5d7164dc16 <+86>: mov (%r14),%r14
0x00007f5d7164dc19 <+89>: lea 0x2(%r15),%edx
0x00007f5d7164dc1d <+93>: test %r14,%r14
0x00007f5d7164dc20 <+96>: setne %cl
0x00007f5d7164dc23 <+99>: cmp %edx,%ebx
0x00007f5d7164dc25 <+101>: setge %al
0x00007f5d7164dc28 <+104>: test %al,%cl
0x00007f5d7164dc2a <+106>: je 0x7f5d7164dde2 <vmp_walk_and_record_stack+546>
0x00007f5d7164dc30 <+112>: mov 0x10(%r14),%rax
0x00007f5d7164dc34 <+116>: cmp $0x1,%rax
=> 0x00007f5d7164dc38 <+120>: jne 0x7f5d7164dc10 <vmp_walk_and_record_stack+80>
0x00007f5d7164dc3a <+122>: mov 0x8(%r14),%rcx
0x00007f5d7164dc3e <+126>: movslq %r15d,%rax
0x00007f5d7164dc41 <+129>: mov %edx,%r15d
0x00007f5d7164dc44 <+132>: movq $0x1,0x0(%rbp,%rax,8)
0x00007f5d7164dc4d <+141>: mov %rcx,0x8(%rbp,%rax,8)
0x00007f5d7164dc52 <+146>: jmp 0x7f5d7164dc16 <vmp_walk_and_record_stack+86>
0x00007f5d7164dc54 <+148>: nopl 0x0(%rax)
0x00007f5d7164dc58 <+152>: mov 0x8(%r14),%rax
0x00007f5d7164dc5c <+156>: movslq %r15d,%rsi
0x00007f5d7164dc5f <+159>: mov %ebx,%ecx
0x00007f5d7164dc61 <+161>: mov %rbp,%rdi
0x00007f5d7164dc64 <+164>: mov -0x8(%rax),%rdx
0x00007f5d7164dc68 <+168>: callq 0x7f5d7164d550 <vmprof_write_header_for_jit_addr>
0x00007f5d7164dc6d <+173>: mov %eax,%r15d
0x00007f5d7164dc70 <+176>: jmp 0x7f5d7164dc16 <vmp_walk_and_record_stack+86>
0x00007f5d7164dc72 <+178>: nopw 0x0(%rax,%rax,1)
0x00007f5d7164dc78 <+184>: lea 0x80(%rsp),%r15
The infinite loop is between instructions +80 and +120. I tried to understand which part of the C code it corresponds to, but failed. However, since I am using native=False, I suspect it is inside an inlined call to vmp_walk_and_record_python_stack_only.
I didn't prove it yet, but I suspect it might be related to this pypy issue: https://bitbucket.org/pypy/pypy/issues/2683/infinite-traceback-with-greenlets
continulet seems to produce an infinite python stack when you switch. So, maybe vmprof triggered at the wrong time and started to sample the infinite stack. I'll try to find a reproducible example.
Update; I tried to write a smaller reproducer; if you run this program under vmprof, it reliably segfaults inside vmp_walk_and_record_stack: not sure if it's another manifestation of the same problem or if it's unrelated though.
import itertools
import eventlet
def run(name):
for i in itertools.count():
if i % 10000 == 0:
print name, i
eventlet.sleep()
def main():
a = eventlet.spawn(run, 'a')
b = eventlet.spawn(run, 'b')
c = eventlet.spawn(run, 'c')
a.wait()
b.wait()
c.wait()
main()
Ok, found it. The problem is that if vmprof samples the stack in the middle of a switch, it simply reads nonsense: most of the time it results in a segfault, but sometimes it might end up in a infinite cycle.
If my theory is correct, it is unrelated to PyPy issue 2683.
Fixed by this PyPy commit
I think that in theory this could happen also on CPython, if vmprof triggers in the middle of a greenlet switch. However, I never managed to reproduce the crash, so for now I'm just closing the issue.
Reopening this because my fix is not complete: now it no longer segfaults but it stop taking samples after a switch(). I'll start a PyPy branch to fix it