vmprof-python icon indicating copy to clipboard operation
vmprof-python copied to clipboard

vmprof seems to be broken on the latest PyPy

Open antocuni opened this issue 8 years ago • 6 comments

Using the latest pypy nightly (29f3c769e610) and latest vmprof (3ffe00d), I get two different errors, randomly: About ~50% of the time, I get this:

$ pypy -m vmprof ~/pypy/default/lib-python/2.7/test/pystone.py 
Pystone(1.1) time for 50000 passes = 0.0506447
This machine benchmarks at 987271 pystones/second
Traceback (most recent call last):
  File "/home/antocuni/pypy/default/lib-python/2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/home/antocuni/pypy/default/lib-python/2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/home/antocuni/pypy/vmprof-python/vmprof/__main__.py", line 75, in <module>
    main()
  File "/home/antocuni/pypy/vmprof-python/vmprof/__main__.py", line 66, in main
    vmprof.disable()
  File "vmprof/__init__.py", line 38, in disable
    l.read_all()
  File "vmprof/reader.py", line 290, in read_all
    assert not marker, (fileobj.tell(), repr(marker))
AttributeError: 'FdWrapper' object has no attribute 'tell'

The remaining ~50%, I see this:

$ pypy -m vmprof ~/pypy/default/lib-python/2.7/test/pystone.py 
Pystone(1.1) time for 50000 passes = 0.051092
This machine benchmarks at 978626 pystones/second
Traceback (most recent call last):
  File "/home/antocuni/pypy/default/lib-python/2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/home/antocuni/pypy/default/lib-python/2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/home/antocuni/pypy/vmprof-python/vmprof/__main__.py", line 75, in <module>
    main()
  File "/home/antocuni/pypy/vmprof-python/vmprof/__main__.py", line 66, in main
    vmprof.disable()
  File "vmprof/__init__.py", line 38, in disable
    l.read_all()
  File "vmprof/reader.py", line 281, in read_all
    name = self.read_string()
  File "vmprof/reader.py", line 195, in read_string
    bytes = self.read(cnt)
  File "vmprof/reader.py", line 191, in read
    return self.fileobj.read(count)
  File "vmprof/reader.py", line 390, in read
    return os.read(self.fd, n)
MemoryError

[8] > /home/antocuni/pypy/vmprof-python/vmprof/reader.py(390)read()
-> return os.read(self.fd, n)
(Pdb++) n
8313427730393726976

As you see, the value of n is very high, and it's always different from run to run, which makes me think probably the log file is corrupt and/or in the wrong format.

antocuni avatar Jun 28 '17 15:06 antocuni

I have rewritten the part that parses the log file for native symbols in Python. Now just before disable() completes the profiling is stopped (vmprof_ignore_signals(1)) is called and the log file is read.

Looks like the file descriptor passed to PyPy still receives samples and happily writes them to the logfile (even though it should NOT). I have observed this in PDB. I seeked to pos 0 and called fd.tell(). Each time called it returns different positions.

Dont have fix yet, but that could explain the issue!

planrich avatar Jul 03 '17 00:07 planrich

uh, this sounds bad. It could also be a PyPy bug for all I know, maybe somehow the signal handler is not disabled?

antocuni avatar Jul 03 '17 09:07 antocuni

Yes it is. vmprof_ignore_signals(1) should stops to log any signal to the logfile.

planrich avatar Jul 03 '17 21:07 planrich

Ok, I got a lead. vmprof_ignore_signals' volatile variable to save its state is defined static. to expose 'stop_sampling' to pypy the whole function is copied (using #include).

The primary reason for this issue is that everything was implemented in header files...

Indeed this means that a fresh copy is compiled by gcc and calling vmprof_ignore_signal(1) has NO effect (there exist two vmprof_ignore_signals in two separate compilation uints).

Let's see if this fixes the issue.

planrich avatar Jul 04 '17 18:07 planrich

Ok all wrong. I found the issue after some digging. In PyPy calling vmprof_ignore_signals(1) has no effect. The JIT might modify codemaps (e.g. pypy_codemap_invalidate_set). This means that as soon as pypy_codemap_invlidate_set finishes all signals will occur and write to the log (even though someone else forbid it).

Changes are now in PyPy

planrich avatar Jul 25 '17 14:07 planrich

Adding a note here: I was experimenting using vmprof with non-pypy python3.5 and ran into this same symptom. In my case I tried to do a vmprof.enable() and vmprof.disable() in two parts of the code and replicated this issue in both places. However, in one case I found that the problem seemed to be alleviated by dialing back the time period in the enable() call (I had it set to 0.001 and resetting it to 0.01 fixed the problem). This was not the case in both cases, and I suspect the second case is just finishing too fast. I suppose this is a general error that happens as a response to a race condition somewhere. Unfortunately I am not knowledgeable enough to dig deeper at this time and do not feel I have enough information to add a new issue given that I can't prove it's a different cause.

matthewnorman avatar Aug 22 '17 20:08 matthewnorman