gh-143123: Protect against recursive tracer calls/finalization
- Issue: gh-143123
Can we have a test for this?
Can we have a test for this?
The reproducer provided by the OP requires installation of a package to crash. I don't think we can unit test this if that's the case.
Is the problem that we are unable to distinguish between tracing happening in an invocation of _PyEval_EvalFrameDefault higher up the call stack and tracing happening in this invocation of _PyEval_EvalFrameDefault?
Yes I suspect that's the case.
@Fidget-Spinner We could add a "jit_depth" field to the thread state, initialized to 0.
- On entering
_PyEval_EvalFrameDefault: if jit_depth is non-zero, increase it by one. - On leaving: if jit_depth is non-zero, decrease it by one
- Don't start tracing if jit_depth is non-zero.
- When starting tracing, set it to 1.
- Tracing is active if
jit _depth == 1 - Assert that it is 1 anywhere we call the optimizer, start or stop tracing, etc.
I realised we can just do the check in _PyJit_InitializeTracing, and leave the invariants to the rest of the optimizer.
So I spent 30 minutes trying to create a simple reproducer to no avail. The following needs to happen:
- Outer Python -> C -> Inner Python calls.
- That just so happen to all be tracing at the same time.
- Outer traces, inner traces. Inner trace stomp on outer.
This is really hard to create a reproducer for as I'm not sure how to finangle the counters to be just right.
Let's not worry about a reproducer, but ensure that tracing is active only once in any thread. Then we don't need to worry about "inner" or "outer" traces, as there will only be (at most) one trace.
Let's not worry about a reproducer, but ensure that tracing is active only once in any thread. Then we don't need to worry about "inner" or "outer" traces, as there will only be (at most) one trace.
This has nothing to do with threads. It's recursive traces.
Not once across all threads. Once per thread.
Well, this PR does that. It fails if it detects another trace occuring during trace initialization.
How do you detect when to stop tracing? If an exception is raised and tracing is active, but higher up the stack, how do you know not to stop tracing?
How do you detect when to stop tracing? If an exception is raised and tracing is active, but higher up the stack, how do you know not to stop tracing?
It's protected by this https://github.com/python/cpython/blob/main/Python/optimizer.c#L720
Maybe, but the current approach seems hard to follow and fragile. I'm fairly sure we don't always stop tracing cleanly when exceptions are raised.
I think we need a clear way to express the two queries:
- "is tracing happening in this invocation of PyEval_EvalFrame?", and
- "is tracing happening in this thread?"
And add some asserts that (1) always implies (2).
We should only start tracing if (2) is false, and only stop tracing if (1) is true.
I think we already have these two queries:
- IS_JIT_TRACING()
- tstate->jit_tracer_state.initial_state.code != NULL
We should probably add a is_tracing field to the thread state for clarity instead of the very unclear tstate->jit_tracer_state.initial_state.code != NULL
@markshannon I updated to use a dedicated field to indicate tracing
@markshannon I found a repro where not stopping early during exceptions actually cause issues https://github.com/python/cpython/issues/143358.
Added code to stop early when an exception is raised.
When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.