gh-114940: Use fine-grained mutex protection for `PyInterpreterState.threads`
- Issue: gh-114940
This shouldn't have a NEWS entry--it's not a user-facing change. (The interpreter state structure is not public.)
While we're here, it might be good to replace the uses of HEAD_LOCK for locking a single interpreter, with INTERP_THREAD_LOCK. (In that case, this does need a news entry, as thread safety is technically user-facing--ping me to remove the skip news label if you take that approach.)
I chose to remove NEWS, and HEAD_LOCK will not be completely replaced. It depends on @ericsnowcurrently plans (if it wants to make the interpreter thread use INTERP_THREAD_LOCK instead of HEAD_LOCK for everything, I will replace all uses of HEAD_LOCK in the entire code base
HEAD_LOCK will not be completely replaced
Yeah, I know. We just need to replace it in the PyInterpreterState* (and possibly _PyXI*?) functions.
Currently I have replaced gc_free_threading.c, ceval.c, ceval_gil.c, instrumentation.c.
Edit:
All HEAD_LOCK, But I'm not sure whether to replace them. In my opinion, these files actually use locks just to protect the current process, but they lock the entire interpreter process.
Hmm, that doesn't seem right. I would expect most of the changes to be in pystate.c. I guess there could be others, but I'll let Eric weigh in on that. (I'm still learning about how isolated subinterpreters work.)
I'm actually not sure since this only seems to be related to pystate.c
Sorry to ping you directly @ericsnowcurrently, a lot of the stuff in gc_free_threading seems to be about cleaning up the current interpreter state, I changed them to lock the current interpreter, Is this correct?
Cleaning up an interpreter state (assuming you mean PyInterpreterState_Clear and PyInterpreterState_Delete) should be thread-safe, multiple threads won't be accessing something while it's getting deallocated.
Sorry, I seem to have described it wrong. They are all traversing the current state, not cleaning. Please forgive my wrong words.
And thanks for taking the time to work on this!
Thank you for your review! The statements I modified have a common feature, that is, they run in the current interpreter state and when traversing interp.threads.
I've marked this as a draft in the meantime. I'm guessing this will end up being a very large PR, depending on where we need to change things.
@ZeroIntensity Do you have time to label this RP?
For example, the interpreter-core and the subinterpreter. Because it will attract more people to judge this problem.
Maybe this PR can be opened, thanks to @kumaraditya303, @ericsnowcurrently for reviewing it. Opening this PR is to allow more people to review and make sure the changes are correct (although I have checked and most of them are fine).
Link this issue to this PR: https://github.com/python/cpython/issues/114016
cc @colesbury @vstinner
In fact, my current plan is to add mutex locks to all operations related to thread members. In fact, this is the meaning of its existence. But I'm thinking, as you said, maybe we should protect other functions worth adding locks.
A few notes here.
- I think this needs a NEWS entry now, because this is no longer internal. Something like "Improved performance with runtime thread contention" could work.
- With that being said, RUANG, I suggest you put extra focus on listening to advice here--this is a complicated issue you're addressing, and your previous history with PRs hasn't gone well.
- Please @ mention me when you believe this is fully ready, so I can run the buildbot fleet on it.
Yes, for this PR, I will be very focused and listen to all the advice. Because this change is very important, in fact, I have been evaluating the status of multithreading and interpreter recently. I hope this PR can be audited by more people. Thank you for your good advice.
Link this issue to this PR:
https://github.com/python/cpython/issues/125908
It describes the check when handling PyThreadState_Clear to make sure that HEAD_LOCK is not held before the call.·
And, CI warnings can be ignored, they come from (runtime not used)
#ifdef Py_GIL_DISABLED
assert(runtime->stoptheworld.world_stopped);
#endif
I replaced the lock of PyOS_BeforeFork/PyOS_AfterFork_Parent, and it turns out that this seems to be unfeasible. They will be deadlocked when running test_fork and test_subprocess.
(At least my evaluation results in ubuntu are as mentioned above).
In the interest of understanding what actually needs to be switched to the new lock, I ended up implementing this change separately: gh-127037. I'd be fine if we updated this PR to merge in my changes, since they're very similar and I'd be glad for you (@rruuaanng) to get credit. 😄
@ericsnowcurrently Oh, at a glance, I seem to think they are the same, but they are not consistent. This is very helpful to me. If we wait until the merger, maybe this PR can be merged with it.
Edit
They seem to be the same :(
I think just adding Co-authored-by on Eric's PR is the way to go. James'/RUANG's history of PRs isn't great, so I don't fully trust him to merge the changes correctly (especially with the previous force push fiasco on the mind), or to correctly address reviews on this large of a change. Sorry :(
If it's okay for me to be a co-signed, I think it's good 😀
I ended up splitting up the changes into gh-127077 and gh-127115, and credited @rruuaanng on the latter one. Let's close this PR.