defaultdict.__missing__ has data races
Bug report
Bug description:
Ideally, defaultdict would provide atomic "compute and add only if key is absent" behavior. Currently it does not: by the time __missing__ runs, another thread might already have inserted the key, or the key can be inserted between the call to the factory function and the PyObject_SetItem() call. In both cases the value stored by the other thread is replaced.
It would help to call PyDict_SetDefaultRef() so that an existing value is not replaced. That still leaves the issue that the factory function is called when a value already exists for that key, but that is arguably a less serious problem. We should not be replacing an existing value.
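To make the difference concrete, here is a minimal pure-Python sketch, not the actual C implementation. The class names and the `simulate` helper are hypothetical. `dict.setdefault` stands in for PyDict_SetDefaultRef(), and the insertion by "another thread" is simulated deterministically by having the factory itself store the key, which models a thread that wins the race between the factory call and the store.

```python
class OverwritingDefaultDict(dict):
    """Model of the current behavior: call the factory, then store unconditionally."""

    def __init__(self, default_factory):
        super().__init__()
        self.default_factory = default_factory

    def __missing__(self, key):
        value = self.default_factory()
        self[key] = value  # replaces any value inserted in the meantime
        return value


class SetDefaultDict(OverwritingDefaultDict):
    """Model of the proposed behavior: store only if the key is still absent."""

    def __missing__(self, key):
        value = self.default_factory()
        # setdefault keeps an existing value and returns it instead of ours.
        return self.setdefault(key, value)


def simulate(cls):
    """Return (value returned by the lookup, value stored afterwards)."""
    d = cls(None)

    def factory():
        # Simulates another thread inserting the key while the factory runs.
        d["k"] = "other thread"
        return "factory"

    d.default_factory = factory
    returned = d["k"]
    return returned, d["k"]


print(simulate(OverwritingDefaultDict))  # ('factory', 'factory')
print(simulate(SetDefaultDict))          # ('other thread', 'other thread')
```

In the first model the value stored by the "other thread" is silently lost. In the second it survives, although the factory still ran once unnecessarily, which is the less serious problem mentioned above.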
CPython versions tested on:
CPython main branch
Operating systems tested on:
Linux
Linked PRs
- gh-142569
- gh-142668
- gh-142832
Hi @nascheme ,
I'm interested in this issue. Could I give it a try? 😊
Wish you a good day!
Best Regards, Edward
I'm interested in this issue. Could I give it a try? 😊
Sure. Note that there was some discussion about the best way to fix this on the free-threading channel of the Discord server. Changing the defaultdict.__missing__ method to use PyDict_SetDefaultRef() seems like at least a good first step. So, if you want to make a PR to do that, it would help. It would also be useful to look at other __missing__ methods in the CPython repo and see if they have the same problem.
Hi @nascheme ,
I'm interested in this issue. Could I give it a try? 😊
Wish you a good day!
Best Regards, Edward
I already made a PR.
Hi @fatelei ,
I see the PR was closed. Are you still working on this?
Best Regards, Edward