Segfault in contextvars HAMT during isolation_scope() / new_scope() with uvicorn multiprocessing workers
How do you use Sentry?
Self-hosted/on-premise
Version
2.45.0
Steps to Reproduce
Environment
- sentry-sdk version: 2.45.0
- Python version: 3.10.x
- OS: Linux (x86_64)
- Framework: FastAPI + uvicorn with --workers 10 (multiprocessing spawn)
- Event loop: uvloop 0.21.0
Problem
Workers crash with SIGSEGV when Sentry SDK calls _current_scope.set() inside isolation_scope() or new_scope(). The crash occurs in Python's HAMT (Hash Array Mapped Trie) implementation used by contextvars.
GDB analysis shows that the HAMT node pointer has been corrupted - it points to a PyFunctionObject (_safe_repr_wrapper from sentry_sdk/serializer.py) instead of a HAMT node. When Python tries to clone this "node", it reads the function's vectorcall field as if it were b_array, then crashes trying to Py_INCREF a C function pointer.
Stack trace:
#0 _Py_INCREF (op=0x7b4d9a872260 <_PyFunction_Vectorcall>) at ./Include/object.h:472
#1 _Py_XINCREF (op=0x7b4d9a872260 <_PyFunction_Vectorcall>) at ./Include/object.h:558
#2 hamt_node_bitmap_clone (node=0x7b4c301c7e20) at Python/hamt.c:583
#3 hamt_node_bitmap_assoc (...) at Python/hamt.c:771
#4 _PyHamt_Assoc (...) at Python/hamt.c:2308
#5 contextvar_set (var=0x7b4d9916a5c0, val=<Scope at remote 0x7b4c30d62b60>) at Python/context.c:738
#6 PyContextVar_Set (ovar=<_contextvars.ContextVar at remote 0x7b4d9916a5c0>, ...) at Python/context.c:285
...
#12 _PyEval_EvalFrame (...) for file sentry_sdk/scope.py, line 1785, in isolation_scope
Python stack trace
File "sentry_sdk/scope.py", line 1785, in isolation_scope
current_token = _current_scope.set(forked_current_scope)
File "contextlib.py", line 135, in __enter__
return next(self.gen)
File "sentry_sdk/integrations/asgi.py", line 198, in _run_app
with sentry_sdk.isolation_scope() as sentry_scope:
Corrupted HAMT node:
(gdb) p *node
$2 = {
  ob_base = {ob_type = 0x7b4d9aa8a7a0 <PyFunction_Type>},
  b_bitmap = 2587471808,
  b_array = {'_safe_repr_wrapper'}  # Does not look like HAMT data
}
(gdb) p node->b_array[0]
$5 = '_safe_repr_wrapper'
(gdb) x/20x 0x7b4d9a872260
0x7b4d9a872260 <_PyFunction_Vectorcall>: ....
Sentry configuration
sentry_sdk.init(
    dsn=settings.SENTRY_DSN,
    traces_sample_rate=1.0,
    profiles_sample_rate=0.0,
    enable_tracing=True,
    send_default_pii=True,
    integrations=[
        FastApiIntegration(),
        HttpxIntegration(),
        AioHttpIntegration(),
        RedisIntegration(),
    ],
)
How to reproduce
- FastAPI app with sentry-sdk 2.45.0
- Run with uvicorn app:app --workers 10 (multiprocessing spawn)
- Send concurrent HTTP requests
- Workers eventually crash with SIGSEGV
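The "send concurrent HTTP requests" step could be driven by something like the following stdlib-only sketch. The function names, request count, and concurrency level are assumptions for illustration, not the load pattern that actually triggered the crash. It tallies responses by status code, so a dying worker would likely show up as connection errors rather than as 200s.

```python
import urllib.error
import urllib.request
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def fetch_status(url: str, timeout: float = 5.0) -> str:
    """Return the HTTP status code as a string, or the exception name on failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return str(response.status)
    except urllib.error.HTTPError as exc:
        # The server answered, but with a 4xx/5xx status.
        return str(exc.code)
    except Exception as exc:
        # Connection refused/reset, timeout, and similar transport failures.
        return type(exc).__name__


def send_concurrent_requests(url: str, total: int = 1000, concurrency: int = 50) -> Counter:
    """Issue `total` GET requests from `concurrency` threads and tally the outcomes."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return Counter(pool.map(fetch_status, [url] * total))
```

For example, send_concurrent_requests("http://127.0.0.1:8000/") would target uvicorn's default host and port; adjust the URL to whatever endpoint the app exposes.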
Observations
- Crash happens in spawned worker processes, not the main process
- The corrupted pointer always points to Sentry SDK's _safe_repr_wrapper function or _PyFunction_Vectorcall
- Workers that don't use Sentry SDK (different project, same infrastructure) don't crash
- Same codebase works fine in single-process mode
We tried setting OPENBLAS_NUM_THREADS=1 and it did not help.
It appears that a function object created during Sentry serialization (_safe_repr_wrapper) is somehow being written to a location that should contain a HAMT node pointer.
This could be:
- A use-after-free where HAMT memory is reused for function objects
- A buffer overflow in serialization code
- Race condition in scope/context handling with multiprocessing
Expected Result
No segfault
Actual Result
Eventual segfault
thx for the detailed report @sysradium
unfortunately, contextvars + multiprocessing is an old issue in the python ecosystem.
The only suggestion I have for now is to try moving the sentry_sdk.init call into the FastAPI lifespan, so that each worker gets fresh Sentry-related state.
https://fastapi.tiangolo.com/advanced/events/#lifespan
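from contextlib import asynccontextmanager

from fastapi import FastAPI
import sentry_sdk

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Runs once per worker process at application startup
    sentry_sdk.init(...)
    # A lifespan context manager must yield; the app serves requests here
    yield

app = FastAPI(lifespan=lifespan)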
@sl0thentr0py thanks for the explanation. I was thinking of making this change, but was unsure whether the FastAPI integration would work correctly.
Are there any issues with initialising FastApiIntegration inside the lifespan?
Also, just to mention: downgrading to 1.45.1 resolved the problem, but obviously we would love to upgrade to the latest v2 release.