aiohttp.GunicornWebWorker: Unknown child process pid xxx, will report returncode 255
Describe the bug
aiohttp.GunicornWebWorker interferes with waiting for subprocesses under asyncio: subprocesses created through asyncio.create_subprocess_exec are reported with the wrong exit code.
This might be related to https://github.com/python/cpython/issues/87744
This problem does not arise when using gunicorn with a different worker class, e.g. gunicorn --worker-class sync example
It also does not arise when using python directly, e.g. python example
The attached reproduction script is just a simple example of how to trigger this error. In our server it occurs with other subprocesses that are created in response to HTTP requests coming in through aiohttp, after the worker has booted.
If necessary, I can provide another example where an HTTP request triggers this behavior in a running app.
To Reproduce
- Create a simple Python script
import asyncio
import signal

async def main():
    while True:
        proc = await asyncio.create_subprocess_exec('sleep', '0.1')
        await asyncio.sleep(0.1)
        try:
            proc.send_signal(signal.SIGUSR1)
        except ProcessLookupError:
            pass
        assert (await proc.wait() != 255)

asyncio.run(main())
- Execute the Python script with gunicorn and wait: gunicorn --worker-class aiohttp.GunicornWebWorker example
- The worker encounters an asyncio warning: Unknown child process pid xxx, will report returncode 255
- Any code that relies on a correct exit code will behave unexpectedly
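For context, as far as I can tell asyncio logs the warning quoted in the steps above when its child watcher calls os.waitpid() for the subprocess and gets ChildProcessError, meaning the child was already reaped somewhere else. The stdlib-only sketch below only illustrates that mechanism (a second waitpid on the same pid fails); it is not a diagnosis of where the worker loses the child. The function name double_reap is made up for this example.

```python
import os
import sys


def double_reap():
    """Spawn a child, reap it, then try to reap it a second time."""
    # Start a trivial child process without waiting for it (Unix only).
    pid = os.spawnv(os.P_NOWAIT, sys.executable, [sys.executable, "-c", "pass"])

    # First waitpid: collects the real exit status.
    _, status = os.waitpid(pid, 0)
    exit_code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else None

    # Second waitpid: the child is gone, so the kernel reports ECHILD.
    try:
        os.waitpid(pid, 0)
    except ChildProcessError:
        second_attempt = "ChildProcessError"
    else:
        second_attempt = "reaped again"

    return exit_code, second_attempt


if __name__ == "__main__":
    print(double_reap())
```

Whoever reaps second has no way to recover the real status, which would explain why a placeholder such as 255 gets reported instead.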
Expected behavior
The subprocess should reliably report its real exit status (0 when sleep finishes before the signal arrives), never the placeholder 255
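To make the expected values concrete: on POSIX, asyncio's Process.returncode is 0 or a positive exit status for a normal exit and -N when the child was terminated by signal N. A small helper along these lines (describe_returncode is a hypothetical name, not part of asyncio) could be used to log what was actually observed. Note that 255 is ambiguous, since a program can also legitimately exit with that status.

```python
import signal


def describe_returncode(returncode):
    """Turn an asyncio/subprocess return code into a readable description."""
    if returncode is None:
        return "still running"
    if returncode < 0:
        # Negative values mean "terminated by signal -returncode" on POSIX.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"terminated by {name}"
    if returncode == 255:
        # Could be a genuine exit status or asyncio's placeholder.
        return "exit status 255 (possibly asyncio's placeholder for a lost child)"
    return f"exited with status {returncode}"


if __name__ == "__main__":
    for code in (0, -signal.SIGUSR1, 255):
        print(code, "->", describe_returncode(code))
```

In the reproduction script, both 0 and -SIGUSR1 would be plausible outcomes depending on whether sleep finishes before the signal is delivered.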
Logs/tracebacks
(pinta-VTQpJWDy-py3.8) [f.elsner@battlestation pinta-backend]$ gunicorn --worker-class aiohttp.GunicornWebWorker pinta.backend.experiment
[2022-10-21 14:04:12 +0200] [2443987] [INFO] Starting gunicorn 20.1.0
[2022-10-21 14:04:12 +0200] [2443987] [INFO] Listening at: http://127.0.0.1:8000 (2443987)
[2022-10-21 14:04:12 +0200] [2443987] [INFO] Using worker: aiohttp.GunicornWebWorker
[2022-10-21 14:04:12 +0200] [2443989] [INFO] Booting worker with pid: 2443989
Unknown child process pid 2444038, will report returncode 255
[2022-10-21 14:04:13 +0200] [2443989] [ERROR] Exception in worker process
Traceback (most recent call last):
  File "/home/f.elsner/.cache/pypoetry/virtualenvs/pinta-VTQpJWDy-py3.8/lib/python3.8/site-packages/gunicorn/arbiter.py", line 589, in spawn_worker
    worker.init_process()
  File "/home/f.elsner/.cache/pypoetry/virtualenvs/pinta-VTQpJWDy-py3.8/lib64/python3.8/site-packages/aiohttp/worker.py", line 51, in init_process
    super().init_process()
  File "/home/f.elsner/.cache/pypoetry/virtualenvs/pinta-VTQpJWDy-py3.8/lib/python3.8/site-packages/gunicorn/workers/base.py", line 134, in init_process
    self.load_wsgi()
  File "/home/f.elsner/.cache/pypoetry/virtualenvs/pinta-VTQpJWDy-py3.8/lib/python3.8/site-packages/gunicorn/workers/base.py", line 146, in load_wsgi
    self.wsgi = self.app.wsgi()
  File "/home/f.elsner/.cache/pypoetry/virtualenvs/pinta-VTQpJWDy-py3.8/lib/python3.8/site-packages/gunicorn/app/base.py", line 67, in wsgi
    self.callable = self.load()
  File "/home/f.elsner/.cache/pypoetry/virtualenvs/pinta-VTQpJWDy-py3.8/lib/python3.8/site-packages/gunicorn/app/wsgiapp.py", line 58, in load
    return self.load_wsgiapp()
  File "/home/f.elsner/.cache/pypoetry/virtualenvs/pinta-VTQpJWDy-py3.8/lib/python3.8/site-packages/gunicorn/app/wsgiapp.py", line 48, in load_wsgiapp
    return util.import_app(self.app_uri)
  File "/home/f.elsner/.cache/pypoetry/virtualenvs/pinta-VTQpJWDy-py3.8/lib/python3.8/site-packages/gunicorn/util.py", line 359, in import_app
    mod = importlib.import_module(module)
  File "/usr/lib64/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 843, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/f.elsner/coding/a4/pinta-backend/pinta/backend/experiment.py", line 40, in <module>
    asyncio.run(main())
  File "/usr/lib64/python3.8/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib64/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/home/f.elsner/coding/a4/pinta-backend/pinta/backend/experiment.py", line 38, in main
    assert (await proc.wait() != 255)
AssertionError
[2022-10-21 14:04:13 +0200] [2443989] [INFO] Worker exiting (pid: 2443989)
[2022-10-21 14:04:13 +0200] [2443987] [INFO] Shutting down: Master
[2022-10-21 14:04:13 +0200] [2443987] [INFO] Reason: Worker failed to boot.
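The "Unknown child process" line in the log above is emitted through the standard logging module on the logger named "asyncio". A running service that cannot afford an assertion could count these warnings instead. The following is a minimal sketch under that assumption; the class and function names are invented for illustration, and the message prefix is copied from the log in this report rather than from any documented API.

```python
import logging


class ChildWatcherWarnings(logging.Handler):
    """Collect asyncio's 'Unknown child process' warnings."""

    def __init__(self):
        super().__init__(level=logging.WARNING)
        self.messages = []

    def emit(self, record):
        message = record.getMessage()
        # Prefix taken from the log output in this report.
        if message.startswith("Unknown child process pid"):
            self.messages.append(message)


def install_child_watcher_monitor():
    """Attach the collecting handler to the 'asyncio' logger and return it."""
    handler = ChildWatcherWarnings()
    logging.getLogger("asyncio").addHandler(handler)
    return handler


monitor = install_child_watcher_monitor()

# Simulate the log line from this report to show what gets captured.
logging.getLogger("asyncio").warning(
    "Unknown child process pid %d, will report returncode 255", 2444038
)
print(len(monitor.messages), "lost child process(es) detected")
```

This only detects the symptom; the affected subprocess has still been reported with the placeholder return code by the time the handler sees the record.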
Python Version
$ python --version
> Python 3.8.14
aiohttp Version
$ python -m pip show aiohttp
> Version: 3.8.3
multidict Version
$ python -m pip show multidict
> Version: 6.0.2
yarl Version
$ python -m pip show yarl
> Version: 1.7.2
OS
$ cat /etc/fedora-release
Fedora release 36 (Thirty Six)
$ uname -a
Linux battlestation 5.19.14-200.fc36.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Oct 5 21:31:17 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Related component
Server
# Extended reproduction script: adds logging and an aiohttp app for Gunicorn
import asyncio
import contextlib
import logging
import signal

# Set up logging to capture warnings and errors
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


async def run_subprocess():
    """Run one subprocess, signal it, and return its exit code."""
    try:
        # Discard output: unread PIPEs combined with proc.wait() can deadlock
        proc = await asyncio.create_subprocess_exec(
            'sleep', '0.1',
            stdout=asyncio.subprocess.DEVNULL,
            stderr=asyncio.subprocess.DEVNULL,
        )
        logger.info(f"Started subprocess with PID: {proc.pid}")

        # Give the process time to run; it may have exited before we signal it
        await asyncio.sleep(0.1)

        try:
            # Send SIGUSR1 signal to the process
            proc.send_signal(signal.SIGUSR1)
        except ProcessLookupError:
            logger.warning(f"Process {proc.pid} already exited")

        # Wait for the process to complete and get the return code
        return_code = await proc.wait()
        logger.info(f"Subprocess {proc.pid} exited with return code: {return_code}")

        # 255 is the placeholder asyncio reports when it could not reap the child
        assert return_code != 255, f"Unexpected return code 255 for process {proc.pid}"
        return return_code
    except Exception as e:
        logger.error(f"Error in subprocess: {e}")
        raise


async def main():
    """Main loop to run subprocesses continuously."""
    while True:
        try:
            await run_subprocess()
        except AssertionError as e:
            logger.error(f"Assertion error: {e}")
            break
        except Exception as e:
            logger.error(f"Unexpected error: {e}")
            break
        # Small delay to prevent tight loop
        await asyncio.sleep(0.1)


async def subprocess_loop(app):
    """Run main() in the worker's event loop while the app is alive."""
    task = asyncio.create_task(main())
    yield
    task.cancel()
    with contextlib.suppress(asyncio.CancelledError):
        await task


def create_app():
    """Create an aiohttp application for Gunicorn."""
    from aiohttp import web

    app = web.Application()

    # Add a simple route to keep the server running
    async def health_check(request):
        return web.Response(text="OK")

    app.router.add_get('/', health_check)
    # Without this, the worker would only serve '/' and never spawn a subprocess
    app.cleanup_ctx.append(subprocess_loop)
    return app


if __name__ == '__main__':
    # For direct execution (not via Gunicorn)
    asyncio.run(main())
else:
    # For Gunicorn with aiohttp.GunicornWebWorker
    app = create_app()