It is not possible to use synchronous and asynchronous calls together

Open vovochka404 opened this issue 2 years ago • 3 comments

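# fs is a single S3FileSystem instance shared by both calls
ls = fs.ls("/path")  # synchronous call
f = await fs.open_async("/path/file1")  # asynchronous call on the same instance
data = await f.read()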

Will lead to:

Task <Task pending name='Task-1' coro=<main() running at /.../test.py:100> cb=[_run_until_complete_cb() at /opt/homebrew/Cellar/python@3.9/3.9.17/Frameworks/Python.framework/Versions/3.9/lib/python3.9/asyncio/base_events.py:184]> got Future <Future pending cb=[shield.<locals>._outer_done_callback() at /opt/homebrew/Cellar/python@3.9/3.9.17/Frameworks/Python.framework/Versions/3.9/lib/python3.9/asyncio/tasks.py:907]> attached to a different loop

The exception occurs here: aiohttp/connector.py:1145

The problem is that both calls share the same session, and that session holds an aiohttp.TCPConnector which is bound to the event loop that was current when it was created. Synchronous and asynchronous requests are executed in different event loops, so whenever the current event loop is not the one the aiohttp.TCPConnector was created in, we get this exception.
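The loop mismatch can be reproduced without s3fs or aiohttp. The sketch below is only an illustration under stated assumptions: the `resource` Future is a hypothetical stand-in for an object created on one loop (as the connector is), and `wait_on` and `demo_cross_loop_await` are made-up names. It is not aiohttp's actual code path, but it triggers the same asyncio check.

```python
import asyncio


async def wait_on(resource):
    # Awaiting an object that belongs to another loop is the step that fails.
    return await resource


def demo_cross_loop_await():
    """Return the error message raised when loop_b awaits loop_a's Future."""
    loop_a = asyncio.new_event_loop()
    loop_b = asyncio.new_event_loop()
    try:
        # Hypothetical stand-in for the connector: a Future bound to loop_a.
        resource = loop_a.create_future()
        try:
            # A task running on loop_b now waits on something owned by loop_a.
            loop_b.run_until_complete(wait_on(resource))
        except RuntimeError as exc:
            return str(exc)
        return None
    finally:
        loop_a.close()
        loop_b.close()


if __name__ == "__main__":
    print(demo_cross_loop_await())
```

The returned message has the same "Task ... got Future ... attached to a different loop" shape as the one reported above.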

The simplest solution is to set refresh to True in s3fs.core.S3FileSystem.set_session, but it seems that this is not the best solution.

vovochka404 avatar Jan 19 '24 07:01 vovochka404

Yes, you are completely right: it is expected that the event loop is either in the same thread as the calling code (in which case you use the async methods) or not (in which case you don't). The asynchronous= argument can be used to mark this distinction, since it will bust the caching mechanism; originally this argument made a real difference, but that difference has shrunk to nothing.

"The simplest solution is to set refresh to True in s3fs.core.S3FileSystem.set_session"

You should create two S3FileSystem instances, one in a coroutine for use with async, and one outside.
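A minimal sketch of the two-instance approach follows. The bucket paths and the helper names `list_sync` and `read_async` are hypothetical, credentials are assumed to come from the environment, and `_cat_file` is used for the read only to keep the example short. The session setup and teardown follow the pattern in the s3fs documentation for async use; treat this as an outline rather than a tested recipe.

```python
import asyncio


def list_sync(path):
    # s3fs is a third-party dependency; it is imported inside the function
    # so the sketch can be loaded without it being installed.
    import s3fs

    # Synchronous instance, created outside any coroutine. Its requests
    # run on the event loop that fsspec manages in a separate thread.
    fs = s3fs.S3FileSystem()
    return fs.ls(path)


async def read_async(path):
    import s3fs

    # Asynchronous instance, created inside the coroutine so that its
    # session is bound to the loop that is currently running.
    fs = s3fs.S3FileSystem(asynchronous=True)
    session = await fs.set_session()
    try:
        return await fs._cat_file(path)
    finally:
        await session.close()


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(list_sync("my-bucket/path"))
    data = asyncio.run(read_async("my-bucket/path/file1"))
    print(len(data))
```

Because the two instances are constructed with different arguments, they are not served from the same cache entry and each keeps its own session.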

martindurant avatar Jan 23 '24 13:01 martindurant

Maybe we could have separate sync and async sessions inside one fs instance?

vovochka404 avatar Jan 25 '24 11:01 vovochka404

The session is always async, so that you can do bulk operations even in sync code. The difference is where the event loop is running, so trying to maintain multiple loops running in different threads within the one instance seems to me to be a bad idea.
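The general pattern described here, async code driven from synchronous callers through a loop that runs in another thread, can be sketched with the standard library alone. This is an illustration of the idea, not the fsspec or s3fs implementation; `fetch`, `fetch_many` and `fetch_many_sync` are hypothetical names, and the sleep stands in for a network request.

```python
import asyncio
import threading

# A dedicated event loop that runs forever in a background daemon thread.
_loop = asyncio.new_event_loop()
threading.Thread(target=_loop.run_forever, daemon=True).start()


async def fetch(item):
    # Stand-in for one async request.
    await asyncio.sleep(0.01)
    return item * 2


async def fetch_many(items):
    # Bulk operation: all requests run concurrently on the same loop.
    return await asyncio.gather(*(fetch(i) for i in items))


def fetch_many_sync(items):
    # Synchronous entry point: submit the coroutine to the background
    # loop and block the calling thread until the result is ready.
    future = asyncio.run_coroutine_threadsafe(fetch_many(items), _loop)
    return future.result()


if __name__ == "__main__":
    print(fetch_many_sync([1, 2, 3]))
```

Everything created by the coroutines lives on the background loop, which is why mixing in objects from a second loop inside the same instance leads to the error discussed above.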

martindurant avatar Jan 25 '24 14:01 martindurant