AttributeError: 'Session' object has no attribute 'create_client'

Open talhaanwarch opened this issue 3 years ago • 4 comments

Hi. I am trying to save a Hugging Face dataset to S3, but I am getting an error.

Here is the code

import botocore.session
import s3fs

storage_options = {"key": key, "secret": secret}
s3_session = botocore.session.Session()
# Note: this reassignment replaces the key/secret options above
storage_options = {"session": s3_session}
s3 = s3fs.S3FileSystem(**storage_options)
train_dataset.save_to_disk(train_dataset_path, fs=s3)

version
botocore: 1.27.59
aiobotocore: 2.4.0

Here is the error log

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Input In [7], in <cell line: 5>()
      3 # save train_dataset to s3
      4 train_dataset_path = f's3://{sess.default_bucket()}/{s3_prefix}/train'
----> 5 train_dataset.save_to_disk(train_dataset_path,fs=s3)
      7 test_dataset_path = f's3://{sess.default_bucket()}/{s3_prefix}/test'
      8 test_dataset.save_to_disk(test_dataset_path,fs=s3)

File ~/venv/lib/python3.8/site-packages/datasets/arrow_dataset.py:1134, in Dataset.save_to_disk(self, dataset_path, fs)
   1131 dataset_info = asdict(dataset._info)
   1133 # Save dataset + state + info
-> 1134 fs.makedirs(dataset_path, exist_ok=True)
   1135 with fs.open(Path(dataset_path, config.DATASET_ARROW_FILENAME).as_posix(), "wb") as dataset_file:
   1136     with ArrowWriter(stream=dataset_file) as writer:

File ~/venv/lib/python3.8/site-packages/fsspec/asyn.py:111, in sync_wrapper.<locals>.wrapper(*args, **kwargs)
    108 @functools.wraps(func)
    109 def wrapper(*args, **kwargs):
    110     self = obj or args[0]
--> 111     return sync(self.loop, func, *args, **kwargs)

File ~/venv/lib/python3.8/site-packages/fsspec/asyn.py:96, in sync(loop, func, timeout, *args, **kwargs)
     94     raise FSTimeoutError from return_result
     95 elif isinstance(return_result, BaseException):
---> 96     raise return_result
     97 else:
     98     return return_result

File ~/venv/lib/python3.8/site-packages/fsspec/asyn.py:53, in _runner(event, coro, result, timeout)
     51     coro = asyncio.wait_for(coro, timeout=timeout)
     52 try:
---> 53     result[0] = await coro
     54 except Exception as ex:
     55     result[0] = ex

File ~/venv/lib/python3.8/site-packages/s3fs/core.py:840, in S3FileSystem._makedirs(self, path, exist_ok)
    838 async def _makedirs(self, path, exist_ok=False):
    839     try:
--> 840         await self._mkdir(path, create_parents=True)
    841     except FileExistsError:
    842         if exist_ok:

File ~/venv/lib/python3.8/site-packages/s3fs/core.py:825, in S3FileSystem._mkdir(self, path, acl, create_parents, **kwargs)
    821 if region_name:
    822     params["CreateBucketConfiguration"] = {
    823         "LocationConstraint": region_name
    824     }
--> 825 await self._call_s3("create_bucket", **params)
    826 self.invalidate_cache("")
    827 self.invalidate_cache(bucket)

File ~/venv/lib/python3.8/site-packages/s3fs/core.py:331, in S3FileSystem._call_s3(self, method, *akwarglist, **kwargs)
    330 async def _call_s3(self, method, *akwarglist, **kwargs):
--> 331     await self.set_session()
    332     s3 = await self.get_s3(kwargs.get("Bucket"))
    333     method = getattr(s3, method)

File ~/venv/lib/python3.8/site-packages/s3fs/core.py:507, in S3FileSystem.set_session(self, refresh, kwargs)
    503 else:
    504     s3creator = self.session.create_client(
    505         "s3", config=conf, **init_kwargs, **client_kwargs
    506     )
--> 507     self._s3 = await s3creator.__aenter__()
    509 self._s3creator = s3creator
    510 # the following actually closes the aiohttp connection; use of privates
    511 # might break in the future, would cause exception at gc time

File ~/venv/lib/python3.8/site-packages/botocore/client.py:838, in BaseClient.__getattr__(self, item)
    835 if event_response is not None:
    836     return event_response
--> 838 raise AttributeError(
    839     f"'{self.__class__.__name__}' object has no attribute '{item}'"
    840 )

AttributeError: 'S3' object has no attribute '__aenter__'

talhaanwarch avatar Nov 05 '22 14:11 talhaanwarch

s3fs works using aiobotocore, not botocore alone. If you wanted to provide your own session object, it would have to be an aiobotocore session - but why do you want to do this at all?

martindurant avatar Nov 05 '22 16:11 martindurant
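
A minimal sketch of what the reply above suggests, assuming `s3fs` and `aiobotocore` are installed: either pass the credentials straight to `S3FileSystem`, or, if a custom session really is needed, build it with `aiobotocore.session.AioSession` instead of `botocore.session.Session`. The helper name `make_s3_filesystem` and its `use_custom_session` flag are hypothetical, introduced only for illustration.

```python
def make_s3_filesystem(key=None, secret=None, use_custom_session=False):
    """Build an S3FileSystem without handing it a plain botocore session.

    Hypothetical helper for illustration; not part of s3fs.
    """
    # Imported lazily so the helper can be defined even where s3fs is absent.
    import s3fs

    if use_custom_session:
        # s3fs expects an aiobotocore session here, not botocore.session.Session.
        from aiobotocore.session import AioSession

        return s3fs.S3FileSystem(key=key, secret=secret, session=AioSession())

    # Usual case: let s3fs create its own async session internally.
    return s3fs.S3FileSystem(key=key, secret=secret)
```

With a filesystem built this way, the call from the original post would stay the same: `train_dataset.save_to_disk(train_dataset_path, fs=s3)`.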

I've found myself trying something similar because of particular environment restrictions that mean I should avoid aiobotocore. It's related to the chained dependencies botocore-aiobotocore-moto.

Apart from addressing those (probably poorly managed) project restrictions, @martindurant, do you think that passing a botocore session could be possible, or would it even make sense?

CarlosVecina avatar Nov 21 '22 15:11 CarlosVecina

I'm afraid not: the class is fundamentally asynchronous and needs async resources to work. One might make an alternative sync version, and indeed a very long time ago s3fs did use botocore directly. We have no plans to work on this.

martindurant avatar Nov 21 '22 15:11 martindurant
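
To illustrate why a synchronous session cannot simply be swapped in, here is a standard-library-only sketch that reproduces the shape of the failure in the traceback. All class and function names are stand-ins invented for this example (they are not botocore, aiobotocore, or s3fs code); the only thing borrowed from the traceback is the pattern of awaiting `__aenter__()` on whatever `create_client` returns.

```python
import asyncio
from contextlib import asynccontextmanager


class SyncClient:
    """Stand-in for a plain synchronous client: no __aenter__/__aexit__."""


class SyncSession:
    """Stand-in for a synchronous session such as botocore's."""

    def create_client(self, service_name):
        return SyncClient()


class AsyncSession:
    """Stand-in for an async session: create_client returns an async context manager."""

    @asynccontextmanager
    async def create_client(self, service_name):
        yield f"async {service_name} client"


async def open_client(session):
    # Same shape as the failing line in the traceback:
    # the object returned by create_client must support __aenter__.
    creator = session.create_client("s3")
    return await creator.__aenter__()


def try_session(session):
    """Return the opened client, or a description of the AttributeError."""
    try:
        return asyncio.run(open_client(session))
    except AttributeError as exc:
        return f"AttributeError: {exc}"


if __name__ == "__main__":
    print(try_session(AsyncSession()))
    print(try_session(SyncSession()))
```

The synchronous stand-in fails at the same step as in the report, because its client object has no `__aenter__` to await.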

Wow, such a fast answer. That's true, I remember passing the botocore session a long time ago. Thanks for your reply; your daily work on these projects is impressive.

CarlosVecina avatar Nov 21 '22 16:11 CarlosVecina