AttributeError: 'Session' object has no attribute 'create_client'
Hi. I am trying to save a Hugging Face dataset to S3, but I am getting an error.
Here is the code:
import botocore.session
import s3fs
storage_options = {"key": key, "secret": secret}
s3_session = botocore.session.Session()
storage_options = {"session": s3_session}
s3 = s3fs.S3FileSystem(**storage_options)
train_dataset.save_to_disk(train_dataset_path,fs=s3)
Versions:
botocore: 1.27.59
aiobotocore: 2.4.0
Here is the error log:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Input In [7], in <cell line: 5>()
3 # save train_dataset to s3
4 train_dataset_path = f's3://{sess.default_bucket()}/{s3_prefix}/train'
----> 5 train_dataset.save_to_disk(train_dataset_path,fs=s3)
7 test_dataset_path = f's3://{sess.default_bucket()}/{s3_prefix}/test'
8 test_dataset.save_to_disk(test_dataset_path,fs=s3)
File ~/venv/lib/python3.8/site-packages/datasets/arrow_dataset.py:1134, in Dataset.save_to_disk(self, dataset_path, fs)
1131 dataset_info = asdict(dataset._info)
1133 # Save dataset + state + info
-> 1134 fs.makedirs(dataset_path, exist_ok=True)
1135 with fs.open(Path(dataset_path, config.DATASET_ARROW_FILENAME).as_posix(), "wb") as dataset_file:
1136 with ArrowWriter(stream=dataset_file) as writer:
File ~/venv/lib/python3.8/site-packages/fsspec/asyn.py:111, in sync_wrapper.<locals>.wrapper(*args, **kwargs)
108 @functools.wraps(func)
109 def wrapper(*args, **kwargs):
110 self = obj or args[0]
--> 111 return sync(self.loop, func, *args, **kwargs)
File ~/venv/lib/python3.8/site-packages/fsspec/asyn.py:96, in sync(loop, func, timeout, *args, **kwargs)
94 raise FSTimeoutError from return_result
95 elif isinstance(return_result, BaseException):
---> 96 raise return_result
97 else:
98 return return_result
File ~/venv/lib/python3.8/site-packages/fsspec/asyn.py:53, in _runner(event, coro, result, timeout)
51 coro = asyncio.wait_for(coro, timeout=timeout)
52 try:
---> 53 result[0] = await coro
54 except Exception as ex:
55 result[0] = ex
File ~/venv/lib/python3.8/site-packages/s3fs/core.py:840, in S3FileSystem._makedirs(self, path, exist_ok)
838 async def _makedirs(self, path, exist_ok=False):
839 try:
--> 840 await self._mkdir(path, create_parents=True)
841 except FileExistsError:
842 if exist_ok:
File ~/venv/lib/python3.8/site-packages/s3fs/core.py:825, in S3FileSystem._mkdir(self, path, acl, create_parents, **kwargs)
821 if region_name:
822 params["CreateBucketConfiguration"] = {
823 "LocationConstraint": region_name
824 }
--> 825 await self._call_s3("create_bucket", **params)
826 self.invalidate_cache("")
827 self.invalidate_cache(bucket)
File ~/venv/lib/python3.8/site-packages/s3fs/core.py:331, in S3FileSystem._call_s3(self, method, *akwarglist, **kwargs)
330 async def _call_s3(self, method, *akwarglist, **kwargs):
--> 331 await self.set_session()
332 s3 = await self.get_s3(kwargs.get("Bucket"))
333 method = getattr(s3, method)
File ~/venv/lib/python3.8/site-packages/s3fs/core.py:507, in S3FileSystem.set_session(self, refresh, kwargs)
503 else:
504 s3creator = self.session.create_client(
505 "s3", config=conf, **init_kwargs, **client_kwargs
506 )
--> 507 self._s3 = await s3creator.__aenter__()
509 self._s3creator = s3creator
510 # the following actually closes the aiohttp connection; use of privates
511 # might break in the future, would cause exception at gc time
File ~/venv/lib/python3.8/site-packages/botocore/client.py:838, in BaseClient.__getattr__(self, item)
835 if event_response is not None:
836 return event_response
--> 838 raise AttributeError(
839 f"'{self.__class__.__name__}' object has no attribute '{item}'"
840 )
AttributeError: 'S3' object has no attribute '__aenter__'
s3fs works using aiobotocore, not botocore alone. If you wanted to provide your own session object, it would have to be an aiobotocore session, not a botocore one. But why did you want to do this at all?
I've found myself trying something similar because restrictions in my environment mean I should avoid aiobotocore. It's related to the chained botocore/aiobotocore/moto dependencies.
Setting aside those (probably poorly managed) project restrictions, @martindurant, do you think passing a botocore session could be possible, or would make sense?
I'm afraid not: the class is fundamentally asynchronous and needs async resources to work. One might make an alternative sync version, and indeed a very long time ago s3fs did use botocore directly. We have no plans to work on this.
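To illustrate why a synchronous object cannot stand in here, the following is a toy sketch using only the standard library. `SyncClient` and `AsyncClientCreator` are invented stand-ins, not real botocore or aiobotocore classes; the point is only that awaiting `__aenter__()` on an object without async context-manager methods fails the same way as in the traceback above.

```python
import asyncio


class SyncClient:
    """Hypothetical stand-in for a synchronous client: no __aenter__/__aexit__."""


class AsyncClientCreator:
    """Hypothetical stand-in for an async client creator."""

    async def __aenter__(self):
        return "async-s3-client"

    async def __aexit__(self, exc_type, exc, tb):
        return False


async def open_client(creator):
    # Same shape as the failing step in the traceback:
    # the async code awaits __aenter__() on whatever it was given.
    return await creator.__aenter__()


def try_open(creator):
    """Run open_client and report either the client or the AttributeError."""
    try:
        return asyncio.run(open_client(creator))
    except AttributeError as err:
        return f"AttributeError: {err}"


print(try_open(AsyncClientCreator()))
print(try_open(SyncClient()))
```

The first call returns the async client, while the second reports a missing `__aenter__` attribute, which is the same kind of failure s3fs hits when it is given a botocore session.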
Wow, such a fast answer. That's true, I remember passing a botocore session a long time ago. Thanks for your reply; your daily work on these projects is impressive.