Exception AccessControlListNotSupported thrown with version 2023.9.0
Reproduce
To reproduce in a Databricks environment 12.2 LTS:
import s3fs
s3 = s3fs.S3FileSystem(anon=False)
with s3.open(output_path, 'wb') as f:
    f.write(b'abc')
# --> OSError: [Errno 5] An error occurred (AccessControlListNotSupported) when calling the PutObject operation: The bucket does not allow ACLs
It works with 2023.6.0 but not with 2023.9.0.
Note:
- The bucket is configured with "Object Ownership: Bucket owner enforced".
- There is no problem when using a pure Python environment (with full control over all library versions).
Cause
It might be due to the recently merged #764 (the default ACL now has the value "private"). Any guidance on how to deal with this problem (apart from downgrading)?
Installed libraries (selection)
pip freeze
aiobotocore==2.5.4
awscli==1.29.40
boto3==1.16.7
botocore==1.31.17
fsspec==2023.9.0
s3fs==2023.9.0
s3transfer==0.6.2
...
Note: Pip reports some incompatibilities among these library versions.
Can you try passing acl="" (which was the previous default)?
We have the same issue since yesterday but we don't use s3fs directly but with Pandas.
Can you try passing acl="" (which was the previous default)?
Yes, with s3.open(output_path, 'wb', acl='') as f: it works.
But we use pandas, so this workaround does not help us.
We use s3fs via pandas at a number of places (different repositories). All the places are broken now.
Could the library be changed so that the default (no ACL provided) works again?
Hi @rwitzel 👋🏻
Have you tried something like :
s3_options = dict(s3_additional_kwargs={"ACL": ""})
df.to_parquet("s3://<<REDACTED>>/test_yann.parquet", storage_options=s3_options)
I get the same result, i.e. OSError: [Errno 5] An error occurred (AccessControlListNotSupported) when calling the PutObject operation: The bucket does not allow ACLs, so I guess my dict is not correct... 😢
EDIT : add some tests ⬇️
I ran some tests with this code:
df = pd.DataFrame({'A': []})
df.to_parquet("s3://REDACTED/test_yann.parquet", storage_options={'s3_additional_kwargs': {'ACL': 'WrongValue'}})
- With s3fs==2023.6.0, the exception is:
  File "/home/redacted/anaconda3/lib/python3.10/site-packages/s3fs/core.py", line 2054, in __init__
    raise ValueError("ACL not in %s", key_acls)
  ValueError: ('ACL not in %s', {'bucket-owner-read', 'public-read-write', 'public-read', 'authenticated-read', 'aws-exec-read', 'bucket-owner-full-control', 'private'})
- With s3fs==2023.9.0, the exception is:
  OSError: [Errno 5] An error occurred (AccessControlListNotSupported) when calling the PutObject operation: The bucket does not allow ACLs
Thanks for the feedback. It seems like this is common enough, and a standard behaviour on AWS S3 that we must allow for it. @zanussbaum , this may mean that we need to revert your PR, and that might break your workflows. Let's see if we can come up with something that works for everyone.
For a workaround, you can provide acl=None to open() and s3_additional_kwargs={"ACL": ""} in __init__, but as noted, this is tricky to do via storage_options in pandas or elsewhere.
Does anyone know if it's possible to set up a bucket with this kind of restriction on moto?
My attempt: https://github.com/fsspec/s3fs/pull/785
I would appreciate it if people on this thread could give my PR a go. I am happy to make a quick release if it clears things up.
PR #785 tested: It works again. Thank you so much, Martin!
I would appreciate it if people on this thread could give my PR a go. I am happy to make a quick release if it clears things up.
This fixed my issue, introduced in the most recent version, where the default ACL took precedence over the ACL I specified in s3_additional_kwargs. Do you know when this quick release might happen?
By Friday at the latest
I would appreciate it if people on this thread could give my PR a go. I am happy to make a quick release if it clears things up.
It's ok. Thanks 👍🏻