
Exception AccessControlListNotSupported thrown with version 2023.9.0

Open rwitzel opened this issue 2 years ago • 12 comments

Reproduce

To reproduce in a Databricks environment 12.2 LTS:

import s3fs
s3 = s3fs.S3FileSystem(anon=False)
with s3.open(output_path, 'wb') as f:
    f.write(b'abc')
# --> OSError: [Errno 5] An error occurred (AccessControlListNotSupported) when calling the PutObject operation: The bucket does not allow ACLs

It works with 2023.6.0 but not with 2023.9.0.

Note:

  • The bucket is configured with "Object Ownership: Bucket owner enforced".
  • There is no problem when using a pure Python environment (with full control over all library versions).

Cause

It might be due to the recently released #764 (the default ACL now has the value "private"). Any guidance on how to deal with this problem (apart from downgrading)?

Installed libraries (selection)

pip freeze
aiobotocore==2.5.4
awscli==1.29.40
boto3==1.16.7
botocore==1.31.17
fsspec==2023.9.0
s3fs==2023.9.0
s3transfer==0.6.2
...

Note: Pip reports some incompatibilities among these library versions.
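
Since the behaviour depends on which versions are actually installed, a short check from inside the affected environment can confirm them. This is only a sketch using the standard library; the helper name installed_versions is hypothetical and the package list simply mirrors the selection above.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Not installed in this environment
            versions[name] = None
    return versions


# Package names taken from the pip freeze selection in this report
for pkg, ver in installed_versions(
    ["s3fs", "fsspec", "aiobotocore", "botocore", "boto3"]
).items():
    print(f"{pkg}=={ver}")
```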

rwitzel avatar Sep 04 '23 13:09 rwitzel

Can you try passing acl="" (which was the previous default)?

martindurant avatar Sep 04 '23 13:09 martindurant

We have the same issue since yesterday but we don't use s3fs directly but with Pandas.

yilas avatar Sep 04 '23 14:09 yilas

Can you try passing acl="" (which was the previous default)?

Yes with s3.open(output_path, 'wb', acl='') as f: works.

But we use pandas, so this way does not help.

We use s3fs via pandas at a number of places (different repositories). All the places are broken now.

Could the library be changed so that the default (no ACL provided) works again?
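
Until the default changes, one possible stopgap for pandas users is to open the file through s3fs with acl="" (the call reported above to work on 2023.9.0) and pass the open file object to pandas instead of an s3:// URL. The following is a minimal sketch under that assumption; write_with_empty_acl is a hypothetical helper, and the bucket name in the usage comment is a placeholder.

```python
def write_with_empty_acl(fs, path, write_fn):
    """Open `path` for binary writing with acl="" and pass the file to `write_fn`.

    `fs` is assumed to behave like s3fs.S3FileSystem; `write_fn` is any
    callable that writes to a binary file object, e.g. df.to_parquet.
    """
    # acl="" is the value reported in this thread to avoid the error
    with fs.open(path, "wb", acl="") as f:
        write_fn(f)


# Intended use (not executed here; needs s3fs, pandas, a parquet engine
# and real credentials):
#
#   import pandas as pd
#   import s3fs
#
#   fs = s3fs.S3FileSystem(anon=False)
#   df = pd.DataFrame({"A": [1, 2, 3]})
#   write_with_empty_acl(fs, "s3://my-bucket/test.parquet", df.to_parquet)
```

This still requires touching every call site, so it does not replace a fix in the library itself.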

rwitzel avatar Sep 04 '23 15:09 rwitzel

Hi @rwitzel 👋🏻

Have you tried something like :

s3_options = dict(s3_additional_kwargs={"ACL": ''})
df.to_parquet ("s3://<<REDACTED>>/test_yann.parquet", storage_options=s3_options)

I get the same result, i.e. OSError: [Errno 5] An error occurred (AccessControlListNotSupported) when calling the PutObject operation: The bucket does not allow ACLs, so I guess my dict is not correct... 😢


EDIT : add some tests ⬇️

I made some tests with that code :

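import pandas as pd

# Empty frame; the ACL value is deliberately invalid to see where it gets rejected
df = pd.DataFrame({'A': []})
df.to_parquet("s3://REDACTED/test_yann.parquet", storage_options={'s3_additional_kwargs': {'ACL': 'WrongValue'}})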
  • With s3fs==2023.6.0, the exception is:
  File "/home/redacted/anaconda3/lib/python3.10/site-packages/s3fs/core.py", line 2054, in __init__
    raise ValueError("ACL not in %s", key_acls)
ValueError: ('ACL not in %s', {'bucket-owner-read', 'public-read-write', 'public-read', 'authenticated-read', 'aws-exec-read', 'bucket-owner-full-control', 'private'})
  • With s3fs==2023.9.0, the exception is:
OSError: [Errno 5] An error occurred (AccessControlListNotSupported) when calling the PutObject operation: The bucket does not allow ACLs
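
For reference, the client-side check that the 2023.6.0 traceback points to can be approximated as below. This is an illustrative sketch, not the s3fs source: the set of canned ACL names is copied from the ValueError message above, and check_acl is a hypothetical name.

```python
# Canned ACL names copied from the ValueError message in the traceback
KEY_ACLS = {
    "private",
    "public-read",
    "public-read-write",
    "authenticated-read",
    "aws-exec-read",
    "bucket-owner-read",
    "bucket-owner-full-control",
}


def check_acl(acl):
    """Return `acl` unchanged if it is empty/None or a known canned ACL."""
    if acl and acl not in KEY_ACLS:
        # Format the message eagerly; passing the set as a second argument
        # to ValueError gives the tuple-style message shown in the traceback.
        raise ValueError(f"ACL not in {sorted(KEY_ACLS)}")
    return acl
```

With such a check, 'WrongValue' is rejected before any request is sent, whereas an empty string or None passes through untouched.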

yilas avatar Sep 05 '23 13:09 yilas

Thanks for the feedback. It seems this is common enough, and standard enough behaviour on AWS S3, that we must allow for it. @zanussbaum, this may mean that we need to revert your PR, and that might break your workflows. Let's see if we can come up with something that works for everyone.

For a workaround, you can provide acl=None to open() and s3_additional_kwargs={"ACL": ""} in __init__, but as noted, this is tricky to do via storage_options in pandas or elsewhere.

martindurant avatar Sep 05 '23 14:09 martindurant

Does anyone know if it's possible to set up a bucket with this kind of restriction on moto?

martindurant avatar Sep 05 '23 14:09 martindurant

My attempt: https://github.com/fsspec/s3fs/pull/785

martindurant avatar Sep 05 '23 14:09 martindurant

I would appreciate it if people on this thread could give my PR a go. I am happy to make a quick release if it clears things up.

martindurant avatar Sep 05 '23 17:09 martindurant

PR #785 tested: It works again. Thank you so much, Martin!

rwitzel avatar Sep 05 '23 19:09 rwitzel

I would appreciate it if people on this thread could give my PR a go. I am happy to make a quick release if it clears things up.

This fixed my issue, introduced in the most recent version, where the default acl took precedence over the acl I specified in s3_additional_kwargs. Do you know when this quick release might happen?

ArtnerC avatar Sep 12 '23 20:09 ArtnerC

By Friday at the latest

martindurant avatar Sep 12 '23 20:09 martindurant

I would appreciate it if people on this thread could give my PR a go. I am happy to make a quick release if it clears things up.

It's ok. Thanks 👍🏻

yilas avatar Sep 14 '23 05:09 yilas