
duckdb fails to load AWS SSO credentials to connect to AWS S3

Open johnleuner-absa opened this issue 1 year ago • 8 comments

What happens?

When creating an S3 secret with PROVIDER credential_chain, duckdb is unable to load valid credentials for a profile configured with the AWS sso mechanism.

Reading from S3 generates an http 403 error.

A workaround in the python API for duckdb is to call:

aws_session = boto3.Session()
creds = aws_session.get_credentials().get_frozen_credentials()

and specify the KEY_ID, SECRET and SESSION_TOKEN directly.

Similarly I can manually paste these values into the duckdb cli when creating a secret, but duckdb is unable to load these automatically.
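The boto3 workaround described above could be sketched roughly as follows. This is an illustration, not the extension's own method: the helper names (build_s3_secret_sql, create_secret_from_boto3) and the secret name sso_secret are made up for this example, and it assumes boto3 and the duckdb Python package are installed and that `aws sso login` has already been run for the profile.

```python
def build_s3_secret_sql(key_id, secret, session_token, region, name="sso_secret"):
    """Return a CREATE SECRET statement that passes explicit S3 credentials.

    The function name and the default secret name are hypothetical.
    """
    def quote(value):
        # Double any single quotes so the value is a valid SQL string literal.
        return "'" + value.replace("'", "''") + "'"

    return (
        f"CREATE OR REPLACE SECRET {name} (\n"
        f"    TYPE s3,\n"
        f"    KEY_ID {quote(key_id)},\n"
        f"    SECRET {quote(secret)},\n"
        f"    SESSION_TOKEN {quote(session_token)},\n"
        f"    REGION {quote(region)}\n"
        f");"
    )


def create_secret_from_boto3(con, region, profile_name=None):
    """Resolve credentials with boto3 and register them on a DuckDB connection."""
    import boto3  # imported lazily so the SQL builder works without boto3

    session = boto3.Session(profile_name=profile_name)
    # get_credentials() returns None when boto3 cannot resolve any credentials.
    creds = session.get_credentials().get_frozen_credentials()
    con.execute(
        build_s3_secret_sql(creds.access_key, creds.secret_key, creds.token, region)
    )
```

A call might look like create_secret_from_boto3(duckdb.connect(), 'af-south-1', 'myprofile'). SSO credentials are temporary, so the secret would need to be recreated once they expire.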

To Reproduce

Set the AWS profile (PowerShell):

$env:AWS_PROFILE="myprofile"

Then run the duckdb cli and execute:

CREATE SECRET secret2 (
    TYPE s3,
    PROVIDER credential_chain,
    REGION 'af-south-1',
    ENDPOINT 's3.af-south-1.amazonaws.com'
);

SELECT * FROM read_parquet('s3://mybucket/mypath/**/*parquet', HIVE_PARTITIONING=1);

OS:

windows x86_64

DuckDB Version:

v1.2.0 5f5512b827

DuckDB Client:

cli

Hardware:

No response

Full Name:

John Leuner

Affiliation:

Absa bank

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have tested with a stable release

Did you include all relevant data sets for reproducing the issue?

No - Other reason (please specify in the issue body)

Did you include all code required to reproduce the issue?

  • [x] Yes, I have

Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?

  • [x] Yes, I have

johnleuner-absa avatar Feb 27 '25 13:02 johnleuner-absa

Is this same as https://github.com/duckdb/duckdb-aws/issues/31?

paultiq avatar Feb 27 '25 18:02 paultiq

Yes I think it's very similar to

https://github.com/duckdb/duckdb-aws/issues/31

and probably also a regression of

https://github.com/duckdb/duckdb-aws/issues/14

johnleuner-absa avatar Feb 28 '25 06:02 johnleuner-absa

My testing also seemed to show that this happens not only on Windows x86-64 but also on Amazon Linux 2023

johnleuner-absa avatar Mar 13 '25 18:03 johnleuner-absa

I checked that this still fails in duckdb 1.2.1

johnleuner-absa avatar Mar 17 '25 09:03 johnleuner-absa

Another workaround for duckdb-cli is to first login with aws sso login and then export your credentials as environment variables:

aws configure export-credentials --format powershell

$Env:AWS_ACCESS_KEY_ID="ASXYZZZ"
$Env:AWS_SECRET_ACCESS_KEY="oXVAAAABBBBBCCCC"
$Env:AWS_SESSION_TOKEN="XXXYYYZZZ"

johnleuner-absa avatar Mar 26 '25 08:03 johnleuner-absa

I found that the trailing hash character was being treated differently by the AWS SDK library linked by duckdb and by the AWS cli tool used to perform aws sso login.

By removing the hash from this url I was able to use an SSO profile

[profile myprofile]
sso_start_url = https://orgname.awsapps.com/start#

Also, I had to set the AWS_PROFILE env var to myprofile before starting duckdb.
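To check whether a config file is affected by the trailing-hash case described above, something like the following sketch might help. It assumes the AWS config file can be read by Python's configparser, which should hold for simple files but is not guaranteed for every layout. The function name profiles_with_trailing_hash is hypothetical.

```python
import configparser


def profiles_with_trailing_hash(config_text):
    """Return the section names whose sso_start_url ends with '#'."""
    # interpolation=None keeps '%' characters in values from raising errors.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    return [
        section
        for section in parser.sections()
        if parser.get(section, "sso_start_url", fallback="").endswith("#")
    ]
```

The text would typically come from ~/.aws/config, for example via pathlib.Path.home() / '.aws' / 'config'. As a later comment notes, removing the hash did not help everyone, so this only rules out one possible cause.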

johnleuner-absa avatar Apr 07 '25 11:04 johnleuner-absa

We are facing the same issues when using AWS SSO.

Removing the anchor (#) within the sso_start_url did not fix the issue for us.

The http log tells us that there is not even an Authorization header attached to the request.

The workaround using aws configure export-credentials --format env works.
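From Python, the same export-credentials workaround could be scripted along these lines. This is a sketch that assumes AWS CLI v2 is on the PATH and that `--format env` prints `export NAME=value` lines. The function names are invented for this example.

```python
import shlex
import subprocess


def parse_export_lines(text):
    """Parse `export NAME=value` lines into a dict of environment variables."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("export "):
            continue
        name, _, value = line[len("export "):].partition("=")
        # shlex.split strips any shell quoting around the value.
        parts = shlex.split(value)
        env[name] = parts[0] if parts else ""
    return env


def exported_credentials(profile=None):
    """Run the AWS CLI and return the exported credentials as a dict."""
    cmd = ["aws", "configure", "export-credentials", "--format", "env"]
    if profile:
        cmd += ["--profile", profile]
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return parse_export_lines(result.stdout)
```

The returned dict could then be applied with os.environ.update(exported_credentials('myprofile')) before the secret is created, on the assumption that the credential chain picks up the AWS_* variables from the process environment.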

Phil1602 avatar May 12 '25 15:05 Phil1602

Also see my comments on the other issue

https://github.com/duckdb/duckdb-aws/issues/62#issuecomment-2839235394

johnleuner-absa avatar May 13 '25 11:05 johnleuner-absa

Another workaround for duckdb-cli is to first login with aws sso login and then export your credentials as environment variables:

aws configure export-credentials --format powershell

$Env:AWS_ACCESS_KEY_ID="ASXYZZZ"
$Env:AWS_SECRET_ACCESS_KEY="oXVAAAABBBBBCCCC"
$Env:AWS_SESSION_TOKEN="XXXYYYZZZ"

I am facing the same problems, and this is the only thing that worked for me: defining the envvars. Nothing else actually worked.

In my case, I am using Linux, Ubuntu 24.04 LTS.

jose-lpa avatar Nov 10 '25 09:11 jose-lpa

I don't have any issues using an SSO profile anymore, I tested with duckdb 1.4.1 on Windows x64

set AWS_PROFILE=myprofile

CREATE SECRET secret2 (
    TYPE s3,
    PROVIDER credential_chain,
    REGION 'af-south-1',
    ENDPOINT 's3.af-south-1.amazonaws.com'
);

SELECT * FROM read_parquet('s3://mybucket/mypath/**/*parquet', HIVE_PARTITIONING=1)

johnleuner-absa avatar Nov 12 '25 09:11 johnleuner-absa

Does it succeed with a secret that has CHAIN set explicitly to 'sso' when reading from an attached database, where that database is the Glue Iceberg REST API? I'm referring to SELECT syntax that references a fully qualified table name, as opposed to reading an S3 URI directly.

I don't think this issue is completely resolved, because that isn't working as of 1.4.1: SSO credentials don't work in that case. The only way such a query can work is by setting environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN), which defeats the purpose of using the credential chain in DuckDB Secrets.

bitadmiral avatar Dec 03 '25 19:12 bitadmiral