s5cmd icon indicating copy to clipboard operation
s5cmd copied to clipboard

storage/s3: `endpoint_url` from AWS config file

Open jsproede opened this issue 2 years ago • 13 comments

Hi 👋

We've used this tool a lot since we've discovered s5cmd for our S3 buckets. As we're not using AWS, we need to set the S3_ENDPOINT_URL each time we open a new Terminal.

As the AWS-CLI officially supports endpoint_url in the AWS config file for a profile, I thought it would be great if s5cmd would support endpoint_url in AWS config profiles as well. Unfortunately the AWS-SDK does not support endpoint_url, therefore we're using the go-ini module to read the AWS config file.

Therefore we extended the newSession function to use either the configured AWS config file (via AWS_CONFIG_FILE env-variable) or the default AWS config file ($HOME/.aws/config) and lookup/use the endpoint_url for the specified profile (only if configured and not specified by $S3_ENDPOINT_URL or the --endpoint-url-flag).

Additionally we created a test which covers the config file including a endpoint_url and a profile which has not set the endpoint_url.

What do you think about this?

If there is anything missing or we need to change something in this pull request, feel free to let us know 🙂

jsproede avatar Jul 14 '23 06:07 jsproede

I forgot, that AWS config profiles must be prefixed with profile. After pushing my changes the "ci / test (windows/go-1.17.x) (pull_request)" is now failing 🤔 I'll mark my PR as ready again. Can someone look into it (maybe restart the failing job)?

jsproede avatar Jul 14 '23 07:07 jsproede

What about the "region" and "ca_bundle" in the config file as well? It appears these aren't being consumed by s5cmd either. Is there a more complete way to bring in all the setting in the config file and pass those to the SDK?

tilimil avatar Sep 20 '23 18:09 tilimil

Thanks for the PR, looking forward for it being merged. Thanks to the reviewers in advance!

yakimant avatar Oct 05 '23 09:10 yakimant

What about the "region" and "ca_bundle" in the config file as well? It appears these aren't being consumed by s5cmd either. Is there a more complete way to bring in all the setting in the config file and pass those to the SDK?

@tilimil Great suggestion. We've mainly focused on the endpoint-URL. I'm not sure, if the SDK could properly handle region and ca_bundle itself and therefore go-ini would not be necessary for these two configuration properties. I need to take a deeper look into the code of s5cmd to properly answer that. Maybe one of the maintainers can help us with that? :)

jsproede avatar Oct 06 '23 06:10 jsproede

Hi! It's nice feature to have

MrWaip avatar Oct 18 '23 14:10 MrWaip

Needs rebase

k0ste avatar Feb 19 '24 16:02 k0ste

@k0ste Since the PR has now been open since July 2023, I wanted to save myself the time of working on a regular rebase as long as there is no feedback, because I don't even know yet whether the PR would be accepted at all.

jsproede avatar Feb 20 '24 06:02 jsproede

Any updates on this? Looking forward for this feature

Muhamob avatar May 27 '24 07:05 Muhamob

Any updates on this?

OS-M avatar Jun 24 '24 17:06 OS-M

Any progress on this? AWS CLI ~/.aws/config file with the [profile USER] section containing the endpoint_url =, or region = are very common...

valerytschopp avatar Jun 28 '24 13:06 valerytschopp

I did not receive any response yet 😕

jsproede avatar Jun 28 '24 14:06 jsproede

Finally I made a quick fork of s5cmd to fix this because it's so annoying to use --endpoint-url flag for our team (we're building yet another S3-compatible storage :/ ): https://github.com/OS-M/s5cmd.

You can install it with go install github.com/OS-M/s5cmd/[email protected]. However I give no guarantee of this working fine for everyone because my fix is too easy and straightforward.

The known issue is that my fork searches for endpoint only in credentials file, but not in config. However you can safely move all your settings to credentials file (which is even more useful imho)

OS-M avatar Jun 28 '24 14:06 OS-M

Concerning this PR: when implementing my own fix I ran in a behaviour of s3-clients creation on every query because I was modifying it in the newSession method so the config was unmodified outside this method and the sessions cache won't work. Then I moved the modification to the NewStorageOpts method so the sessions cache works fine. Here is the commit with the fix: https://github.com/OS-M/s5cmd/commit/178bece8e02255042a82649360e50abf0028a366.

I seems like this cache issue affects this PR too.

OS-M avatar Jun 28 '24 14:06 OS-M

  1. @jsproede thanks a ton for this PR. I also really like s5cmd and this PR really fixes one of the most annoying things for people that use non-AWS S3 on a regular basis! I'd really appreciate if you would rebase this PR and we could get this over the finish line.

  2. And if you don't mind me asking ... there also is the capability to set service specific endpoints in the AWS SDK which can then be used to reference this definition in multiple profiles:

[profile dev1]
output = json
services = testing-s3

[profile dev2]
output = text
services = testing-s3

[services testing-s3]
s3 = 
  endpoint_url = https://localhost:4567

Any chance you might add that functionality as well?

@jsproede I was also wondering if it makes sense to ini-parse the AWS config / credentials files yourself, rather than using their own library (https://pkg.go.dev/github.com/aws/aws-sdk-go-v2/config) to parse and collect all of relevant configuration sources?

frittentheke avatar Jul 23 '24 12:07 frittentheke

@ilkinulas @seruman @denizsurmeli would you consider accepting the functionality this PR is trying to add by any chance then?

frittentheke avatar Jul 23 '24 12:07 frittentheke

Hi @frittentheke,

Sorry for late reply. Personally I'm all against reading the shared config/credentials file. I think s5cmd should leave all file/env var reading to AWS SDK code and provide necessary flags to configure it, like --endpoint-url , --source-region etc.

With reading config file by ourselves, we kinda mess with SDK's discovery mechanism -IMHO-, like;

; cat ~/.aws/credentials

[some-profile]
aws_access_key_id =  dummy
aws_secret_access_key = dummy
region = us-east-1

; cat ~/.aws/config

[profile some-profile]
endpoint_url = http://localhost:4566

; AWS_PROFILE=some-profile ./s5cmd -r 0 ls
ERROR "ls": InvalidAccessKeyId: The AWS Access Key Id you provided does not exist in our records. status code: 403, request id: T53PBF24JEMYVBQP, host id: WSjAFfmmffVRXTqh0Gfu9mdjeuqBJAh2XNhFxzJApMO4UpH4lAn5Xkt+xlq4jIOm0Lw7k2RjK0A=

This hits the real AWS, AWS_PROFILE environment variable seems to be unused. With --profile flag it works.

; ./s5cmd --profile some-profile ls ; echo $?
0

Instead of rolling our own solution, I think we should aim for SDK upgrade; seems like SDK for Go v2 supports it; https://docs.aws.amazon.com/sdkref/latest/guide/feature-ss-endpoints.html#ss-endpoints-sdk-compat

It's been long since we last tried but it was no avail due to breaking of GCS support; https://github.com/peak/s5cmd/pull/478#issuecomment-1249311139

seruman avatar Jul 30 '24 07:07 seruman

This PR has been open for over a year now, to be honest until this week without any response from the maintainers, although the desire to merge this PR has been expressed several times by others. When I implemented the changes with go-ini, endpoint_url was not supported by the AWS SDK for Go in AWS profiles. At that time, we worked relatively little with AWS, but with AWS-compatible services from third-party providers and it was annoying to have to set the endpoint_url again and again. We now mainly work with AWS services, so we are no longer really affected by the problem.

If this is now supported by V2, I would also prefer an SDK upgrade, but that is not part of this PR. After all this time, the message from one of the maintainers (which is not even addressed to the me, the PR creator) seems like this PR will not be approved anyway, so I will close this PR now.

jsproede avatar Jul 31 '24 05:07 jsproede

@ilkinulas @seruman @denizsurmeli I honestly understand @jsproede's frustration and am sad to see this PR go to waste.

s5cmd is really an awesome tool. But as an open-source tool it needs to evolve, be maintained and in that be open to feedback to bugs and lack of features from the community, including patches to address those. I urge you to take an afternoon to plow through the open PRs (there are "only" 13 at the point of writing this) and issues. Some quite certain have been addressed already, others might contain valuable improvements or bug reports.

Back to topic ... Would you potentially pick up on the SDK upgrade approach yourself then? The issue of some half-hearted, pick an choose use of the AWS-like config methods still stands. The otherwise always appearing unexpected side effects such as --cmd-line-parameters vs. ENV-vars vs. ~/.aws/config files really render ANY other solution compared to using AWS' own config library pointless.

frittentheke avatar Jul 31 '24 06:07 frittentheke

After all this time, the message from one of the maintainers (which is not even addressed to the me, the PR creator) seems like this PR will not be approved anyway, so I will close this PR now.

@jsproede, I'm really sorry about this. I should have either given a proper review on time or mentioned you in the message above. I didn't mean to be disrespectful.

You didn't have to close the PR. I didn't mean to imply that this PR would not be approved. I was merely stating my opinion on the matter. As almost all maintainers subscribed alongside the reviewers, they might have different opinions on the matter.

seruman avatar Jul 31 '24 15:07 seruman

Any updates on this?

zalsader avatar Oct 03 '24 17:10 zalsader