
How to delete object versions?

Open iainelder opened this issue 4 years ago • 2 comments

I want to delete a non-empty bucket that has versioning enabled.

To do that in the S3 console first I would

  1. use the "Empty Bucket" option
  2. use the "Delete Bucket" option

I see that to cover step 2 s5cmd has an rb command.

To cover step 1, I tried to use the rm command with syntax like this: s3://bucketname/*

It placed a delete marker on all the objects in the bucket, but it didn't permanently delete any versions.

So S3 still considers the bucket non-empty, and so step 2 fails.

$ s5cmd rm s3://bucketname/*
rm s3://bucketname/...
...

$ s5cmd rb s3://bucketname
ERROR "rb s3://bucketname": BucketNotEmpty: The bucket you tried to delete is not empty. You must delete all versions in the bucket. status code: 409, request id: XXXXXXXX, host id: XXXXXXX

I don't see an option for permanent deletion in the rm command help.

Am I missing something?

I'm using version v1.4.0-d7a0dda.

iainelder avatar Dec 01 '21 19:12 iainelder

s5cmd doesn't have version support currently.

igungor avatar Feb 18 '22 07:02 igungor

You're right, s5cmd does not support versioning-enabled buckets. The simplest way to do it seems to be using the S3 console (as you said) or an alternative such as s3wipe.

The AWS CLI has no single command for emptying or deleting a versioning-enabled bucket, though it might be achieved by using aws s3api.

Adding versioning support to s5cmd

A few notes

  • There is no direct way to delete all versions of an object and its delete markers in the aws-sdk. So we should list all versions of the objects using the ListObjectVersions method (or its derivatives), then delete them using both their keys and version IDs.

  • Version IDs are generated by AWS and cannot be edited or set.

  • Version IDs are Unicode, UTF-8 strings.

  • There is no order relation between version IDs of an object.
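
The list-then-delete approach from the first note can be sketched with the AWS CLI. This is only an illustration, not how s5cmd implements it: delete_all_versions is a hypothetical helper name, and the sketch assumes the AWS CLI is installed with credentials configured and that object keys contain no tabs or newlines.

```shell
# Hypothetical helper: permanently delete every object version and every
# delete marker in a bucket, one (key, version ID) pair at a time.
delete_all_versions() {
  local bucket="$1"
  local field
  # ListObjectVersions returns object versions and delete markers separately.
  for field in Versions DeleteMarkers; do
    aws s3api list-object-versions \
      --bucket "$bucket" \
      --query "${field}[].[Key,VersionId]" \
      --output text |
    while IFS="$(printf '\t')" read -r key version_id; do
      # An empty result is printed as a lone "None", which leaves
      # version_id empty; skip such lines.
      [ -z "$version_id" ] && continue
      aws s3api delete-object \
        --bucket "$bucket" \
        --key "$key" \
        --version-id "$version_id"
    done
  done
}
```

Calling delete_all_versions "$bucket" issues one delete request per version, which is slow on large buckets; a batched DeleteObjects call would be the natural next step.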

Review of various CLI tools

Minio CLI

Rich support

  • It has a version subcommand to manage bucket versioning.
  • Its ls, du, and rm subcommands support a versions flag to refer to all versions of objects.
  • Its cat, cp, and rm subcommands support a version-id flag to specify the version of an object.
  • Its rb subcommand has a force flag to "remove ... all object versions".
  • Its subcommands have rewind, older-than, and newer-than flags to filter objects (and versions, if combined with the versions flag) by time interval.

s4cmd

Little or no support

  • It only has API-VersionId to specify the version ID of an object.

aws s3

Little or no support

  • Does not support versioning; it suggests using s3api "for the low level S3 commands for the CLI".

gsutil

Medium to rich support, though it uses a "generation number" instead of a version ID

  • It has a versioning subcommand to manage bucket versioning.

  • Its ls, du, and rm subcommands support an a flag to refer to all versions of object(s).

  • Its cp subcommand supports an A flag to refer to all versions of object(s).

  • Its rb subcommand does not have an option to empty the bucket.

  • To specify a particular version of an object, the generation number can be appended to the end of the object key, separated by a hash mark (#).

    • e.g. gsutil cp gs://BUCKET_NAME/OBJECT_NAME#GENERATION_NUMBER gs://BUCKET_NAME/OBJECT_NAME
  • gcloud alpha storage also supports those operations with an all-versions flag.

Proposal

Note: Refer to #475 for further discussion and updates.

  • Add the all-versions flag to the following subcommands:
    • [x] ls
    • [ ] rm
    • [x] du
  • Add the version-id flag to the following subcommands:
    • [ ] cp
    • [x] cat
    • [x] rm
    • [x] du

@peak/big-data what do you think? To what extent should s5cmd support versioning?

kucukaslan avatar Jul 06 '22 10:07 kucukaslan

@igungor @Kucukaslan, when can we expect this feature to be released?

This issue has been closed as completed, but I still can't delete object versions with the latest release.

$ s5cmd version
v2.1.0-beta.1-3e08061
$ s5cmd ls --all-versions "s3://${bucket}"
Incorrect Usage: flag provided but not defined: -all-versions
...

I use this setup to test it in my own account.

bucket=$"isme-$(gpw 1 8)"

aws s3api create-bucket \
--bucket "$bucket"

aws s3api put-bucket-versioning \
--bucket "$bucket" \
--versioning-configuration Status=Enabled

printf "key" > /tmp/key

aws s3api put-object \
--bucket "$bucket" \
--key key \
--body /tmp/key

iainelder avatar Jun 16 '23 10:06 iainelder

This issue has been closed as completed, but I still can't delete object versions with the latest release.

Sorry for the confusion. The issue was closed because the relevant PR was merged to master, where development happens, but the release that includes that PR isn't out yet (for the moment, you may try building and installing from master). Unfortunately, I don't know when it will be released, but I hope it will be soon.

kucukaslan avatar Jun 16 '23 23:06 kucukaslan

@iainelder it is released with v2.1.0.

igungor avatar Jun 19 '23 11:06 igungor

Awesome, thank you both!

Now I can delete object versions like this:

$ s5cmd ls --all-versions "s3://${bucket}"
2023/06/19 12:38:56                 3  key                                                s2K4Rjx_HMjp0k.KnEO9l_yKuh3SXh4P
$ s5cmd rm --all-versions "s3://${bucket}/*"
rm s3://isme-.../key                             s2K4Rjx_HMjp0k.KnEO9l_yKuh3SXh4P
$ s5cmd ls --all-versions "s3://${bucket}"
ERROR "ls --all-versions=true s3://isme-...": no object found
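
Putting the two steps from the original question together, a minimal sketch might look like the following. It assumes s5cmd v2.1.0 or later, and empty_and_remove_bucket is a hypothetical helper name, not part of s5cmd.

```shell
# Hypothetical helper: empty a versioning-enabled bucket, then remove it.
empty_and_remove_bucket() {
  local bucket="$1"
  # Step 1: permanently delete every object version and delete marker.
  # Stop here if the delete fails, so rb is not attempted on a bucket
  # that may still hold versions.
  s5cmd rm --all-versions "s3://${bucket}/*" || return 1
  # Step 2: remove the now-empty bucket.
  s5cmd rb "s3://${bucket}"
}
```

Usage would be empty_and_remove_bucket "$bucket". If the bucket is already empty, the rm step may itself fail with "no object found", as ls does above; in that case run s5cmd rb directly.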

iainelder avatar Jun 19 '23 12:06 iainelder