deletion & purge improvements for undelete feature in REST catalog

Open twuebi opened this issue 1 year ago • 0 comments

Feature Request / Improvement

Hi all,

While working on an un-drop feature for our catalog implementation, I noticed that DROP TABLE some_tab PURGE executed via Spark 3.5 does not forward purgeRequested to the REST catalog. Instead, the client either attempts to perform all deletions on its own or, in the presence of s3.delete_enabled: false, just ends up calling DELETE without the purgeRequested query parameter.

We would like to have a handle to prevent clients from deleting files, since we'd like to add an undrop feature to the management APIs of our catalog.

I'm thinking in the direction of adding a new property which disables client-side deletes. As far as I understand, such a switch currently only exists in the S3FileIO properties. The new property could be a top-level property which would apply to other FileIOs as well.

If this property is returned as an override:

  • purgeRequested will be sent to the REST Catalog in case of PURGE
  • clients will not attempt to delete any metadata files / data files on receiving DROP
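
To make the intended behaviour concrete, here is a rough sketch in Python rather than actual Iceberg client code. The property key `client-side-deletes-enabled` is a made-up placeholder (no concrete name is proposed above), `effective_config` and `plan_drop` are hypothetical helpers, and the path leaves out the optional catalog prefix and assumes a single-level namespace. It assumes the usual precedence of catalog defaults, then client properties, then catalog overrides.

```python
from dataclasses import dataclass
from urllib.parse import quote, urlencode

# Hypothetical property key, used only for illustration.
CLIENT_DELETES_KEY = "client-side-deletes-enabled"


def effective_config(defaults, client_props, overrides):
    """Merge properties so that catalog overrides win over client settings."""
    return {**defaults, **client_props, **overrides}


@dataclass
class DropPlan:
    method: str
    path: str
    client_deletes_files: bool


def plan_drop(config, namespace, table, purge):
    """Decide what a client would do for DROP TABLE [PURGE] (simplified model)."""
    client_deletes = config.get(CLIENT_DELETES_KEY, "true").lower() == "true"
    path = f"/v1/namespaces/{quote(namespace, safe='')}/tables/{quote(table, safe='')}"
    if client_deletes:
        # Behaviour described above: on PURGE the client removes the files
        # itself and the catalog only sees a plain DELETE.
        return DropPlan("DELETE", path, client_deletes_files=purge)
    if purge:
        # Proposed behaviour: forward the purge request to the catalog.
        path += "?" + urlencode({"purgeRequested": "true"})
    # With client-side deletes disabled, the client never touches any files.
    return DropPlan("DELETE", path, client_deletes_files=False)
```

With the override set to false, `plan_drop(cfg, "db", "some_tab", purge=True)` yields a DELETE that carries `purgeRequested=true` and no client-side file deletion. The catalog can then decide whether to purge immediately or keep the files around for an undrop.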

Kind regards, Tobias

Query engine

Spark

Willingness to contribute

  • [ ] I can contribute this improvement/feature independently
  • [X] I would be willing to contribute this improvement/feature with guidance from the Iceberg community
  • [ ] I cannot contribute this improvement/feature at this time

twuebi, Aug 27 '24 16:08