Support S3Tables

Open erindru opened this issue 7 months ago • 5 comments

AWS has introduced a feature called S3 Tables, where specific table buckets in S3 are designated for storing data in lakehouse formats (e.g. Iceberg). This brings some benefits:

  • AWS can perform automatic data maintenance (compaction etc)
  • The object storage is optimized for lakehouse table access patterns
  • The tables are registered in the Glue Data Catalog so other engines can query them

The following engines support them in some capacity:

Currently SQLMesh has some problems managing tables stored in S3 Table buckets. In particular, they are organised into "namespaces" which are conceptually like a schema but use different syntax for creating / dropping them.

This ticket is for adding / verifying support and documenting usage, as some of the engines only have read-only support.

erindru avatar Jun 11 '25 20:06 erindru

@erindru thanks for creating this ticket, as this would be pivotal when moving forward with SQLMesh / S3 Tables. Is there any update on the progress so far? Thanks again!

gihants avatar Jun 22 '25 22:06 gihants

We have not started working on this issue yet. You're welcome to work on it yourself if it's blocking you; we are always happy to accept well-formed PRs.

erindru avatar Jun 23 '25 00:06 erindru

Also interested in this feature. A couple of things about S3 Tables that might be relevant here:

  • According to AWS docs, CREATE DATABASE is supported via Athena to create a namespace within the S3 Tables catalog.
  • That doc also notes CTAS queries are not supported for S3 Tables.
  • Looks like catalog names have / in them, e.g. s3tablescatalog/amzn-s3-demo-bucket.
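Because the catalog name contains a `/`, it would need to be quoted wherever it appears in a fully qualified table reference. A minimal Python sketch of that quoting follows, assuming Trino-style double-quoted identifiers for Athena queries; the helper names `quote_identifier` and `qualified_name` and the `sales` / `orders` names are hypothetical, and none of this is part of SQLMesh:

```python
def quote_identifier(name: str) -> str:
    """Wrap an identifier in double quotes, doubling any embedded quote.

    Assumption: double-quoted identifiers, as used in Trino-style queries.
    """
    return '"' + name.replace('"', '""') + '"'


def qualified_name(catalog: str, namespace: str, table: str) -> str:
    """Build a catalog.namespace.table reference with every part quoted."""
    return ".".join(quote_identifier(part) for part in (catalog, namespace, table))


# The catalog name is the example from the list above;
# the namespace and table names are made up for illustration.
reference = qualified_name("s3tablescatalog/amzn-s3-demo-bucket", "sales", "orders")
print(f"SELECT * FROM {reference} LIMIT 10")
```

Quoting every part unconditionally keeps the helper simple and avoids having to decide which characters force quoting.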

sean-eyre avatar Aug 07 '25 14:08 sean-eyre

AWS announced that CTAS is now supported for S3 Tables via Athena:

https://aws.amazon.com/about-aws/whats-new/2025/08/amazon-athena-create-table-select-amazon-s3-tables/

sean-eyre avatar Sep 04 '25 01:09 sean-eyre

Is there a way to define schemas in AWS S3 Tables / Glue Data Catalog using Athena and then query them using Spark?

KentonParton avatar Sep 26 '25 12:09 KentonParton