
Support Delta Tables

Open rupurt opened this issue 3 years ago • 2 comments

Howdy,

Are there any plans to support Delta Tables? This could work really well with GraphQL subscriptions.

rupurt avatar Oct 26 '22 23:10 rupurt

Hey! Do you mean being able to support Databricks' Delta Tables / Delta Lake (https://github.com/delta-io/delta/blob/master/PROTOCOL.md) as a storage backend / data source for CREATE EXTERNAL tables?

Design-wise, a GraphQL frontend is a sweet idea, though I'm not sure how to make it work well for analytical/aggregation queries (e.g. being able to represent a group by or window on arbitrary columns as a set of supported GraphQL fields). Same with subscriptions -- how would you quickly update a result for AVG(volume) GROUP BY country_id? IIRC ClickHouse/Materialize did some heavy research in that direction -- would indeed be cool to have it also available to Web devs via GQL :)
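To make the subscription question concrete, here is a minimal sketch of one common way to keep AVG(volume) GROUP BY country_id current without rescanning: hold a running sum and row count per group and adjust them on every insert or delete. It is an illustration only, not something Seafowl, ClickHouse or Materialize is stated to do in this thread; the class and method names (`IncrementalGroupAvg`, `insert`, `retract`) are hypothetical.

```python
class IncrementalGroupAvg:
    """Keeps AVG(value) per group key up to date as rows are added or removed.

    Hypothetical illustration: AVG decomposes into (sum, count), so each
    change touches only the affected group's state.
    """

    def __init__(self):
        self._sum = {}    # group key -> running sum of values
        self._count = {}  # group key -> number of rows in the group

    def insert(self, key, value):
        """Apply a newly inserted row and return the group's new average."""
        self._sum[key] = self._sum.get(key, 0.0) + value
        self._count[key] = self._count.get(key, 0) + 1
        return self._sum[key] / self._count[key]

    def retract(self, key, value):
        """Apply a deleted row; return the new average, or None if the group is now empty."""
        count = self._count.get(key, 0)
        if count == 0:
            raise KeyError(key)
        if count == 1:
            # Last row of the group is gone, so the group disappears from the result.
            del self._sum[key]
            del self._count[key]
            return None
        self._sum[key] -= value
        self._count[key] = count - 1
        return self._sum[key] / self._count[key]

    def avg(self, key):
        """Current average for a group, or None if the group has no rows."""
        count = self._count.get(key, 0)
        return self._sum[key] / count if count else None


agg = IncrementalGroupAvg()
agg.insert("US", 10.0)
agg.insert("US", 20.0)
print(agg.avg("US"))  # average of the two US rows
```

A subscription layer could push the value returned by `insert` or `retract` to clients watching that group. This works because AVG is decomposable; aggregates such as MIN/MAX under deletes, or window functions, need more state than a single sum and count.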

mildbyte avatar Oct 27 '22 09:10 mildbyte

Yes, exactly: as a storage backend / data source for CREATE EXTERNAL tables.

Design-wise, I'm not exactly sure how to implement the subscription :) But I feel like there is so much work going into this problem that the solution is right on the cusp of being implemented (e.g. ClickHouse/Materialize/Delta Tables). FWIW there is now a Delta Table implementation in Rust, and it can do streaming updates: https://github.com/delta-io/delta-rs/tree/main/rust

rupurt avatar Oct 27 '22 14:10 rupurt

@rupurt thanks for the very cool ideas! :)

As for using Delta tables for our storage backend/layer (i.e. replacing our DIY lakehouse protocol with the Delta one using delta-rs), this is something that we'll likely converge towards at some point later on.

For now though, with the latest Seafowl version (0.2.10) you should be able to instantiate Delta tables stored in various cloud object stores as external tables (they will be placed in the staging schema) and query them.

gruuya avatar Dec 31 '22 06:12 gruuya

Amazing. Thank you @gruuya

rupurt avatar Dec 31 '22 23:12 rupurt

> As for using Delta tables for our storage backend/layer (i.e. replacing our DIY lakehouse protocol with the Delta one using delta-rs), this is something that we'll likely converge towards at some point later on.

I'm happy to say that we've completed this migration, so this issue can be closed now. Thanks for a great idea, @rupurt!

gruuya avatar Mar 23 '23 07:03 gruuya