Feat(clickhouse): use table swap insert overwrite instead of delete-insert

Open treysp opened this issue 1 year ago • 0 comments

TODO:

  • [ ] Implement incremental by partition
  • [ ] Add docs

The current ClickHouse adapter loads data with a delete-insert approach. ClickHouse is not designed for deletes and updates, so performance may be unusably slow.

This PR replaces delete-insert with a table swap insert overwrite approach. At a high level, it creates a temp table containing the new records to insert, adds the existing records that should be retained to the temp table, then swaps the temp table with the current table.
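The steps above can be sketched as a sequence of SQL statements. This is a minimal illustration, not the SQL the adapter actually emits: the function name, its parameters, and the table names are hypothetical, and it assumes the tables live in a database engine that supports EXCHANGE TABLES (ClickHouse's Atomic engine).

```python
from typing import List


def table_swap_overwrite_statements(
    target: str,
    temp: str,
    source_query: str,
    overwrite_condition: str,
) -> List[str]:
    """Build illustrative SQL for a table swap insert overwrite.

    All names here are hypothetical; the real adapter logic may differ.
    `overwrite_condition` identifies the existing rows being replaced,
    so rows that do NOT match it are the ones to retain.
    """
    return [
        # 1. Create an empty temp table with the same structure as the target.
        f"CREATE TABLE {temp} AS {target}",
        # 2. Insert the new records.
        f"INSERT INTO {temp} {source_query}",
        # 3. Copy over existing records that should be retained.
        f"INSERT INTO {temp} SELECT * FROM {target} WHERE NOT ({overwrite_condition})",
        # 4. Atomically swap the temp table with the current table.
        f"EXCHANGE TABLES {temp} AND {target}",
        # 5. After the swap, the temp name holds the old data; drop it.
        f"DROP TABLE IF EXISTS {temp}",
    ]


statements = table_swap_overwrite_statements(
    target="db.events",
    temp="db.events__tmp",
    source_query="SELECT * FROM db.events_staging",
    overwrite_condition="ds BETWEEN '2024-09-30' AND '2024-10-06'",
)
for stmt in statements:
    print(stmt)
```

No DELETE or UPDATE statement appears in this sequence. Every step is a create, an insert, or a metadata-level swap.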

To improve performance, the swap occurs at the partition level for partitioned tables, which can mean processing far less data. Incremental by time models now automatically partition the underlying table by week (toMonday(timecol)).
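To show why weekly partitioning limits the data touched, here is a small sketch of which partitions a load interval would affect. It mirrors what toMonday does (round a date down to the Monday of its week) in plain Python; the helper names are hypothetical and are not part of the adapter.

```python
from datetime import date, timedelta
from typing import List


def to_monday(d: date) -> date:
    """Round a date down to the Monday of its week, like ClickHouse's toMonday."""
    return d - timedelta(days=d.weekday())  # weekday(): Monday == 0


def affected_week_partitions(start: date, end: date) -> List[date]:
    """Return the weekly partition keys covering the inclusive range [start, end]."""
    partitions = []
    current = to_monday(start)
    while current <= end:
        partitions.append(current)
        current += timedelta(weeks=1)
    return partitions


# A three-day load interval that crosses a week boundary touches two partitions.
parts = affected_week_partitions(date(2024, 10, 4), date(2024, 10, 7))
print(parts)
```

Under this scheme, only the partitions returned for the load interval would need to be rebuilt and swapped. The rest of the table is left alone.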

Implementation note:

  • Almost all of the implementation is in the _insert_overwrite_by_condition() adapter method, which is called by other methods like _replace_by_unique_key().

treysp, Oct 04 '24 22:10