dbt-databricks

Optimize not running in python models after the first run

Open talperetz1 opened this issue 1 year ago • 3 comments

Describe the bug

When running a dbt Python model with the workflow_job submission method, no optimize operation runs after the model completes.

Steps To Reproduce

Run an incremental Python model that sets liquid_clustered_by, then run it again.

Expected behavior

Expected to see the following operations on every run: merge, cluster by, optimize.

System information

The output of dbt --version:

Core:
  - installed: 1.9.1
  - latest:    1.9.1

Plugins:
  - databricks: 1.9.1 
  - spark:      1.9.0 

The operating system you're using: linux

The output of python --version: Python 3.11.2

Additional context

Issue Description: I have many SQL models that use liquid_clustered_by. For these models, each run performs the merge, cluster by, and optimize operations.

However, when I run Python models, I noticed the following behavior:

On the first run (when the table is created), the optimize operation is executed. On subsequent runs, only the merge operation is performed. If liquid_clustered_by is set, the merge is followed by cluster by, which is pointless without optimize. On these subsequent runs, the optimize operation does not execute for Python models, regardless of whether liquid_clustered_by is specified.

def model(dbt, session):
    # Configuration that reproduces the issue
    dbt.config(submission_method='workflow_job')
    dbt.config(materialized='incremental')
    dbt.config(file_format='delta')
    dbt.config(unique_key=['x'])
    dbt.config(liquid_clustered_by=['x'])
    dbt.config(incremental_strategy='merge')
    dbt.config(on_schema_change='append_new_columns')
    dbt.config(location_root='s3://......')
    # Model body omitted; a dbt Python model must return a DataFrame here.
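
One way to check the behavior described above is to look at the sequence of operations in the table's Delta history and flag every merge that is not followed by an optimize before the next merge. The following is a minimal sketch, not part of dbt-databricks: the helper name `runs_missing_optimize` and both sample histories are hypothetical and only mirror what this report describes.

```python
from typing import Iterable, List


def runs_missing_optimize(operations: Iterable[str]) -> List[int]:
    """Return the positions of MERGE entries with no OPTIMIZE before the next MERGE.

    `operations` is assumed to be the `operation` column of the table
    history, ordered oldest first.
    """
    ops = [op.strip().upper() for op in operations]
    missing = []
    for i, op in enumerate(ops):
        if op != "MERGE":
            continue
        followed = False
        # Look ahead until the next MERGE or the end of the history.
        for later in ops[i + 1:]:
            if later == "MERGE":
                break
            if later == "OPTIMIZE":
                followed = True
                break
        if not followed:
            missing.append(i)
    return missing


# Illustrative histories only (assumed, not real output).
python_model_history = [
    "CREATE TABLE AS SELECT", "OPTIMIZE",  # first run
    "MERGE", "CLUSTER BY",                 # second run, no optimize
    "MERGE", "CLUSTER BY",                 # third run, no optimize
]
sql_model_history = [
    "CREATE TABLE AS SELECT", "OPTIMIZE",
    "MERGE", "CLUSTER BY", "OPTIMIZE",
]

print(runs_missing_optimize(python_model_history))  # [2, 4]
print(runs_missing_optimize(sql_model_history))     # []
```

On Databricks, the input list could be built from the `operation` column returned by `DESCRIBE HISTORY` on the model's table, sorted by `version`. The exact operation names recorded for each step may differ from the sample values used here.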

Thanks,


talperetz1 · Jan 26 '25 16:01