Spark 3.5.0 `MERGE INTO` breaks

Open bk-mz opened this issue 1 year ago • 7 comments

Apache Iceberg version

1.4.3 (latest release)

Query engine

Spark

Please describe the bug 🐞

Hey folks, after migrating from Spark 3.4.1 to Spark 3.5.0 we observe broken behavior of MERGE INTO. The following statement now fails with the error shown after it:

merge into database.table base_table
using batch
on
    (base_table.data_load_ts >= TIMESTAMP '2024-02-19 15:00:00.0') and
    (base_table.id = batch.id)
when MATCHED then UPDATE SET *
when not MATCHED then INSERT *

24/02/28 10:37:33 ERROR MicroBatchExecution: Query [id = a43bffba-51a2-47b2-bf32-b978334ea6c9, runId = 7da1911b-8105-4b80-baa9-bf8ec4da9422] terminated with error
org.apache.spark.sql.catalyst.ExtendedAnalysisException: [UNRESOLVED_COLUMN.WITH_SUGGESTION] A column or function parameter with name `base_table`.`data_load_ts` cannot be resolved. Did you mean one of the following? [`base_table`.`data_load_ts`, `batch`.`data_load_ts`, `base_table`.`date_sent`, `base_table`.`date_created`, `base_table`.`date_updated`].; line 4 pos 5;
'MergeIntoTable (('base_table.data_load_ts >= 2024-02-19 15:00:00) AND ('base_table.id = 'batch.id)), [updatestaraction(None)], [insertstaraction(None)]

The same statement works without problems on 3.4.1.

Please let me know whether this is an Apache Iceberg issue that you can help with.

Environment:

EMR 7.0.0 Spark 3.5.0 Iceberg 1.4.3

bk-mz avatar Feb 28 '24 18:02 bk-mz

This looks like the same issue as reported in https://apache-iceberg.slack.com/archives/C025PH0G1D4/p1704469705606459. Can you try removing `write.spark.accept-any-schema` from the table properties if it's set?
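A minimal sketch of how one might check for the property and then remove it in Spark SQL. The table name `database.table` is taken from the query in the report and is only a placeholder; substitute your own catalog and table identifier.

```sql
-- Show the current value of the property, if any.
-- `database.table` is the placeholder name from the MERGE INTO statement above.
SHOW TBLPROPERTIES database.table ('write.spark.accept-any-schema');

-- Remove the property; IF EXISTS avoids an error when it is not set.
ALTER TABLE database.table UNSET TBLPROPERTIES IF EXISTS ('write.spark.accept-any-schema');
```

After unsetting the property, re-run the MERGE INTO statement to see whether the `UNRESOLVED_COLUMN` error goes away.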

manuzhang avatar Feb 29 '24 04:02 manuzhang

Thank you! Will try that out and see if it helps.

bk-mz avatar Feb 29 '24 07:02 bk-mz

@aokolnychyi is this something you could potentially take a look at?

nastra avatar Feb 29 '24 08:02 nastra

Confirmed, unsetting the property fixes it:

spark-sql ()> ALTER TABLE table UNSET TBLPROPERTIES ('write.spark.accept-any-schema');

Though it's clear that:

image

Slack thread link.

bk-mz avatar Feb 29 '24 12:02 bk-mz

I guess we can close this issue. cc @nastra

bk-mz avatar Feb 29 '24 12:02 bk-mz

Feature request: https://github.com/apache/iceberg/issues/5556

bk-mz avatar Feb 29 '24 13:02 bk-mz

This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.

github-actions[bot] avatar Oct 21 '24 00:10 github-actions[bot]

This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

github-actions[bot] avatar Nov 05 '24 00:11 github-actions[bot]