Spark 3.5.0 `MERGE INTO` breaks
Apache Iceberg version
1.4.3 (latest release)
Query engine
Spark
Please describe the bug 🐞
Hey folks, when migrating from Spark 3.4.1 to Spark 3.5.0, we see that the following MERGE INTO statement fails:
MERGE INTO database.table base_table
USING batch
ON
(base_table.data_load_ts >= TIMESTAMP '2024-02-19 15:00:00.0') AND
(base_table.id = batch.id)
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *
24/02/28 10:37:33 ERROR MicroBatchExecution: Query [id = a43bffba-51a2-47b2-bf32-b978334ea6c9, runId = 7da1911b-8105-4b80-baa9-bf8ec4da9422] terminated with error
org.apache.spark.sql.catalyst.ExtendedAnalysisException: [UNRESOLVED_COLUMN.WITH_SUGGESTION] A column or function parameter with name `base_table`.`data_load_ts` cannot be resolved. Did you mean one of the following? [`base_table`.`data_load_ts`, `batch`.`data_load_ts`, `base_table`.`date_sent`, `base_table`.`date_created`, `base_table`.`date_updated`].; line 4 pos 5;
'MergeIntoTable (('base_table.data_load_ts >= 2024-02-19 15:00:00) AND ('base_table.id = 'batch.id)), [updatestaraction(None)], [insertstaraction(None)]
The same works without problems on 3.4.1.
Please let me know whether this is an Apache Iceberg issue that you can help with.
Environment:
EMR 7.0.0, Spark 3.5.0, Iceberg 1.4.3
This looks like the same issue as reported in https://apache-iceberg.slack.com/archives/C025PH0G1D4/p1704469705606459. Can you try removing `write.spark.accept-any-schema` from the table properties if it's set?
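To see whether the property is set at all, something like the following should work from Spark SQL. This is only a sketch: `database.table` is the placeholder name from the report, so substitute the real identifier.

```sql
-- Look up a single table property by key; `database.table` is a placeholder.
SHOW TBLPROPERTIES database.table ('write.spark.accept-any-schema');
```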
Thank you! Will try that out and see if it helps.
@aokolnychyi is this something you could potentially take a look at?
Confirmed.
spark-sql ()> ALTER TABLE table UNSET TBLPROPERTIES ('write.spark.accept-any-schema');
Though it's clear that:
Slack thread link.
I guess we can close this issue. cc @nastra
Feature request: https://github.com/apache/iceberg/issues/5556
This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.
This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'