[AMORO-2819] Spark cannot execute the "alter table set identifier field" command on Iceberg-format tables in a unified catalog

Open Aireed opened this issue 1 year ago • 2 comments

Why are the changes needed?

Close #2819.

Brief change log

  • Fix the issue of creating a mixed format table using "create table like".

  • Shade out the Iceberg classes called by the Amoro Spark extension.

  • The cause: UnifiedSessionCatalog is neither Iceberg's SparkCatalog nor its SparkSessionCatalog, so it does not match catalogAndIdentifier and no physical plan is produced for the command.

Solution:

  1. Copy Iceberg's ExtendedDataSourceV2Strategy as AmoroExtendedDataSourceV2Strategy and override IcebergCatalogAndIdentifier in the copy.
  2. Replace ExtendedDataSourceV2Strategy with AmoroExtendedDataSourceV2Strategy in ArcticSparkExtensions.
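
A minimal sketch of the cause and the fix, assuming simplified stand-ins: the classes below only borrow the names SparkCatalog, SparkSessionCatalog and UnifiedSessionCatalog and are not the real Iceberg or Amoro types. The methods `icebergMatch` and `amoroMatch` are hypothetical and only model the catalog check; the actual override may apply further conditions (for example on the table format).

```java
import java.util.Optional;

public class CatalogMatchSketch {
    // Hypothetical stand-in for Spark's catalog plugin interface.
    interface CatalogPlugin {}

    // Hypothetical stand-ins that only reuse the class names from this PR.
    static class SparkCatalog implements CatalogPlugin {}
    static class SparkSessionCatalog implements CatalogPlugin {}
    static class UnifiedSessionCatalog implements CatalogPlugin {}

    // Models the original check: only Iceberg's own catalogs match,
    // so any other catalog yields no match and no plan is produced.
    static Optional<String> icebergMatch(CatalogPlugin catalog, String table) {
        if (catalog instanceof SparkCatalog || catalog instanceof SparkSessionCatalog) {
            return Optional.of(table);
        }
        return Optional.empty();
    }

    // Models the overridden check: the unified catalog is accepted as well,
    // and everything else falls back to the original behaviour.
    static Optional<String> amoroMatch(CatalogPlugin catalog, String table) {
        if (catalog instanceof UnifiedSessionCatalog) {
            return Optional.of(table);
        }
        return icebergMatch(catalog, table);
    }

    public static void main(String[] args) {
        CatalogPlugin unified = new UnifiedSessionCatalog();
        System.out.println(icebergMatch(unified, "db.tbl").isPresent()); // false
        System.out.println(amoroMatch(unified, "db.tbl").isPresent());   // true
    }
}
```

In this model the command on a table in the unified catalog is only planned once the extended check is used, which is the role AmoroExtendedDataSourceV2Strategy plays in step 2.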

Tips:

  1. ArcticSparkExtensions injects both the Arctic extension and the Iceberg extension, so the Iceberg extension no longer needs to be configured separately in spark.sql.extensions.
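
The tip can be pictured with a small model, assuming a hypothetical `Extensions` registry in place of Spark's real extension API. The rule names are placeholders, and `ARCTIC` and `ICEBERG` are illustrative stand-ins rather than the actual extension classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class ExtensionComposeSketch {
    // Hypothetical stand-in for the registry that session extensions inject into.
    static class Extensions {
        final List<String> rules = new ArrayList<>();
        void inject(String rule) { rules.add(rule); }
    }

    // Placeholder for the Iceberg extension.
    static final Consumer<Extensions> ICEBERG = ext -> ext.inject("iceberg-rules");

    // Placeholder for the Arctic extension: it injects its own rules
    // and then applies the Iceberg extension itself.
    static final Consumer<Extensions> ARCTIC = ext -> {
        ext.inject("arctic-rules");
        ICEBERG.accept(ext);
    };

    // Applies every configured extension to a fresh registry, in order.
    static List<String> configure(List<Consumer<Extensions>> configured) {
        Extensions ext = new Extensions();
        for (Consumer<Extensions> e : configured) {
            e.accept(ext);
        }
        return ext.rules;
    }

    public static void main(String[] args) {
        List<Consumer<Extensions>> onlyArctic = new ArrayList<>();
        onlyArctic.add(ARCTIC);
        System.out.println(configure(onlyArctic)); // [arctic-rules, iceberg-rules]
    }
}
```

Configuring only the Arctic stand-in already yields both rule sets in this model, which is why listing the Iceberg extension separately becomes redundant.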

How was this patch tested?

  • [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

  • [ ] Add screenshots for manual tests if appropriate

  • [x] Run test locally before making a pull request

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

Aireed, May 10 '24 04:05

cc @baiyangtx PTAL

Aireed, May 10 '24 04:05

This PR may conflict with https://github.com/apache/amoro/pull/2849.

I suggest we wait until #2849 is merged.

baiyangtx, May 24 '24 06:05