
Core, Spark: Fallback when snapshot does not have schema id

Open wypoon opened this issue 4 years ago • 4 comments

In SnapshotUtil, when a snapshot does not have a schema id (because it was written before schema ids were added to snapshots), we fall back to reading each of the previous metadata files until we find one whose current snapshot id matches the snapshot id we seek, and we read the schema from that file.
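For reviewers, the fallback can be pictured roughly as follows. This is only a minimal sketch under stated assumptions, not the code in this PR: `Snap`, `MetadataFile`, and `schemaFor` are hypothetical stand-ins rather than Iceberg classes, schemas are plain strings, and the previous metadata files are assumed to be already loaded into a list.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

class SchemaFallbackSketch {
    /** Hypothetical stand-in for a snapshot; schemaId is null for old metadata. */
    static final class Snap {
        final long snapshotId;
        final Integer schemaId;

        Snap(long snapshotId, Integer schemaId) {
            this.snapshotId = snapshotId;
            this.schemaId = schemaId;
        }
    }

    /** Hypothetical stand-in for one previous metadata file. */
    static final class MetadataFile {
        final long currentSnapshotId;
        final String currentSchema;

        MetadataFile(long currentSnapshotId, String currentSchema) {
            this.currentSnapshotId = currentSnapshotId;
            this.currentSchema = currentSchema;
        }
    }

    /**
     * Returns the schema for a snapshot. If the snapshot carries a schema id,
     * look it up directly. Otherwise scan the previous metadata files for one
     * whose current snapshot id matches, and use that file's schema.
     */
    static Optional<String> schemaFor(
            Snap snap, Map<Integer, String> schemasById, List<MetadataFile> previousFiles) {
        if (snap.schemaId != null) {
            return Optional.ofNullable(schemasById.get(snap.schemaId));
        }
        for (MetadataFile file : previousFiles) {
            if (file.currentSnapshotId == snap.snapshotId) {
                return Optional.of(file.currentSchema);
            }
        }
        // No previous metadata file had this snapshot as its current snapshot.
        return Optional.empty();
    }
}
```

The sketch returns an empty `Optional` when no metadata file matches. What the real code should do in that case is a separate design question that this sketch does not settle.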

We introduce a setting for testing purposes that makes the SnapshotParser write JSON without schema-id for snapshots. We add variants of existing tests for reading snapshots after schema evolution where the metadata is written without schema-id in the snapshots. The tests fail without the change in SnapshotUtil.
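To illustrate the idea behind the test-only setting, here is a small sketch. It is not the actual SnapshotParser code: the class, the method, and the `omitSchemaId` parameter are hypothetical, and only the `snapshot-id` and `schema-id` field names come from the table metadata format.

```java
class SnapshotJsonSketch {
    /**
     * Writes a minimal snapshot JSON object. When omitSchemaId is true
     * (the test-only behavior), the schema-id field is left out even if
     * the snapshot has one, mimicking metadata written by older versions.
     */
    static String toJson(long snapshotId, Integer schemaId, boolean omitSchemaId) {
        StringBuilder json = new StringBuilder();
        json.append("{\"snapshot-id\":").append(snapshotId);
        if (schemaId != null && !omitSchemaId) {
            json.append(",\"schema-id\":").append(schemaId);
        }
        return json.append("}").toString();
    }
}
```

With the flag on, tests can produce metadata that looks like it predates schema ids and then exercise the fallback on read.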

wypoon • Jan 06 '22 00:01

Hmm, the Flink CI must be flaky. The failing test passes for me locally.

wypoon • Jan 06 '22 01:01

@rdblue @jackye1995 @yyanyy this is the fallback part of using the snapshot schema when reading a snapshot. (It does not have to make it into 0.13.)

wypoon • Jan 06 '22 17:01

@rdblue when you have time, can you please review? (This seems to have fallen off the radar.)

wypoon • Feb 02 '22 18:02

@kbendick I see #4809 and that you reviewed that. Can you please review this? I put this up months ago, but it seems to have fallen off the radar.

wypoon • May 19 '22 16:05