iceberg Core, Spark: Fallback when snapshot does not have schema id

In SnapshotUtil, when a snapshot does not have a schema id (because it was written before schema ids were added to snapshots), we fall back to reading the previous metadata files one by one until we find one whose current snapshot id matches the snapshot id we seek, and we read the schema from that file.
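For reviewers, here is a rough, self-contained sketch of the fallback idea. It is not the code in this PR: `SchemaFallbackSketch`, `MetadataFile`, and `findSchema` are hypothetical names, a schema is reduced to a plain string, and the previous metadata files are assumed to be already loaded into a list rather than read from storage.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class SchemaFallbackSketch {

  /** Hypothetical stand-in for one previous table metadata file. */
  public static final class MetadataFile {
    final long currentSnapshotId; // snapshot that was current when this file was written
    final String schema;          // schema recorded in this file, simplified to a string

    public MetadataFile(long currentSnapshotId, String schema) {
      this.currentSnapshotId = currentSnapshotId;
      this.schema = schema;
    }
  }

  /**
   * Walks the previous metadata files in the given order and returns the schema
   * of the first file whose current snapshot id equals the requested snapshot id.
   * Returns an empty Optional if no file matches.
   */
  public static Optional<String> findSchema(long snapshotId, List<MetadataFile> previousFiles) {
    for (MetadataFile file : previousFiles) {
      if (file.currentSnapshotId == snapshotId) {
        return Optional.of(file.schema);
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    List<MetadataFile> files = Arrays.asList(
        new MetadataFile(2L, "id, data, extra"),  // written after schema evolution
        new MetadataFile(1L, "id, data"));        // written before schema evolution

    // Snapshot 1 has no schema id, so its schema is taken from the matching metadata file.
    System.out.println(findSchema(1L, files).orElse("no match"));
  }
}
```

The sketch leaves the no-match case to the caller as an empty `Optional`; what the real code should do there (for example, use the table's current schema) is a separate decision that this sketch does not make.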

We introduce a test-only setting that makes SnapshotParser write snapshot JSON without schema-id. We add variants of the existing tests for reading snapshots after schema evolution, in which the metadata is written without schema-id in the snapshots. These tests fail without the change in SnapshotUtil.

Jan 06 '22 00:01 wypoon

Hmm, the Flink CI must be flaky. The failing test passes for me locally.

Jan 06 '22 01:01 wypoon

@rdblue @jackye1995 @yyanyy this is the fallback part of using the snapshot schema when reading a snapshot. (It does not have to make it into 0.13.)

Jan 06 '22 17:01 wypoon

@rdblue when you have time, can you please review? (This seems to have fallen off the radar.)

Feb 02 '22 18:02 wypoon

@kbendick I see #4809 and that you reviewed that. Can you please review this? I put this up months ago, but it seems to have fallen off the radar.

May 19 '22 16:05 wypoon