
HIVE-28366: Iceberg: Concurrent Insert and IOW produce incorrect result

Open deniskuzZ opened this issue 1 year ago • 1 comments

What changes were proposed in this pull request?

Introduces locking to serialize the execution of some operations.

Why are the changes needed?

Fixes data correctness when an Insert and an Insert Overwrite (IOW) run concurrently.
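
To make the failure mode concrete, here is a minimal sketch of a lost update under one possible interleaving. It uses a toy in-memory table, not the Iceberg or Hive API; `ToyTable`, `LostUpdateDemo`, and the upper-casing "query" are all hypothetical names chosen for illustration, and the sketch assumes the IOW derives its replacement data from the table it overwrites.

```java
import java.util.ArrayList;
import java.util.List;

// Toy in-memory table. Hypothetical names; this is NOT the Iceberg or Hive API.
final class ToyTable {
    private long snapshotId = 0;
    private List<String> rows = new ArrayList<>();

    long currentSnapshotId() { return snapshotId; }

    // Returns a copy, like reading the table at its current snapshot.
    List<String> rows() { return new ArrayList<>(rows); }

    // INSERT: appends rows and produces a new snapshot.
    void append(List<String> newRows) {
        rows.addAll(newRows);
        snapshotId++;
    }

    // INSERT OVERWRITE: replaces the whole table without checking
    // which snapshot the writer planned against.
    void overwrite(List<String> replacement) {
        rows = new ArrayList<>(replacement);
        snapshotId++;
    }
}

final class LostUpdateDemo {
    // Stand-in for the SELECT that feeds the IOW.
    static List<String> upperCase(List<String> in) {
        List<String> out = new ArrayList<>();
        for (String s : in) {
            out.add(s.toUpperCase());
        }
        return out;
    }

    static List<String> run() {
        ToyTable table = new ToyTable();
        table.append(List.of("a"));

        // 1. The IOW plans against the current snapshot and computes its output.
        List<String> replacement = upperCase(table.rows());

        // 2. A concurrent INSERT commits before the IOW does.
        table.append(List.of("b"));

        // 3. The IOW commits its stale plan and silently drops row "b".
        table.overwrite(replacement);
        return table.rows();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [A]
    }
}
```

Neither serial order gives this result: Insert followed by IOW would yield `[A, B]`, and IOW followed by Insert would yield `[A, b]`.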

Does this PR introduce any user-facing change?

No

Is the change a dependency upgrade?

No

How was this patch tested?

ITests

deniskuzZ avatar Aug 06 '24 17:08 deniskuzZ

@deniskuzZ I am not sure I am following this correctly. Isn't the problem just that IOW used an older snapshotId? Why bring in locking? Could we not validate during the commit that no new snapshot has been added, rather than locking and running sequentially?

Any pointers on how other engines, like Spark, handle concurrency in such cases?
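
For reference, the commit-time validation suggested above could look roughly like the following sketch. It is an illustration of optimistic validation with retry on a toy in-memory table, not what this PR implements and not an existing Iceberg API; `ValidatingTable`, `overwriteIfUnchanged`, and the `afterPlanHook` parameter (used only to simulate a concurrent Insert) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Toy table with a validated overwrite. Hypothetical; NOT the Iceberg API.
final class ValidatingTable {
    private long snapshotId = 0;
    private List<String> rows = new ArrayList<>();

    synchronized long currentSnapshotId() { return snapshotId; }
    synchronized List<String> rows() { return new ArrayList<>(rows); }

    synchronized void append(List<String> newRows) {
        rows.addAll(newRows);
        snapshotId++;
    }

    // Commits only if no snapshot was added since baseSnapshotId.
    synchronized boolean overwriteIfUnchanged(long baseSnapshotId, List<String> replacement) {
        if (snapshotId != baseSnapshotId) {
            return false; // someone committed in between; caller must re-plan
        }
        rows = new ArrayList<>(replacement);
        snapshotId++;
        return true;
    }
}

final class ValidatedOverwriteDemo {
    // Plans against the current snapshot, then retries if validation fails.
    // Returns the number of attempts it took to commit.
    static int overwriteWithRetry(ValidatingTable table,
                                  UnaryOperator<List<String>> query,
                                  int maxAttempts,
                                  Runnable afterPlanHook) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            long base = table.currentSnapshotId();
            List<String> replacement = query.apply(table.rows());
            afterPlanHook.run(); // test hook: simulates concurrent activity
            if (table.overwriteIfUnchanged(base, replacement)) {
                return attempt;
            }
        }
        throw new IllegalStateException("overwrite did not commit after " + maxAttempts + " attempts");
    }

    static List<String> upperCase(List<String> in) {
        List<String> out = new ArrayList<>();
        for (String s : in) {
            out.add(s.toUpperCase());
        }
        return out;
    }

    static List<String> demo() {
        ValidatingTable table = new ValidatingTable();
        table.append(List.of("a"));
        boolean[] fired = {false};
        // Simulate one concurrent INSERT landing between plan and commit.
        Runnable concurrentInsert = () -> {
            if (!fired[0]) {
                fired[0] = true;
                table.append(List.of("b"));
            }
        };
        overwriteWithRetry(table, ValidatedOverwriteDemo::upperCase, 3, concurrentInsert);
        return table.rows();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [A, B]
    }
}
```

The trade-off in this sketch is that a conflicting overwrite redoes its planning work on every failed attempt, whereas locking pays the cost up front by making the operations wait for each other.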

Impala doesn't handle this. There is no validation in the Iceberg API to prevent a lost update in the case of a concurrent IOW and Insert.

deniskuzZ avatar Sep 10 '24 13:09 deniskuzZ