[SPARK-49461][SS] Persist Checkpoint ID to commit logs and read it back
What changes were proposed in this pull request?
In this change, we propose adding a new field to the commit log when STREAMING_STATE_STORE_COMMIT_LOG_VERSION is 2. The new field is a Map[String, Map[String, Map[String, Seq[String]]]] that maps OperatorId -> PartitionId -> StoreName -> Seq[uniqueId], so the unique IDs of each state store can be written with a commit and read back later. This is a necessary step toward enabling the v2 checkpoint.
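To make the shape of the nested map concrete, here is a minimal sketch. It is not the code from this PR: `CheckpointIdSketch`, `CommitEntry`, `uniqueIdsFor`, and the sample IDs are hypothetical names chosen for illustration, and only the map type itself comes from the description above.

```scala
// Hypothetical sketch; these names are illustrative and not taken from the Spark code base.
object CheckpointIdSketch {
  // OperatorId -> PartitionId -> StoreName -> Seq[uniqueId], as described in the PR text.
  type StateUniqueIds = Map[String, Map[String, Map[String, Seq[String]]]]

  // Assumed commit-log entry: the new field is only populated when the version is 2.
  final case class CommitEntry(version: Int, stateUniqueIds: Option[StateUniqueIds])

  // Look up the unique IDs recorded for one state store.
  // Returns an empty Seq if the field is absent or any level of the map is missing.
  def uniqueIdsFor(
      entry: CommitEntry,
      operatorId: String,
      partitionId: String,
      storeName: String): Seq[String] =
    (for {
      ids <- entry.stateUniqueIds
      byPartition <- ids.get(operatorId)
      byStore <- byPartition.get(partitionId)
      uniqueIds <- byStore.get(storeName)
    } yield uniqueIds).getOrElse(Seq.empty)

  // A version 2 entry carrying IDs for operator "0", partition "3", store "default".
  val v2Entry: CommitEntry = CommitEntry(
    version = 2,
    stateUniqueIds = Some(Map("0" -> Map("3" -> Map("default" -> Seq("id-a", "id-b"))))))

  // An older entry without the new field.
  val v1Entry: CommitEntry = CommitEntry(version = 1, stateUniqueIds = None)
}
```

Modeling the field as an `Option` in this sketch lets a reader handle entries written without it; whether the actual commit log does the same is an assumption here.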
Why are the changes needed?
New feature
Does this PR introduce any user-facing change?
No
How was this patch tested?
Added unit tests.
Was this patch authored or co-authored using generative AI tooling?
No
cc @siying PTAL!
Same here: I'm merging this PR on behalf of @brkyvz, as he asked me personally. Leaving this note as a disclaimer.
Thanks! Merging to master.