[SPARK-51745] Enforce State Machine for RocksDBStateStore
What changes were proposed in this pull request?
This PR introduces state machine validation to RocksDBStateStore to enforce proper usage patterns. It adds:
- Explicit state machine transitions between UPDATING, COMMITTED, and ABORTED states
- Validation logic to ensure operations are only executed in appropriate states
- Automatic cleanup via task completion listeners
- Error handling improvements for state store maintenance
- Refactoring of the unload operation to ensure it runs only on maintenance threads
The implementation ensures that:
- No updates can be made after a store has been committed or aborted
- Metrics can only be accessed after commit/abort
- All operations validate the state before execution
- Task thread cleanup properly triggers maintenance for unloaded providers
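The guarantees above can be illustrated with a small standalone model. This is only a sketch in Python, not the Scala code in this PR: `ToyStateStore`, `InvalidStoreStateError`, `_require`, and `on_task_completion` are hypothetical names, and the choice to have the completion hook abort a store that is still UPDATING is an assumption about how the listener-based cleanup might behave.

```python
from enum import Enum, auto


class StoreState(Enum):
    UPDATING = auto()
    COMMITTED = auto()
    ABORTED = auto()


class InvalidStoreStateError(RuntimeError):
    """Raised when an operation is attempted in a state that does not allow it."""


class ToyStateStore:
    """Hypothetical in-memory stand-in for a state store guarded by a state machine."""

    def __init__(self):
        self.state = StoreState.UPDATING
        self._data = {}
        self._version = 0

    def _require(self, op, *allowed):
        # Every public operation validates the current state before doing any work.
        if self.state not in allowed:
            names = ", ".join(s.name for s in allowed)
            raise InvalidStoreStateError(
                f"{op} requires state in [{names}], but store is {self.state.name}"
            )

    def put(self, key, value):
        # Updates are only legal while the store is UPDATING.
        self._require("put", StoreState.UPDATING)
        self._data[key] = value

    def commit(self):
        self._require("commit", StoreState.UPDATING)
        self._version += 1
        self.state = StoreState.COMMITTED
        return self._version

    def abort(self):
        # Assumption in this sketch: abort is only valid from UPDATING.
        self._require("abort", StoreState.UPDATING)
        self._data.clear()
        self.state = StoreState.ABORTED

    def metrics(self):
        # Metrics are only readable once the store has been committed or aborted.
        self._require("metrics", StoreState.COMMITTED, StoreState.ABORTED)
        return {"num_keys": len(self._data), "version": self._version}

    def on_task_completion(self):
        # Hypothetical cleanup hook: a store left in UPDATING when the task ends is aborted.
        if self.state is StoreState.UPDATING:
            self.abort()


if __name__ == "__main__":
    store = ToyStateStore()
    store.put("k", 1)
    store.commit()
    try:
        store.put("k", 2)  # rejected: the store is already COMMITTED
    except InvalidStoreStateError as e:
        print(e)
```

Routing every check through a single `_require` helper keeps the allowed states for each operation visible in one line per method, which is the property the validation logic in this PR is meant to provide.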
Why are the changes needed?
RocksDBStateStore has implicit usage requirements that weren't being enforced. This could lead to incorrect usage patterns, potential data corruption, and hard-to-debug issues.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Unit tests have been added and existing tests modified to validate state transitions and error cases. The PR includes a new test case specifically for SPARK-51596 (unloading only occurs on maintenance thread but occurs promptly).
Was this patch authored or co-authored using generative AI tooling?
No