[SPARK-49006] Implement purging for OperatorStateMetadataV2 and StateSchemaV3 files

Open ericm-db opened this issue 1 year ago • 0 comments

What changes were proposed in this pull request?

Currently, OperatorStateMetadataV2 and StateSchemaV3 files are written for every new query run and are never cleaned up. This PR implements purging of these files so that only minLogEntriesToMaintain files are kept per query.
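
As a rough illustration of the retention rule described above (not the code in this PR), the sketch below picks which files to delete when only the newest N are kept. It assumes each file can be identified by an increasing integer ID such as a batch ID; the function name `files_to_purge` and its parameters are hypothetical and are not part of Spark.

```python
from typing import Iterable, List


def files_to_purge(file_ids: Iterable[int], min_entries_to_maintain: int) -> List[int]:
    """Return the IDs of files that can be deleted, oldest first.

    Hypothetical helper: keeps the `min_entries_to_maintain` highest IDs
    and marks everything older for deletion.
    """
    if min_entries_to_maintain < 1:
        raise ValueError("min_entries_to_maintain must be at least 1")
    # Deduplicate and order so that the newest IDs sit at the end.
    ordered = sorted(set(file_ids))
    # Number of files in excess of the retention limit.
    excess = len(ordered) - min_entries_to_maintain
    return ordered[:excess] if excess > 0 else []


if __name__ == "__main__":
    # Five query runs have written files 0..4; keep only the newest two.
    print(files_to_purge([0, 1, 2, 3, 4], 2))
```

When there are fewer files than the retention limit, nothing is deleted, so the number of files per query stays bounded without ever removing the most recent entries.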

Why are the changes needed?

These changes are needed so that we don't keep these files indefinitely across many query runs. Purging bounds the number of state files we keep.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Added unit tests

Was this patch authored or co-authored using generative AI tooling?

No

ericm-db, Jul 25 '24 16:07