[SPARK-47052] Separate state tracking variables from MicroBatchExecution/StreamExecution

Open jerrypeng opened this issue 1 year ago • 0 comments

What changes were proposed in this pull request?

To improve code clarity and maintainability, I propose moving all the variables that track mutable state and metrics for a streaming query into a separate class. With this refactor, all the mutable state a microbatch can have is easy to find and track in one place.

Why are the changes needed?

To improve code clarity and maintainability. All the state and metrics needed for the execution lifecycle of a microbatch are consolidated into one class. If we decide to modify the state of a streaming query or add to it, it will be easier to determine 1) where to add it and 2) what state already exists.
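As a rough illustration of the idea (not the code in this patch), the sketch below groups per-batch mutable fields into one class and has the execution class hold a single reference to it. The names `BatchStateSketch` and `ExecutionSketch`, their fields, and the offset arithmetic are all hypothetical and chosen only for the example; they are not taken from MicroBatchExecution or StreamExecution.

```scala
// Hypothetical sketch: none of these names come from the Spark code base.

// All mutable state and metrics for one microbatch live in this class.
final class BatchStateSketch(val batchId: Long) {
  // Offsets the batch starts from and ends at, keyed by source name.
  var startOffsets: Map[String, Long] = Map.empty
  var endOffsets: Map[String, Long] = Map.empty
  // A simple per-batch metric.
  var numInputRows: Long = 0L

  // State for the following batch: next id, starting where this batch
  // ended, with metrics reset to their defaults.
  def next(): BatchStateSketch = {
    val s = new BatchStateSketch(batchId + 1)
    s.startOffsets = endOffsets
    s
  }
}

// The execution class keeps one reference instead of many scattered vars.
final class ExecutionSketch {
  private var state = new BatchStateSketch(0L)

  def currentState: BatchStateSketch = state

  // Record the end offsets and derive the row count for this batch.
  def runBatch(available: Map[String, Long]): Unit = {
    state.endOffsets = available
    state.numInputRows = available.map { case (src, end) =>
      end - state.startOffsets.getOrElse(src, 0L)
    }.sum
  }

  // Roll over to a fresh state object for the next batch.
  def commitBatch(): Unit = {
    state = state.next()
  }
}
```

With this shape, adding a new piece of tracked state means adding one field to the state class, and a reader can list everything a batch can mutate by opening a single file.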

Does this PR introduce any user-facing change?

How was this patch tested?

Existing tests should suffice, since this is a refactor that is not intended to change behavior.

Was this patch authored or co-authored using generative AI tooling?

Feb 14 '24 23:02 jerrypeng