Support for Flink's SpeculativeExecution in batch execution mode
Summary
Add support for Flink's Speculative Execution in batch execution mode
Testing
Existing tests should take care of it
cc @stevenzwu for review. Should this change also be made in other Flink versions like Flink-1.17 and Flink-1.18?
@venkata91: How can we be sure that the tests are exercising the speculative execution code path?
Does any of the tests reads some splits multiple times, and use the result of the faster one?
I think it would be useful to have a test demonstrating that the behavior works, to prevent disabling it by an unrelated change by accident.
@venkata91: How can we be sure that the tests are exercising the speculative execution code path?
Does any of the tests reads some splits multiple times, and use the result of the faster one?
I think it would be useful to have a test demonstrating that the behavior works, to prevent disabling it by an unrelated change by accident.
Sure sounds good. Will add a test.
@venkata91: How can we be sure that the tests are exercising the speculative execution code path?
Does any of the tests reads some splits multiple times, and use the result of the faster one?
I think it would be useful to have a test demonstrating that the behavior works, to prevent disabling it by an unrelated change by accident.
@pvary Added an integration test to verify the tasks are speculated and produces the expected output. PTAL. btw, should this change also be made in other Flink versions like Flink-1.17 and Flink-1.18?
Gentle ping for reviews cc @pvary @stevenzwu Thanks!
@pvary should this change be ported to other flink versions like 1.17 and 1.18?
@pvary should this change be ported to other flink versions like
1.17and1.18?
In a follow-up backport pr
Merged to main. Thanks for the PR @venkata91!
Could you please create the backport PR to the other Flink versions? The PR could be generated like this:
git diff <HASH>^ <HASH> | sed "s/v1.19/v1.18/g" | git apply -3 -p1
git diff <HASH>^ <HASH> | sed "s/v1.19/v1.17/g" | git apply -3 -p1
Could you please create the backport PR to the other Flink versions?
@pvary Should we have 2 backport PRs one for 1.17 and 1.18 or is it fine to do it in a single PR?
Could you please create the backport PR to the other Flink versions?
@pvary Should we have 2 backport PRs one for 1.17 and 1.18 or is it fine to do it in a single PR?
A single PR would be fine, as we don't expect any serious changes to review