iceberg icon indicating copy to clipboard operation
iceberg copied to clipboard

Support for Flink's SpeculativeExecution in batch execution mode

Open venkata91 opened this issue 1 year ago • 1 comments

Summary

Add support for Flink's Speculative Execution in batch execution mode

Testing

Existing tests should take care of it

venkata91 avatar Jun 21 '24 18:06 venkata91

cc @stevenzwu for review. Should this change also be made in other Flink versions like Flink-1.17 and Flink-1.18?

venkata91 avatar Jun 29 '24 04:06 venkata91

@venkata91: How can we be sure that the tests are exercising the speculative execution code path?

Does any of the tests reads some splits multiple times, and use the result of the faster one?

I think it would be useful to have a test demonstrating that the behavior works, to prevent disabling it by an unrelated change by accident.

pvary avatar Jul 03 '24 19:07 pvary

@venkata91: How can we be sure that the tests are exercising the speculative execution code path?

Does any of the tests reads some splits multiple times, and use the result of the faster one?

I think it would be useful to have a test demonstrating that the behavior works, to prevent disabling it by an unrelated change by accident.

Sure sounds good. Will add a test.

venkata91 avatar Jul 09 '24 22:07 venkata91

@venkata91: How can we be sure that the tests are exercising the speculative execution code path?

Does any of the tests reads some splits multiple times, and use the result of the faster one?

I think it would be useful to have a test demonstrating that the behavior works, to prevent disabling it by an unrelated change by accident.

@pvary Added an integration test to verify the tasks are speculated and produces the expected output. PTAL. btw, should this change also be made in other Flink versions like Flink-1.17 and Flink-1.18?

venkata91 avatar Jul 16 '24 05:07 venkata91

Gentle ping for reviews cc @pvary @stevenzwu Thanks!

venkata91 avatar Jul 19 '24 18:07 venkata91

@pvary should this change be ported to other flink versions like 1.17 and 1.18?

venkata91 avatar Jul 22 '24 18:07 venkata91

@pvary should this change be ported to other flink versions like 1.17 and 1.18?

In a follow-up backport pr

pvary avatar Jul 23 '24 14:07 pvary

Merged to main. Thanks for the PR @venkata91!

Could you please create the backport PR to the other Flink versions? The PR could be generated like this:

git diff <HASH>^ <HASH> | sed "s/v1.19/v1.18/g" | git apply -3 -p1
git diff <HASH>^ <HASH> | sed "s/v1.19/v1.17/g" | git apply -3 -p1

pvary avatar Jul 24 '24 05:07 pvary

Could you please create the backport PR to the other Flink versions?

@pvary Should we have 2 backport PRs one for 1.17 and 1.18 or is it fine to do it in a single PR?

venkata91 avatar Jul 24 '24 16:07 venkata91

Could you please create the backport PR to the other Flink versions?

@pvary Should we have 2 backport PRs one for 1.17 and 1.18 or is it fine to do it in a single PR?

A single PR would be fine, as we don't expect any serious changes to review

pvary avatar Jul 24 '24 19:07 pvary