Release test chaos_dataset_shuffle_push_based_sort_1tb.aws failed

Open can-anyscale opened this issue 2 years ago • 4 comments

Release test chaos_dataset_shuffle_push_based_sort_1tb.aws failed. See https://buildkite.com/ray-project/release-tests-branch/builds/1770#018899d9-33db-4fdd-a3be-5bd63b4ebf9f for more details. cc @data

 -- created by ray-test-bot

can-anyscale avatar Jun 08 '23 08:06 can-anyscale

FYI, this is an unstable test, so I don't know whether you want to ignore it. Ignoring it for too long will eventually get the test jailed, though. Please see https://www.notion.so/anyscale-hq/OSS-Test-Policy-47d2f1ebae59407eae09a75380f6282b for an explanation of the different test states. Thanks

can-anyscale avatar Jun 08 '23 15:06 can-anyscale

This test, as well as chaos_dataset_shuffle_sort_1tb in https://github.com/ray-project/ray/issues/36195, has been known to be pretty unstable for a while. We are planning to temporarily disable or remove these tests in the near future, when our team does a larger audit of our release tests.

If I'm understanding correctly, the test becomes jailed if it fails a few more times, and at that point it would no longer run on a regular basis, only for release validation? Hopefully we will have decided which tests to disable or remove by then.

scottjlee avatar Jun 08 '23 17:06 scottjlee

Thanks @scottjlee, yes, your understanding is 100% correct.

can-anyscale avatar Jun 08 '23 17:06 can-anyscale

Test has been failing for far too long. Jailing.

can-anyscale avatar Jun 13 '23 10:06 can-anyscale

Also pre-req to #39527 ...

anyscalesam avatar Nov 30 '23 22:11 anyscalesam

Test passed on latest run: https://buildkite.com/ray-project/release/builds/10617#018e12b2-cbae-421d-b405-25b9bcb69c15

can-anyscale avatar Mar 06 '24 08:03 can-anyscale