Scott Lee

23 comments by Scott Lee

@martinbomio it looks like there are some relevant failures in the `test_tf_records` case ([Buildkite](https://buildkite.com/ray-project/oss-ci-build-pr/builds/19725#0187b8bc-5c43-44b2-a579-5500f314b616)). They seem related to the new `filesystem` parameter we are using in `TFRecordsDatasource`. By the way,...

> it works on the slow path because the `read_stream` receives the file handler, it does not use the path for reading the file. It does not work for the...

> that sounds like a good idea, but unfortunately it won't work because the current code opens the file in the path before calling `read_stream` so if we change that...

@martinbomio I also seem to be running into an issue (possibly related to fast unwrapping logic?) when I try to read from `gs://tfds-data/datasets/natural_questions/0.0.2/natural_questions-train.tfrecord-00000-of-01024`: ``` File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/data/_internal/table_block.py", line 119, in build...

@martinbomio mind giving this [hacky implementation](https://github.com/ray-project/ray/commit/681137c6ab6fbadee72a66a3490658a299df4add) a try with reading GCS files with your team's infrastructure? I want to see if GCS reading works on your end as well. The...

The existing test failures are in tests that previously failed on master: https://buildkite.com/ray-project/oss-ci-build-pr/builds/25054. I will merge latest master, which should resolve these issues. In the meantime, this should be good for...

The CI test failures look unrelated to me; they also appear on flakey-tests.ray.io.

This test, as well as `chaos_dataset_shuffle_sort_1tb` in https://github.com/ray-project/ray/issues/36195, has been known to be pretty unstable for a while. We are planning to temporarily disable or remove these tests in...

The PR LGTM, but since it is pretty old, it looks like @jaidisido will need to resolve merge conflicts and merge in the latest master. Or @denadai2 if you want to...

Sorry for the delay on this, folks; we haven't had the bandwidth to look into this code path for quite some time. Other than the tips on [this docs page](https://docs.ray.io/en/latest/data/shuffling-data.html), I...