Jian Xiao issues

Results: 6 issues by Jian Xiao

Use detached lifetime for stats actor

## Why are these changes needed? The actor handle held by the Ray client becomes dangling if the Ray cluster is shut down, and in that case if the user tries...

tests-ok

[Datasets] Revamp the release/nightly_tests/dataset/pipelined_training.py

This test tries to simulate real user workloads. Right now it's a bit off, and we need to make the setup more realistic. Some feedback: - The batch...

testing

datasets

[core/docs] Add user guide example on building batch prediction on Ray Core

Signed-off-by: jianoaix [email protected] ## Why are these changes needed? While it's recommended to use Ray Datasets or AIR to build batch prediction, there are use cases where users need to...

[Datasets] Create a separate doc on how to build batch prediction on Datasets

We have a guide now, but it's embedded in the NYC taxi data processing example: https://docs.ray.io/en/latest/data/examples/nyc_taxi_basic_processing.html#parallel-batch-inference When users come to Datasets, they may have a workload in mind, so our documentation...

[Datasets] Improve documentation for map_batches()

This captures lessons from Data oncall, where we saw user confusion around map_batches(), regarding: - The UDF needs to be picklable: this has been an implicit requirement so far, and we...

datasets

Add benchmark for iter_batches() API

## Why are these changes needed? This is a core API in Datasets as a way to consume content held in a Dataset. We should add benchmark tests for this...