[Train] Support full `ray.get_gpu_ids()` API in `train.torch.get_device()`
Signed-off-by: Amog Kamsetty [email protected]
`ray.get_gpu_ids()` returns either a list of ints or a list of strings, depending on whether the user has set the `CUDA_VISIBLE_DEVICES` environment variable. This inconsistency caused the following issue: https://github.com/ray-project/ray/issues/28467.
Solidifying the Ray Core API appears to need more discussion (follow the thread here: https://github.com/ray-project/ray/pull/28632), so for now we account for both return types in Ray Train.
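For illustration, handling both return types could look roughly like the sketch below. This is not the actual diff in this PR: `resolve_device_index` is a hypothetical helper, and the fallback of treating the ID as an integer index when `CUDA_VISIBLE_DEVICES` is unset is an assumption. The idea is to compare IDs as strings and use the position in `CUDA_VISIBLE_DEVICES` as the device index.

```python
from typing import List, Optional, Union


def resolve_device_index(
    gpu_ids: List[Union[int, str]],
    cuda_visible_devices: Optional[str],
) -> int:
    """Hypothetical helper: map the first assigned GPU ID to a device index.

    gpu_ids is whatever ray.get_gpu_ids() returned (ints or strings).
    cuda_visible_devices is the raw CUDA_VISIBLE_DEVICES value, if any.
    """
    if not gpu_ids:
        raise ValueError("no GPU IDs assigned to this worker")

    # Normalize to str so int and str IDs are compared the same way.
    first = str(gpu_ids[0])

    if not cuda_visible_devices:
        # Assumption: with no CUDA_VISIBLE_DEVICES set, the ID is already
        # an integer device index.
        return int(first)

    visible = [d.strip() for d in cuda_visible_devices.split(",") if d.strip()]
    # The device index is the position of the ID among the visible devices.
    # list.index raises ValueError if the ID is not listed.
    return visible.index(first)
```

In a worker, this might be called as `resolve_device_index(ray.get_gpu_ids(), os.environ.get("CUDA_VISIBLE_DEVICES"))`, with the result used to build `torch.device(f"cuda:{index}")`.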
Closes https://github.com/ray-project/ray/issues/28467
Why are these changes needed?
Related issue number
Checks
- [ ] I've signed off every commit (by using the -s flag, i.e., `git commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
- [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(