[Train] Support full `ray.get_gpu_ids()` API in `train.torch.get_device()`

Open amogkam opened this issue 3 years ago • 0 comments

Signed-off-by: Amog Kamsetty [email protected]

`ray.get_gpu_ids()` returns either a list of ints or a list of strings, depending on whether the user has set the `CUDA_VISIBLE_DEVICES` environment variable. This has led to the following issue: https://github.com/ray-project/ray/issues/28467.

Solidifying the Ray Core API seems to require more discussion (follow the thread here: https://github.com/ray-project/ray/pull/28632), so for now we account for both return types in Ray Train.
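To illustrate the kind of normalization this implies, here is a minimal sketch, not the actual code in this PR. The helper name `resolve_local_device_index` and its parameters are hypothetical; it takes the list returned by `ray.get_gpu_ids()` and the value of `CUDA_VISIBLE_DEVICES` as plain arguments so it can run without Ray or a GPU. It assumes that, when the variable is unset, the IDs are numeric.

```python
from typing import Optional, Sequence, Union


def resolve_local_device_index(
    gpu_ids: Sequence[Union[int, str]],
    cuda_visible_devices: Optional[str] = None,
) -> int:
    """Hypothetical helper: map the first assigned GPU ID to a local CUDA index.

    `gpu_ids` stands in for the result of `ray.get_gpu_ids()`, which may
    contain ints or strings. `cuda_visible_devices` stands in for the value
    of the CUDA_VISIBLE_DEVICES environment variable (None or "" if unset).
    """
    if not gpu_ids:
        raise ValueError("No GPU IDs were assigned to this worker.")

    # Normalize to a string so int and str IDs are compared the same way.
    first_id = str(gpu_ids[0])

    if not cuda_visible_devices:
        # Assumption: with no CUDA_VISIBLE_DEVICES set, the ID is numeric
        # and already equals the device index.
        return int(first_id)

    # The device index is the position of the ID within the visible list.
    visible = [dev.strip() for dev in cuda_visible_devices.split(",")]
    return visible.index(first_id)
```

With a helper along these lines, `[2]` and `["2"]` both resolve to index 1 when `CUDA_VISIBLE_DEVICES` is `"1,2,3"`, and the result could then be used to build a device string such as `f"cuda:{index}"`.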

Closes https://github.com/ray-project/ray/issues/28467

Why are these changes needed?

Related issue number

Checks

  • [ ] I've signed off every commit (by using the -s flag, i.e., `git commit -s`) in this PR.
  • [ ] I've run scripts/format.sh to lint the changes in this PR.
  • [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
  • [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • [ ] Unit tests
    • [ ] Release tests
    • [ ] This PR is not tested :(

amogkam, Sep 21 '22 04:09