torchx

[Ray] Add elasticity to jobs launched on ray cluster

Open ntlm1686 opened this issue 3 years ago • 1 comment

Elasticity: placement groups are submitted as pending tasks, which the GCS schedules once resources become available.
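To illustrate the behavior described above, here is a toy, pure-Python model of "pending until resources are available". It is only a sketch: `ToyCluster`, `submit`, and `add_node` are hypothetical names made up for this example, and it does not use or reproduce Ray's GCS or the code in this PR.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ToyCluster:
    """Toy stand-in for a cluster (hypothetical; NOT Ray's GCS or scheduler)."""

    free_cpus: int
    pending: deque = field(default_factory=deque)  # groups waiting for resources
    scheduled: list = field(default_factory=list)  # groups that received resources

    def submit(self, name: str, cpus: int) -> None:
        """Queue a placement group, then schedule whatever currently fits."""
        self.pending.append((name, cpus))
        self._schedule()

    def add_node(self, cpus: int) -> None:
        """Simulate the cluster scaling up, then retry the pending groups."""
        self.free_cpus += cpus
        self._schedule()

    def _schedule(self) -> None:
        # One pass over the queue: place groups that fit, keep the rest pending.
        still_pending = deque()
        while self.pending:
            name, cpus = self.pending.popleft()
            if cpus <= self.free_cpus:
                self.free_cpus -= cpus
                self.scheduled.append(name)
            else:
                still_pending.append((name, cpus))
        self.pending = still_pending


cluster = ToyCluster(free_cpus=2)
cluster.submit("pg-0", cpus=2)  # fits immediately
cluster.submit("pg-1", cpus=2)  # no free CPUs left, so it stays pending
print(cluster.scheduled, [n for n, _ in cluster.pending])  # ['pg-0'] ['pg-1']

cluster.add_node(cpus=2)  # scale up: the pending group is now scheduled
print(cluster.scheduled, [n for n, _ in cluster.pending])  # ['pg-0', 'pg-1'] []
```

The point of the sketch is that a job does not fail when resources are short; its placement group simply waits and is placed as soon as the cluster grows.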

Related PR: #572

Test plan:

Mock cluster scaling with ray.cluster_utils.

ntlm1686, Aug 11 '22 18:08

Codecov Report

Merging #580 (ac0e8a9) into main (515a265) will increase coverage by 0.18%. The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #580      +/-   ##
==========================================
+ Coverage   94.76%   94.94%   +0.18%     
==========================================
  Files          67       67              
  Lines        4047     4134      +87     
==========================================
+ Hits         3835     3925      +90     
+ Misses        212      209       -3     

| Impacted Files | Coverage Δ | |
|---|---|---|
| torchx/schedulers/ray_scheduler.py | 95.23% <ø> (-0.03%) | :arrow_down: |
| torchx/components/dist.py | 96.42% <100.00%> (+7.06%) | :arrow_up: |
| torchx/schedulers/ray/ray_common.py | 100.00% <100.00%> (ø) | |
| torchx/schedulers/ray/ray_driver.py | 98.27% <100.00%> (+2.44%) | :arrow_up: |
| torchx/specs/api.py | 98.40% <100.00%> (+<0.01%) | :arrow_up: |
| torchx/schedulers/kubernetes_scheduler.py | 93.80% <0.00%> (-0.15%) | :arrow_down: |
| torchx/schedulers/aws_batch_scheduler.py | 89.43% <0.00%> (-0.05%) | :arrow_down: |
| torchx/cli/cmd_list.py | 100.00% <0.00%> (ø) | |
| torchx/schedulers/local_scheduler.py | 93.12% <0.00%> (ø) | |

... and 4 more

codecov[bot], Aug 11 '22 18:08

@d4l3k has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot, Aug 18 '22 19:08