
Add instructions on how to run verl on multi-node

Open vermouth1992 opened this issue 9 months ago • 3 comments

  • Point Slurm users to the directions for running a Ray cluster on Slurm: https://docs.ray.io/en/latest/cluster/vms/user-guides/community/slurm.html
  • Give examples using ray job submit
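
As a rough illustration of the second point, here is a minimal sketch of what a `ray job submit` example could look like once a Ray cluster is already up (for instance `ray start --head` on the head node and `ray start --address=<head_ip>:6379` on each worker). The helper name `build_ray_job_submit`, the IP `10.0.0.1`, and the use of `verl.trainer.main_ppo` with the `trainer.nnodes` / `trainer.n_gpus_per_node` overrides are assumptions made for illustration, not something this issue specifies; 8265 is Ray's default dashboard port.

```python
import shlex


def build_ray_job_submit(head_ip, nnodes, gpus_per_node,
                         dashboard_port=8265, working_dir="."):
    """Build the argv for `ray job submit` (hypothetical helper).

    Assumes the Ray dashboard is reachable at head_ip:dashboard_port and
    that the trainer entry point and overrides below match your verl setup.
    """
    return [
        "ray", "job", "submit",
        "--address", f"http://{head_ip}:{dashboard_port}",
        "--working-dir", working_dir,
        "--",  # everything after this is the entrypoint run on the cluster
        "python3", "-m", "verl.trainer.main_ppo",  # assumed entry point
        f"trainer.nnodes={nnodes}",                # assumed config keys
        f"trainer.n_gpus_per_node={gpus_per_node}",
    ]


if __name__ == "__main__":
    cmd = build_ray_job_submit("10.0.0.1", nnodes=2, gpus_per_node=8)
    # Print the command for inspection; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Building the command as a list keeps quoting explicit and makes it easy to add further overrides before handing it to `subprocess.run`.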

vermouth1992 avatar Feb 14 '25 11:02 vermouth1992

1

echo-valor avatar Feb 19 '25 08:02 echo-valor

+1

none0663 avatar Feb 20 '25 03:02 none0663

We're using ray start with Kubeflow PyTorchJob and verl for multi-node training internally. Can you share more about what kind of contribution you're looking for, so I can see how to help?
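
For readers unfamiliar with this pattern, one possible sketch (not necessarily the setup described above) is to let each pod pick its `ray start` command from the environment variables that PyTorchJob injects. The helper name `ray_start_command` and the choice of port 6379 are assumptions; `RANK` and `MASTER_ADDR` are the variables the Kubeflow training operator normally sets.

```python
import os


def ray_start_command(env=None, ray_port=6379):
    """Return the `ray start` argv for this pod (hypothetical helper).

    Assumes RANK and MASTER_ADDR are set by PyTorchJob: the rank-0 pod
    becomes the Ray head, and every other pod joins it as a worker.
    """
    env = os.environ if env is None else env
    rank = int(env.get("RANK", "0"))
    if rank == 0:
        # Head node: start the Ray head process and keep it in the foreground.
        return ["ray", "start", "--head", f"--port={ray_port}", "--block"]
    # Worker node: connect to the head at the master pod's address.
    return ["ray", "start", f"--address={env['MASTER_ADDR']}:{ray_port}", "--block"]


if __name__ == "__main__":
    print(" ".join(ray_start_command()))
```

With the cluster formed this way, the training job itself would then be launched from the head pod or submitted to it.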

SwordFaith avatar Feb 27 '25 10:02 SwordFaith