lucas
Maybe this is the sustainable answer for JAX on multi-node, multi-GPU: [JAX Container Early Access](https://developer.nvidia.com/jax-container-early-access)
Facing the same issue.
Current features are delightful to use. Would be great to have GPU support. Thanks for your work.
Oh my lol. Thanks for pointing that out.
Got the same error as well with both `web` and `Public client/native`. Starting to dig into it.
Gave up and used `mc_port`. It works.
Don't know the reason. I saw in another issue that the author mentioned you have to restart the process manually if it somehow gets disconnected when using `mc_port`. Don't know...
Same problem here. > Thanks, `deepspeed==0.12.6` works for me in my local setting, so basically I found that if you need to train the model, running the below command is...
> Confirmed. It seems the issue is several commits back. The segfault occurs when I do `git checkout eef3347194432e537e15a0fa083bc780465f8cfc`, but not when I do `git checkout 492a2e3d37711369f1d3d59682d6b6011f48cf9b`. It seems that...
Looking forward to this!!