[WIP] Initial add of distributed model

Open kwen2501 opened this issue 1 year ago • 1 comment

Added files:

  • model_dist.py: a mirror of model.py with Tensor Parallelism baked in.
  • dist_run.py: a toy example of how to run the model in a distributed way.
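
For readers new to the idea, here is a minimal sketch of what tensor parallelism does to a pair of linear layers. It is not taken from model_dist.py: it simulates two ranks inside a single process using plain Python lists, and every function name in it (split_columns, split_rows, all_reduce_sum, tp_two_layer) is hypothetical. The first weight is sharded by columns, the second by rows, and the per-rank partial outputs are summed, which stands in for the all-reduce a real distributed run would perform.

```python
# Illustration only: simulates tensor parallelism in one process with plain lists.
# Not the code from model_dist.py; all helper names are hypothetical.

def matmul(a, b):
    """Multiply matrix a (m x k) by matrix b (k x n)."""
    return [
        [sum(a[i][p] * b[p][j] for p in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def split_columns(w, world_size):
    """Shard a weight matrix column-wise, one shard per simulated rank."""
    step = len(w[0]) // world_size
    return [[row[r * step:(r + 1) * step] for row in w] for r in range(world_size)]

def split_rows(w, world_size):
    """Shard a weight matrix row-wise, one shard per simulated rank."""
    step = len(w) // world_size
    return [w[r * step:(r + 1) * step] for r in range(world_size)]

def all_reduce_sum(partials):
    """Element-wise sum of per-rank partial outputs (stand-in for an all-reduce)."""
    rows, cols = len(partials[0]), len(partials[0][0])
    return [[sum(p[i][j] for p in partials) for j in range(cols)] for i in range(rows)]

def tp_two_layer(x, w1, w2, world_size=2):
    """Column-parallel first layer, row-parallel second layer, then reduce."""
    w1_shards = split_columns(w1, world_size)
    w2_shards = split_rows(w2, world_size)
    # Each simulated rank only ever touches its own shards of w1 and w2.
    partials = [
        matmul(matmul(x, w1_shards[r]), w2_shards[r]) for r in range(world_size)
    ]
    return all_reduce_sum(partials)

x = [[1, 2]]                            # 1 x 2 input
w1 = [[1, 0, 2, 1], [0, 1, 1, 3]]       # 2 x 4 weight, split by columns
w2 = [[1, 2], [0, 1], [3, 0], [1, 1]]   # 4 x 2 weight, split by rows

reference = matmul(matmul(x, w1), w2)   # unsharded result
sharded = tp_two_layer(x, w1, w2)       # result from two simulated ranks
print(reference == sharded)
```

The sharded and unsharded results agree because the column split of the first weight lines up with the row split of the second, so each rank computes an independent partial sum. The sketch omits the activation between the layers; an element-wise activation would not change the sharding pattern.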

Test:

torchrun --nproc-per-node 2 dist_run.py

kwen2501 avatar Aug 21 '24 17:08 kwen2501

:link: Helpful Links

:test_tube: See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchchat/1044

Note: Links to docs will display an error until the docs builds have been completed.

:white_check_mark: No Failures

As of commit c8dc18a48d66f51855d89294f3ca800692cd5dad with merge base 925b7bd73f110dd1fb378ef80d17f0c6a47031a6: :green_heart: Looks good so far! There are no failures yet. :green_heart:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

pytorch-bot[bot] avatar Aug 21 '24 17:08 pytorch-bot[bot]

Closing in favor of the duplicate PR https://github.com/pytorch/torchchat/pull/1059, which was created with ghstack.

kwen2501 avatar Aug 24 '24 06:08 kwen2501