Feature request: FSDP for TPUs

Open OhadRubin opened this issue 3 years ago • 7 comments

A recent contribution to the pytorch_xla repo (https://github.com/pytorch/xla/pull/3431) allows using FSDP in PyTorch XLA to shard Module parameters across data-parallel workers. Some motivation behind this: it may be possible to perform inference with OPT 30B on Google Colab without needing a Pro subscription, which I think many people will appreciate. What will be needed to add it to accelerate?
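
For readers new to the idea, here is a toy sketch of what "sharding parameters across data-parallel workers" means. It is plain Python, not the torch_xla or accelerate API, and the names `shard_flat_params` and `all_gather` are hypothetical helpers made up for illustration. It assumes the parameters have already been flattened into one list and that each worker keeps only its own equal-sized slice, with zero padding at the tail.

```python
from typing import List


def shard_flat_params(flat_params: List[float], world_size: int) -> List[List[float]]:
    """Split a flattened parameter list into one equal-length shard per worker.

    Hypothetical helper: the tail is zero-padded so every shard has the same length.
    """
    shard_len = -(-len(flat_params) // world_size)  # ceiling division
    pad = shard_len * world_size - len(flat_params)
    padded = flat_params + [0.0] * pad
    return [padded[r * shard_len:(r + 1) * shard_len] for r in range(world_size)]


def all_gather(shards: List[List[float]], original_len: int) -> List[float]:
    """Rebuild the full parameter list from every worker's shard, dropping the padding."""
    full = [x for shard in shards for x in shard]
    return full[:original_len]


params = [0.1, 0.2, 0.3, 0.4, 0.5]
shards = shard_flat_params(params, world_size=4)  # each worker stores one shard
restored = all_gather(shards, len(params))        # gathered only when needed for compute
```

The memory saving comes from each worker holding roughly `1 / world_size` of the parameters at rest; the full set is only materialized temporarily when a layer needs it.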

OhadRubin avatar Jun 02 '22 10:06 OhadRubin

Once the next release of PyTorch XLA is out, we'll start taking a look at this.

muellerzr avatar Jun 02 '22 11:06 muellerzr

Hey @muellerzr, is there ongoing work for adding XLA support to FSDP? We, on the AWS SageMaker training compiler side, have started looking into XLA-FSDP and might be able to contribute to adding such support to accelerate.

Vatshank avatar Nov 03 '22 00:11 Vatshank

@Vatshank not yet! It's the next thing on my list to get to after TPU pod support, so would love the help if you guys can! 🙏

muellerzr avatar Nov 03 '22 00:11 muellerzr

Okay cool @muellerzr! Although our focus is on GPUs, I am sure there will be significant overlap in the code for adding support for either device type.

What do you think would be a good way to discuss some of these implementation details? A shared Slack group for development, for instance, if you guys have one. Also happy to continue to bug you on GitHub, if that's preferred :)

Vatshank avatar Nov 03 '22 01:11 Vatshank

@Vatshank this GitHub issue should be fine!

muellerzr avatar Nov 03 '22 15:11 muellerzr

@AlexWertheim With your recent PR, can we call this request done?

JackCaoG avatar Jun 16 '23 22:06 JackCaoG

> @AlexWertheim With your recent PR, can we call this request done?

Yeah, I think so. For reference, the PR in question can be seen here. @muellerzr can say better than I can whether this fulfills all requirements where accelerate is concerned.

AlexWertheim avatar Jun 16 '23 22:06 AlexWertheim