avizon-aws
@vkuzo, is there a plan for MXFP8 all-gather? If so, when is the feature expected to be enabled?
When you say "merge the moe and dense codebases and bring them out of prototype", is that going to be a significant refactoring change? If there are changes, will backward...
Thanks for your response. Is there a plan for enabling MXFP8 all-gather?
Is there planned support for MXFP8 all-gather as well? I can see it being helpful for TP/SP activation all-gathers and FSDP weight gathers.
Hi @danielvegamyhre, I have created a PR to support MXFP8 all-gather. It would be great if you could review it: https://github.com/pytorch/ao/pull/3435