Changes to support latent MoEs
Thanks for the work. Could you also add unit tests and integration tests covering this feature combined with EP/TP?
This pull request requires additional validation before any workflows can run on NVIDIA's runners.
/ok to test dd645a8e848c92fc93037e6e6bfc729fa72d3f11
/ok to test 80f1d5f
/ok to test 4257d6a
/ok to test 44b708d
/ok to test ec588117fd113a98030d84765240a6780027add6