
[Feature Request] MoE plugin for onnx

Open nan2088 opened this issue 1 year ago • 1 comments

Many models use MoE (mixture of experts) layers, for instance:

https://github.com/PKU-YuanGroup/MoE-LLaVA/blob/0549ce0e65119858399d2e4e88ddb4cd3db4c133/moellava/model/language_model/llava_stablelm_moe.py#L483

It would be great if such a model could be exported to ONNX with a custom ONNX node, and if TensorRT could support that node through a plugin.

TensorRT-LLM already has such a plugin. Is it possible to make a general MoE plugin for TensorRT (without TensorRT-LLM, in line with DeepSpeed's MoE)?

PS. MoE in ONNX Runtime (the `com.microsoft.MoE` contrib operator): https://github.com/microsoft/onnxruntime/blob/884acd4598a437521921dfdec596923afa3f4ed1/docs/ContribOperators.md#commicrosoftmoe
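
To make the request concrete, here is a rough pure-Python sketch of the computation a MoE node would need to fuse for a single token: route, pick the top-k experts, and mix their outputs. It is only an illustration under assumptions, not the TensorRT-LLM plugin or the ONNX Runtime operator. The names `moe_forward`, `router_weights`, and `experts` are hypothetical, and renormalizing the gate weights over the selected experts is one common choice that not every implementation makes.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(token, router_weights, experts, top_k=2):
    """Hypothetical reference for one token through a top-k MoE layer.

    token:          list of floats (hidden vector)
    router_weights: one weight row per expert, same length as token
    experts:        list of callables mapping a vector to a vector
    """
    # Router: one logit per expert (dot product of its row with the token).
    logits = [sum(w * x for w, x in zip(row, token)) for row in router_weights]
    probs = softmax(logits)

    # Keep the top_k experts with the highest routing probability.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]

    # Assumption: gate weights are renormalized over the selected experts.
    norm = sum(probs[i] for i in top)

    out = [0.0] * len(token)
    for i in top:
        gate = probs[i] / norm
        expert_out = experts[i](token)
        out = [o + gate * v for o, v in zip(out, expert_out)]
    return out, top
```

A plugin would do the same thing batched over all tokens, which is where the gather/scatter of tokens per expert makes a fused kernel worthwhile.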

nan2088, May 24 '24 06:05

@rajeevsrao ^ ^

zerollzeng, May 27 '24 01:05