[Feature Request] export to onnx with float8 type

Open xadupre opened this issue 2 years ago • 3 comments

The latest onnx package (1.14) supports float8 types. Are there any plans to use them when exporting a model to ONNX?
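For readers who want to check whether their installed onnx package exposes these types, here is a minimal sketch. It assumes the four float8 element type names from the ONNX `TensorProto.DataType` enum; the helper name `available_float8_types` is hypothetical and only for illustration. It returns an empty dict if onnx is not installed or predates float8 support.

```python
# Float8 element type names as defined in the ONNX TensorProto.DataType enum.
FLOAT8_NAMES = ["FLOAT8E4M3FN", "FLOAT8E4M3FNUZ", "FLOAT8E5M2", "FLOAT8E5M2FNUZ"]


def available_float8_types():
    """Return {name: enum value} for the float8 types the installed onnx exposes.

    Hypothetical helper: returns an empty dict when onnx is missing or too old.
    """
    try:
        import onnx
    except ImportError:
        return {}
    return {
        name: getattr(onnx.TensorProto, name)
        for name in FLOAT8_NAMES
        if hasattr(onnx.TensorProto, name)
    }


if __name__ == "__main__":
    found = available_float8_types()
    if found:
        for name, value in found.items():
            print(f"{name} = {value}")
    else:
        print("No float8 types found (onnx missing or older than 1.14).")
```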

xadupre avatar May 17 '23 17:05 xadupre

Hi @xadupre, yes, this is in progress right now.

I'm not at liberty to discuss the internal plans publicly, but you can review our current unit tests for ONNX export here: test_onnx_export.py. These tests export a variety of graphs to ONNX, using custom plugins for FP8 quantization.

galagam avatar May 21 '23 04:05 galagam

Any updates? It would be great to have ORT_TENSORRT_FP8_ENABLE

armintoepfer avatar Feb 02 '24 16:02 armintoepfer

TransformerEngine ONNX export is based on TorchScript export. The latest ONNX opset supported in TorchScript is opset 18. Since FP8 data types were first introduced in opset 19, exporting FP8 is not supported using TorchScript. Although no official statement has been given, I think it's safe to assume that TorchScript is on the path to deprecation; see this discussion.
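The opset constraint described above can be sketched as a simple check. This is only an illustration using the two opset numbers quoted in this comment; the constant and function names are hypothetical and not part of TransformerEngine or PyTorch.

```python
# Opset numbers as stated in this thread (not queried from any library).
FP8_MIN_OPSET = 19          # first ONNX opset with FP8 data types
TORCHSCRIPT_MAX_OPSET = 18  # latest opset supported by TorchScript export


def exporter_supports_fp8(exporter_max_opset: int) -> bool:
    """Return True if an exporter limited to `exporter_max_opset` can emit FP8 types."""
    return exporter_max_opset >= FP8_MIN_OPSET


if __name__ == "__main__":
    # TorchScript tops out one opset short of FP8 support.
    print("TorchScript can export FP8:", exporter_supports_fp8(TORCHSCRIPT_MAX_OPSET))
    # Any exporter that reaches opset 19 would clear the requirement.
    print("Opset-19 exporter can export FP8:", exporter_supports_fp8(19))
```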

TorchDynamo is a new export method, but it is still in beta (https://pytorch.org/docs/stable/onnx_dynamo.html). Perhaps @timmoon10 can provide some more information about integrating TorchDynamo export.

galagam avatar Feb 04 '24 09:02 galagam