Colin
https://github.com/onnx/tensorflow-onnx/blob/1528091559b5246207c09cccc45a33e671b1f662/tf2onnx/rewriter/gemm_rewriter.py#L74
Einsum is already supported by most frameworks, such as [TensorRT](https://github.com/NVIDIA/TensorRT/issues/1617#issuecomment-992266722), [OpenVINO](https://docs.openvino.ai/2022.3/openvino_docs_ops_matrix_Einsum_7.html), and [ONNX](https://onnx.ai/onnx/operators/onnx__Einsum.html). It is therefore no longer necessary to implement Einsum by combining operators, which may cause BF16 precision to degrade.
For details, refer to this [issue](https://github.com/tensorflow/tensorflow/issues/73922).
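As a minimal sketch of using the native op, the graph below builds a single ONNX `Einsum` node (available since opset 12) instead of a decomposed MatMul/Transpose/Reshape chain; the `bij,bjk->bik` equation and the tensor shapes are illustrative assumptions, not taken from any model above.

```python
# Minimal sketch: one native ONNX Einsum node (opset >= 12), rather
# than a decomposition into MatMul/Transpose/Reshape operators.
# The equation and shapes below are illustrative assumptions.
import onnx
from onnx import helper, TensorProto

node = helper.make_node(
    "Einsum",
    inputs=["a", "b"],
    outputs=["y"],
    equation="bij,bjk->bik",  # batched matmul expressed as one Einsum
)
graph = helper.make_graph(
    [node],
    "einsum_example",
    inputs=[
        helper.make_tensor_value_info("a", TensorProto.FLOAT, [2, 3, 4]),
        helper.make_tensor_value_info("b", TensorProto.FLOAT, [2, 4, 5]),
    ],
    outputs=[helper.make_tensor_value_info("y", TensorProto.FLOAT, [2, 3, 5])],
)
model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 17)])
onnx.checker.check_model(model)  # passes: Einsum is a first-class op
```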
XLA is very fast, but it requires padding inputs to fixed shapes in order to handle the varying shapes of online service requests.
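To illustrate that trade-off, here is a minimal sketch (assuming TensorFlow, a hypothetical `MAX_LEN` bucket size, and a toy model in place of the real one) of padding variable-length requests to a static shape before calling an XLA-compiled function:

```python
import tensorflow as tf

MAX_LEN = 128  # hypothetical fixed bucket size chosen for the service

@tf.function(jit_compile=True)  # compiled by XLA; shapes must be static
def score(x):
    # toy stand-in for the real model
    return tf.reduce_sum(tf.nn.relu(x), axis=-1)

def serve(request_batch):
    # Pad the variable request length up to the static bucket size so
    # XLA compiles once per bucket instead of once per incoming shape.
    pad = MAX_LEN - request_batch.shape[1]
    return score(tf.pad(request_batch, [[0, 0], [0, pad]]))

print(serve(tf.ones([4, 37])))  # length 37 is padded to 128 before XLA runs
```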
**Description** I used the latest image version 24.06 because the corresponding latest version of TensorRT has support for BF16. But when I deploy the model with the TRT backend, I used perf_analyzer...
https://github.com/liuzengh/design-pattern/blob/d2b299088ee87df9508ad11f37798a3b919f5815/README.md?plain=1#L62