Neal Vaidya
Results: 3 issues of Neal Vaidya
ONNX Runtime has added "preview mode" support for [CUDA Graphs](https://onnxruntime.ai/docs/performance/tune-performance.html#using-cuda-graphs-in-the-cuda-ep) in the CUDA Execution Provider. It would be useful to expose this option for the ONNX Runtime Triton backend as...
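
For context, here is a minimal sketch of how the option is switched on when ONNX Runtime is used directly from Python, outside Triton. It assumes the `enable_cuda_graph` provider option of the CUDA Execution Provider; the helper names `cuda_graph_providers` and `create_session` and the model path are hypothetical, and this is not the Triton backend's configuration syntax, which the issue is asking for.

```python
from typing import Any, Dict, List, Tuple


def cuda_graph_providers(enable_cuda_graph: bool = True) -> List[Tuple[str, Dict[str, Any]]]:
    """Build a providers list for the CUDA Execution Provider.

    Hypothetical helper: it only assembles the provider options and does
    not touch the GPU. Option values are passed as strings ("1" / "0").
    """
    cuda_options = {"enable_cuda_graph": "1" if enable_cuda_graph else "0"}
    return [("CUDAExecutionProvider", cuda_options)]


def create_session(model_path: str, enable_cuda_graph: bool = True):
    """Create an InferenceSession with CUDA Graphs toggled as requested.

    Assumes onnxruntime-gpu is installed and a CUDA device is available;
    the import is deferred so the helper above works without either.
    """
    import onnxruntime as ort

    return ort.InferenceSession(
        model_path,
        providers=cuda_graph_providers(enable_cuda_graph),
    )


if __name__ == "__main__":
    # Inspect the options that would be handed to ONNX Runtime.
    print(cuda_graph_providers(True))
```

A call such as `create_session("model.onnx")` (placeholder path) would then build the session. Because the feature is in preview, the linked tuning guide should be checked for the constraints that apply when capturing and replaying graphs.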
Adds a Popular Models Guide for e5 models (specifically [`intfloat/e5-large-v2`](https://huggingface.co/intfloat/e5-large-v2)) with TensorRT on Triton.