Neal Vaidya

3 issues by Neal Vaidya

ONNX Runtime has added "preview mode" support for [CUDA Graphs](https://onnxruntime.ai/docs/performance/tune-performance.html#using-cuda-graphs-in-the-cuda-ep) in the CUDA Execution Provider. It would be useful to expose this option in the ONNX Runtime Triton backend as...

Adds a Popular Models Guide for e5 models (specifically [`intfloat/e5-large-v2`](https://huggingface.co/intfloat/e5-large-v2)) with TensorRT on Triton.