Amit Timalsina

Results 2 issues of Amit Timalsina

I have seen that most people implement only the two-tower architecture for the retrieval model. I wanted to try other architectures. So, if you have implemented or...

#### What does the PR do?

This PR adds support for Hermes-style tool calling to the Triton Inference Server OpenAI API frontend. The implementation introduces a new `HermesToolParser` that...
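For context, Hermes-style models typically emit each tool call as a JSON object wrapped in `<tool_call>...</tool_call>` tags inside the generated text. The sketch below illustrates that general idea only; it is not the PR's `HermesToolParser`, and the function name `parse_hermes_tool_calls` and its return shape are assumptions made for illustration.

```python
import json
import re

# Assumed convention: each call is a JSON object wrapped in <tool_call> tags.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def parse_hermes_tool_calls(text):
    """Hypothetical helper: split model output into plain content and tool calls."""
    calls = []
    for match in TOOL_CALL_RE.finditer(text):
        try:
            payload = json.loads(match.group(1))
        except json.JSONDecodeError:
            # Skip malformed payloads instead of failing the whole response.
            continue
        if isinstance(payload, dict) and "name" in payload:
            calls.append(
                {
                    "name": payload["name"],
                    "arguments": payload.get("arguments", {}),
                }
            )
    # Whatever remains after removing the tagged spans is ordinary content.
    content = TOOL_CALL_RE.sub("", text).strip()
    return content, calls


if __name__ == "__main__":
    sample = (
        "Checking.\n"
        '<tool_call>\n{"name": "get_weather", "arguments": {"city": "Paris"}}\n</tool_call>'
    )
    print(parse_hermes_tool_calls(sample))
```

A real frontend would presumably also need to handle streaming output, where a tag can arrive split across chunks; this sketch only covers a complete response string.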