KrishnanPrash
#### What does the PR do?
This PR adds the `tritonfrontend` Python package, which contains bindings to the `HTTPAPIServer` and `grpc::Server` classes in the frontend. This allows Triton users to...
#### What does the PR do?
Abstracts error handling in `tritonfrontend/_api` behind the `@handle_triton_error` decorator. This removes the need to wrap every function in `_api/_.py` with a `try:...
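To illustrate the pattern described above, here is a minimal sketch of an error-handling decorator. It is not the PR's actual implementation: the exception class `TritonFrontendError` and the function `start_service` are hypothetical names chosen for illustration, and only the decorator name `handle_triton_error` comes from the description.

```python
import functools


class TritonFrontendError(Exception):
    """Hypothetical stand-in for whatever error type the package raises."""


def handle_triton_error(func):
    """Wrap func so any unexpected exception is re-raised as one error type."""

    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except TritonFrontendError:
            # Already the expected type: pass it through unchanged.
            raise
        except Exception as exc:
            # Convert anything else, keeping the original as __cause__.
            raise TritonFrontendError(str(exc)) from exc

    return wrapper


@handle_triton_error
def start_service(port):
    # Hypothetical API function; no explicit try/except is needed here.
    if not 0 < port < 65536:
        raise ValueError(f"invalid port: {port}")
    return f"listening on {port}"
```

With this shape, each API function carries a single decorator line instead of its own `try`/`except` block, and callers only need to catch one exception type.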
#### What does the PR do?
Adds support for `Metrics` in `tritonfrontend`. This involves two components:
- In `tritonfrontend_pybind.cc`, added bindings for `HTTPMetricsServer`
- In `tritonfrontend/_api/_metrics.py`, added a `Metrics` class...
Updated the PyBind11 version from `v2.10.0` to `v2.12.0`, because `v2.12` supports `numpy 2` ([Release Notes](https://github.com/pybind/pybind11/releases/tag/v2.12.0)). Before this update, if `numpy>=2.x` was installed in the same environment, performing inference requests would...
This PR addresses a silent error that is not caught in `tritonclient.http.is_server_live()` and `tritonclient.http.is_server_ready()`. Currently, if we start a tritonserver instance with a health [restricted feature specification](https://github.com/triton-inference-server/server/blob/main/docs/customization_guide/inference_protocols.md#limit-endpoint-access-beta), with...