Harshini Komali

Results: 6 issues by Harshini Komali

Related PR: https://github.com/triton-inference-server/core/pull/338. Changes: updated the DetermineStatsModelVersion() and MergeStatistics() functions to handle the cache-hit scenario in which the top-level ensemble request is served from the cache, so the composing models are never executed. Added tests for DetermineStatsModelVersion().
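The statistics problem above can be illustrated with a minimal sketch. The struct and function names below are hypothetical simplifications, not Triton's real types: the point is that on a top-level ensemble cache hit only the ensemble's own counters change, because the composing models never run.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stat counters (not Triton's actual types).
struct ModelStats {
  uint64_t success_count = 0;
  uint64_t cache_hit_count = 0;
};

// Sketch of the accounting rule: a cache hit on the top-level ensemble
// request updates only the ensemble's statistics; the composing models
// are skipped entirely, so their counters must stay untouched.
void RecordEnsembleRequest(bool cache_hit, ModelStats& ensemble,
                           std::vector<ModelStats>& composing) {
  ensemble.success_count++;
  if (cache_hit) {
    ensemble.cache_hit_count++;
    return;  // composing models were never executed
  }
  for (auto& m : composing) {
    m.success_count++;  // normal path: every ensemble step ran
  }
}
```

A stats-merging routine that assumed every ensemble request also executed the composing models would over-count them on cache hits, which is the scenario the PR's changes guard against.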

Ref. Slack thread: https://nvidia.slack.com/archives/CAZKCU4UV/p1677717244222069. Caching of the top-level request sent to the ensemble scheduler is currently not supported. Implemented caching of top-level requests for ensemble models. In case of a cache hit,...
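Enabling the cache for an ensemble uses the same model-configuration field as for any other model. A minimal, illustrative config.pbtxt (the model name and the elided scheduling steps are placeholders):

```protobuf
# config.pbtxt for an ensemble model (illustrative)
name: "my_ensemble"
platform: "ensemble"
response_cache {
  enable: true
}
ensemble_scheduling {
  # composing-model steps go here
}
```

With this in place, a repeated top-level request can be answered from the cache without dispatching any of the composing models.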

Related PR: https://github.com/triton-inference-server/core/pull/338. Added 4 new tests in L0_response_cache to cover top-level request caching for ensemble models. Test 1: when both the cache and decoupled mode are enabled in the ensemble model config: Error...
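The first test exercises an invalid configuration: response caching assumes one response per request, while a decoupled model may produce zero or many responses, so combining the two settings should be rejected at model load time. An illustrative config.pbtxt fragment for that case:

```protobuf
# Invalid combination covered by Test 1: decoupled mode together with
# the response cache should produce a load-time error.
model_transaction_policy {
  decoupled: true
}
response_cache {
  enable: true
}
```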

Top level response caching for ensemble models

This check is needed so that the test_inference_profiler unit test doesn't fail.

Added a version folder for ensemble_model inside the model repository. Changed shm-size from 256m to 1G. These changes are required to run the ensemble model example on Triton 23.12.
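The two changes above can be sketched as follows; the repository path and container invocation are illustrative placeholders. Triton expects each model directory to contain at least one numeric version subdirectory (it may be empty for ensembles), and the larger shared-memory size is passed when launching the container:

```shell
# Expected repository layout: a numeric version folder under the model.
#   model_repository/
#   └── ensemble_model/
#       ├── 1/            <- version folder (may be empty for an ensemble)
#       └── config.pbtxt

# Illustrative launch of the 23.12 container with the larger shm size.
docker run --rm --shm-size=1G \
  -v /path/to/model_repository:/models \
  nvcr.io/nvidia/tritonserver:23.12-py3 \
  tritonserver --model-repository=/models
```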