Nikhil Kulkarni
Hi @joostwestra, thank you for creating this issue. Could you please share your config.pbtxt file for context?
@david-waterworth Right, to load the BLS model explicitly, the other models it refers to also have to be loaded at the same time using the `--load-model` argument. The above hack...
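As a rough sketch (not the exact setup from this issue), the same explicit loads can also be issued at runtime through the model-repository API, assuming the server was started with `--model-control-mode=explicit`; the model names below are hypothetical placeholders:

```python
# Hedged sketch: explicitly load the models a BLS model composes, then the BLS model itself.
# Assumes a server running with --model-control-mode=explicit on localhost:8000.
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Load every model the BLS model calls into, along with the BLS model itself.
for name in ["preprocess_model", "composed_model", "bls_model"]:  # placeholder names
    client.load_model(name)
    assert client.is_model_ready(name), f"{name} failed to load"
```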
@david-waterworth I will look into this; it may be an ordering issue between `additional_args` and the `log_info` workaround. Please continue using the workaround. I will update the thread once I have more...
@jadhosn Could you share more about the objective you are trying to achieve, and the exact failure you are seeing? Note that in MME mode, SageMaker will handle model...
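For illustration only, this is roughly what invoking a multi-model endpoint looks like from the caller's side; the endpoint name, model archive name, and payload are placeholders, and the actual payload format depends on the serving stack behind the endpoint:

```python
# Hedged sketch of an MME invocation: the caller selects the model per request via
# TargetModel, and SageMaker takes care of loading/unloading models on the instance.
import json
import boto3

runtime = boto3.client("sagemaker-runtime")

response = runtime.invoke_endpoint(
    EndpointName="my-mme-endpoint",     # placeholder endpoint name
    TargetModel="model-a.tar.gz",       # which model archive under the S3 prefix to route to
    ContentType="application/json",
    Body=json.dumps({"inputs": [1, 2, 3]}),  # placeholder payload
)
print(response["Body"].read())
```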
cc: @mufaddal-rohawala for review
All changes for SageMaker are upstreamed to Triton's GitHub repo (https://github.com/triton-inference-server/server), so the SM-Triton image is essentially the same image as the NGC container, with the following backends...
Closing this since the issue has been addressed. For an overview of the Triton image build, please refer to the comment above: https://github.com/aws/deep-learning-containers/issues/1557#issuecomment-1551088683
Hi @geraldstanje, we don't support the TRT-LLM container for Triton on SageMaker yet. Most changes to support SageMaker are already upstreamed, and the above container should work with SageMaker directly...
@geraldstanje Based on your initial comment, you want to run TRT-LLM on SageMaker, is that correct? I'm trying to say that the NVIDIA TRT-LLM image will work just fine on...
@chen3933 It does not seem common to implement a predict_fn, input_fn, or output_fn that handles only len(data) == 1, but if a customer has implemented them to process only one request, e.g., with an assert check...
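As a minimal sketch of what a batch-tolerant handler could look like (the model object and record schema are placeholders, not the customer's actual code):

```python
# Hedged sketch of a predict_fn that does not assume a single record.
# Instead of asserting len(data) == 1, iterate over whatever input_fn returned.
def predict_fn(data, model):
    # input_fn may hand over either one record or a list of records.
    records = data if isinstance(data, list) else [data]
    return [model.predict(record) for record in records]
```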