Ryan Olson
TRTIS has an ensemble API which allows you to chain models together.
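For concreteness, the chaining is driven entirely by the model configuration: you declare an ensemble "model" whose steps map one model's outputs to the next model's inputs. A rough sketch of a two-step `config.pbtxt` (the model and tensor names here are placeholders I made up, so adjust them to your models):

``` protobuf
name: "model_a_then_b"
platform: "ensemble"
max_batch_size: 8
input [
  {
    name: "RAW_INPUT"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "FINAL_OUTPUT"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
ensemble_scheduling {
  step [
    {
      # Step 1: run model_a on the ensemble's input.
      model_name: "model_a"
      model_version: -1
      input_map { key: "INPUT" value: "RAW_INPUT" }
      output_map { key: "OUTPUT" value: "features" }
    },
    {
      # Step 2: feed model_a's output into model_b.
      model_name: "model_b"
      model_version: -1
      input_map { key: "INPUT" value: "features" }
      output_map { key: "OUTPUT" value: "FINAL_OUTPUT" }
    }
  ]
}
```

The intermediate tensor ("features") never leaves the server, so the client makes a single inference request against the ensemble.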
You might consider the *new and just released* TensorRT Inference Server, which gives you a super simple mechanism for serving TensorRT engine files. https://ngc.nvidia.com/registry/nvidia-inferenceserver https://github.com/NVIDIA/dl-inference-server Or if you are looking for a...
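If it helps, serving a serialized engine is mostly a matter of dropping it into the model repository. A minimal layout looks something like this (directory names are illustrative; `model.plan` is the expected filename for a TensorRT engine):

```
model_repository/
└── my_trt_model/
    ├── config.pbtxt        # model configuration
    └── 1/                  # version directory
        └── model.plan      # serialized TensorRT engine
```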
I’ll whip up an example. Help me understand your use case a bit more, and I’ll see if I can put together an example that helps you get where you want to go.
The outputs of Model A are the inputs for Model B? How about this for an example (see the sketch below):

- Decompose ResNet-152 into two TensorRT engines
- Model A = base model...
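Here is a sketch of the decomposition step, using PyTorch/torchvision purely for illustration; the split point and the names are my own choices, not anything prescribed by TRTIS:

``` python
import torch
import torchvision.models as models

resnet = models.resnet152(pretrained=True).eval()

# Model A: the convolutional base, up to and including global average pooling.
model_a = torch.nn.Sequential(*list(resnet.children())[:-1])

# Model B: the final fully connected classifier head.
model_b = torch.nn.Sequential(torch.nn.Flatten(), resnet.fc)

# Model A's output is exactly Model B's input.
x = torch.randn(1, 3, 224, 224)
features = model_a(x)        # shape: (1, 2048, 1, 1)
logits = model_b(features)   # shape: (1, 1000)

# Export each half to ONNX so it can be built into a TensorRT engine.
torch.onnx.export(model_a, x, "model_a.onnx")
torch.onnx.export(model_b, features, "model_b.onnx")
```

From there, each ONNX file can be parsed into its own TensorRT engine and the pair chained together.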
Good question. NvRPC in TRTIS originated from this project. I hope someday the team pulls in the TensorRT runtime.
We should definitely add release tags. I think once I finalize the massive changes on the memory-refactor branch, that would be a good time to start doing tagged releases.
@apiguy - thoughts?
I like @aroberts' solution in Issue #59 better.
Compatibility with the common `HTTPMethodOverrideMiddleware` would be great.

``` python
class HTTPMethodOverrideMiddleware(object):
    """The HTTPMethodOverrideMiddleware middleware implements the hidden
    HTTP method technique. Not all web browsers support every HTTP method, such...
```
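The snippet above got cut off, so for context: the canonical version of this middleware, as it circulates in the Flask docs and snippets, is roughly the following (reproduced from memory, so double-check against your copy):

``` python
class HTTPMethodOverrideMiddleware(object):
    """Implements the hidden HTTP method technique: clients that can only
    issue GET/POST send an X-HTTP-Method-Override header naming the real
    method, and this middleware rewrites the WSGI environ to match."""

    allowed_methods = frozenset([
        'GET', 'HEAD', 'POST', 'DELETE', 'PUT', 'PATCH', 'OPTIONS'
    ])
    # Methods that must not carry a request body.
    bodyless_methods = frozenset(['GET', 'HEAD', 'OPTIONS', 'DELETE'])

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        method = environ.get('HTTP_X_HTTP_METHOD_OVERRIDE', '').upper()
        if method in self.allowed_methods:
            environ['REQUEST_METHOD'] = method
        if method in self.bodyless_methods:
            environ['CONTENT_LENGTH'] = '0'
        return self.app(environ, start_response)
```

It is installed by wrapping the WSGI app, e.g. `app.wsgi_app = HTTPMethodOverrideMiddleware(app.wsgi_app)`.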