modelmesh-runtime-adapter
Unified runtime-adapter image of the sidecar containers which run in the modelmesh pods
chore: Allow running make goals with the desired container engine builder tool, e.g. `ENGINE=podman make build`
Currently the S3 downloader has a hardcoded limit of 100 files (see: https://github.com/kserve/modelmesh-runtime-adapter/blob/2d5bb69e9ed19efd74fbe6f8b76ec2e970702e3c/pullman/storageproviders/s3/downloader.go#L79C3-L79C27). This means that any model that contains more than 100 files gets cut off at that arbitrary...
Triton provides a streaming extension to the standard gRPC inference API (`inference.GRPCInferenceService/ModelStreamInfer`); this extension is required to use the vLLM backend with Triton. However, the Triton runtime adapter currently does...
#### Motivation Fix #80 #### Modifications Add `ModelStreamInfer` to triton MethodInfos #### Result `inference.GRPCInferenceService/ModelStreamInfer` gRPC requests can be sent to Triton, which enables the use of Triton backends and models...
The current image is very large (2.14 GB), which slows down the predictor's startup. Correct me if I'm wrong please, but the only reason the adapter needs to install TensorFlow...
Good afternoon. Thank you very much for creating this amazing framework. I see a potentially very useful feature for inference with GPU models. I have seen that the...
**Is your feature request related to a problem? If so, please describe.** Currently, the S3 storage provider uses [static credentials](https://github.com/kserve/modelmesh-runtime-adapter/blob/600f0920d916aa1d901446ddd0ae171980527ad7/pullman/storageproviders/s3/downloader.go#L43-L46) pulled from Kubernetes config. This works great for on-premise Kubernetes...
Hi ModelMesh team, I found [here](https://github.com/kserve/modelmesh-serving/blob/main/docs/runtimes/custom_runtimes.md#model-server-management-spi) that the model management API is not finalized: > Note that this is currently subject to change, but we will try to ensure...
From https://github.com/kserve/modelmesh-runtime-adapter/pull/68/files#r1403679093: > If we update the function doc strings, we should make sure they read correctly, i.e. lower case the word "Returns" here. Same for the changed doc strings...
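For the S3 downloader file-limit issue listed above, one common way to avoid a fixed cap on listed files is to keep requesting pages until the listing reports no continuation token. The sketch below only illustrates that loop. It does not use the AWS SDK or the repository's downloader code, and `page`, `lister`, `listAll`, and `fakeBucket` are hypothetical names introduced here, not part of modelmesh-runtime-adapter.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// page is one batch of object keys plus an opaque token for the next batch.
// An empty next token means the listing is complete. (Hypothetical type.)
type page struct {
	keys []string
	next string
}

// lister is a hypothetical stand-in for a paginated "list objects" call.
type lister func(token string, maxKeys int) (page, error)

// listAll follows continuation tokens until the listing is exhausted,
// so the result is not capped at a single page of maxKeys entries.
func listAll(list lister, maxKeys int) ([]string, error) {
	if maxKeys <= 0 {
		return nil, errors.New("maxKeys must be positive")
	}
	var all []string
	token := ""
	for {
		p, err := list(token, maxKeys)
		if err != nil {
			return nil, err
		}
		all = append(all, p.keys...)
		if p.next == "" {
			return all, nil
		}
		token = p.next
	}
}

// fakeBucket returns a lister over n in-memory keys. It exists only to
// exercise listAll; the token is simply the index of the next key.
func fakeBucket(n int) lister {
	return func(token string, maxKeys int) (page, error) {
		start := 0
		if token != "" {
			v, err := strconv.Atoi(token)
			if err != nil {
				return page{}, err
			}
			start = v
		}
		end := start + maxKeys
		if end > n {
			end = n
		}
		keys := make([]string, 0, maxKeys)
		for i := start; i < end; i++ {
			keys = append(keys, fmt.Sprintf("model/file-%03d", i))
		}
		next := ""
		if end < n {
			next = strconv.Itoa(end)
		}
		return page{keys: keys, next: next}, nil
	}
}
```

Calling `listAll(fakeBucket(250), 100)` walks three pages and returns all 250 keys, whereas a single call to the lister would stop at 100. A real storage provider would replace `fakeBucket` with its own paginated listing call.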