modelmesh-runtime-adapter issues

feat: Make container build engine configurable

2

chore: Allow running make goals with the desired container engine builder tool, e.g. `ENGINE=podman make build` #### Motivation #### Modifications #### Result

spolti

approved

Use latest tag to grab updates upon new build

1

#### Motivation #### Modifications #### Result

spolti

S3 downloader has a hardcoded limit of 100 files

1

Currently the S3 downloader has a hardcoded limit of 100 files (see: https://github.com/kserve/modelmesh-runtime-adapter/blob/2d5bb69e9ed19efd74fbe6f8b76ec2e970702e3c/pullman/storageproviders/s3/downloader.go#L79C3-L79C27). This means that any model that contains more than 100 files gets cut off at that arbitrary...

slik13

Triton RuntimeStatus.MethodInfos is missing ModelStreamInfer

1

Triton provides an extension to the standard gRPC inference API for streaming (`inference.GRPCInferenceService/ModelStreamInfer`); this extension is required to use the vLLM backend with Triton. However, currently the Triton runtime adapter does...

Legion2

enhancement

feat: Add ModelStreamInfer to Triton MethodInfos

1

#### Motivation Fix #80 #### Modifications Add `ModelStreamInfer` to Triton MethodInfos #### Result `inference.GRPCInferenceService/ModelStreamInfer` gRPC requests can be sent to Triton, which enables the use of Triton backends and models...

Legion2
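
As a rough sketch of what registering the streaming method alongside the unary one might look like, the Go snippet below builds a lookup table keyed by full gRPC method name. The `methodInfo` type, its `streaming` field, and the `tritonMethodInfos` and `isRoutable` helpers are assumptions made for illustration; they do not reproduce the adapter's actual `MethodInfos` structure.

```go
package main

import "fmt"

// methodInfo is a hypothetical stand-in for per-method metadata.
type methodInfo struct {
	streaming bool // true for the streaming RPC
}

// tritonService is the gRPC service name quoted in the issue.
const tritonService = "inference.GRPCInferenceService"

// tritonMethodInfos returns the methods a router would accept, keyed by
// "<service>/<method>". ModelStreamInfer is the entry the PR adds.
func tritonMethodInfos() map[string]methodInfo {
	return map[string]methodInfo{
		tritonService + "/ModelInfer":       {streaming: false},
		tritonService + "/ModelStreamInfer": {streaming: true},
	}
}

// isRoutable reports whether a full method name is registered.
func isRoutable(infos map[string]methodInfo, fullMethod string) bool {
	_, ok := infos[fullMethod]
	return ok
}

func main() {
	infos := tritonMethodInfos()
	fmt.Println(isRoutable(infos, tritonService+"/ModelStreamInfer"))
}
```

In this sketch, a method missing from the table is simply not routable, which mirrors the symptom described in the issue.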

Reduce size of runtime-adapter image (exclude Python/tensorflow to convert keras models)

1

The current image is very large (2.14 GB), which slows down the predictor's startup. Please correct me if I'm wrong, but the only reason the adapter needs to install tensorflow...

GolanLevy

[Feature Request] Reimplement Load Model of Triton and MLServer

Good afternoon. Thank you very much for creating this amazing framework. I see a potentially very useful feature for inference with GPU models. I have seen that the...

WaterKnight1998

enhancement

Feature request: support IAM Roles for Service Accounts

3

**Is your feature request related to a problem? If so, please describe.** Currently, the S3 storage provider uses [static credentials](https://github.com/kserve/modelmesh-runtime-adapter/blob/600f0920d916aa1d901446ddd0ae171980527ad7/pullman/storageproviders/s3/downloader.go#L43-L46) pulled from Kubernetes config. This works great for on-premise Kubernetes...

ianonavy

enhancement

Compatibility matrix of runtime adapters and serving runtimes

1

Hi ModelMesh team, I found [here](https://github.com/kserve/modelmesh-serving/blob/main/docs/runtimes/custom_runtimes.md#model-server-management-spi) that the model management API is not finalized: > Note that this is currently subject to change, but we will try to ensure...

lizzzcai

documentation

Follow up of doc strings pattern

From https://github.com/kserve/modelmesh-runtime-adapter/pull/68/files#r1403679093: > If we update the function doc strings, we should make sure they read correctly, i.e. lower case the word "Returns" here. Same for the changed doc strings...

spolti

documentation

good first issue

help wanted
