modelmesh

Distributed Model Serving Framework

23 modelmesh issues

#### Motivation

When modelmesh is unable to connect to the KV store to update its instance record or sync with the other instances, it cannot reliably serve...

do-not-merge/work-in-progress

I'm looking for an option to configure request timeouts for inference requests. Either a global or a per-request timeout would be nice. Currently we are experiencing many "stuck" inference...
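Until such an option exists server-side, a deadline can be enforced from the client. Below is a minimal sketch, assuming a Python stub compiled from KServe's `grpc_predict_v2.proto`; the module names, the model name, and the `localhost:8033` endpoint are assumptions that depend on the local setup:

```
# Hypothetical client-side workaround: set a per-request gRPC deadline so a
# stuck inference fails fast instead of hanging indefinitely. Module names
# below come from compiling KServe's grpc_predict_v2.proto and may differ.
import grpc
import grpc_predict_v2_pb2 as pb
import grpc_predict_v2_pb2_grpc as pb_grpc

channel = grpc.insecure_channel("localhost:8033")  # assumed ModelMesh gRPC port
stub = pb_grpc.GRPCInferenceServiceStub(channel)

request = pb.ModelInferRequest(model_name="example-model")  # placeholder model
try:
    # `timeout` is a standard grpcio keyword argument: the call is cancelled
    # with DEADLINE_EXCEEDED if no response arrives within 10 seconds.
    response = stub.ModelInfer(request, timeout=10.0)
except grpc.RpcError as err:
    if err.code() == grpc.StatusCode.DEADLINE_EXCEEDED:
        print("inference timed out after 10s")
    else:
        raise
```

This only bounds how long the caller waits; the request may still occupy the serving runtime, which is why a server-side timeout is being requested.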

I'm currently trying to set up streaming responses for LLM generation from vLLM, but I receive a `Streaming not yet supported` error from modelmesh. I think this is coming from this...

It would also be useful to have a unit test for this, but the tests included here don't exercise the actual bug. Ideally we'd have a test that actually runs...

test

It should be possible to use `https` for `RemotePayloadProcessor` to communicate with consumers of MM `Payloads`.

enhancement
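For illustration, the consumer side of this enhancement is just an HTTP endpoint that accepts forwarded `Payloads` over TLS. A minimal sketch of such an https consumer follows; the port, path handling, and wire format are assumptions (not the actual payload format), and `cert.pem`/`key.pem` are placeholder files:

```
# Hypothetical https payload consumer, assuming the RemotePayloadProcessor
# POSTs payload data to a configured URL. Illustrative only.
import http.server
import ssl

class PayloadConsumer(http.server.BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the POSTed body; the real payload schema is not specified here.
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        print(f"received payload: {len(body)} bytes at {self.path}")
        self.send_response(200)
        self.end_headers()

server = http.server.HTTPServer(("0.0.0.0", 8443), PayloadConsumer)
# Wrap the listening socket in TLS; cert.pem/key.pem are placeholder files.
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
ctx.load_cert_chain(certfile="cert.pem", keyfile="key.pem")
server.socket = ctx.wrap_socket(server.socket, server_side=True)
server.serve_forever()
```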

Currently a model loads on only one instance and is lazily loaded on other pods when a request comes in. Can we modify internal modelmesh parameters to load models by default on all ServingRuntime...

question

Could you please describe the steps for running modelmesh locally with the runtime adapter, etcd, and a serving runtime? This is needed for local debugging and for clarifying some of the internal logic.

documentation
question
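No step-by-step local guide is cited in this thread. As a rough aid, once etcd, modelmesh, and a runtime adapter have been started locally, a small probe can confirm each piece is reachable before attaching a debugger. A sketch assuming default ports (2379 for etcd's client API, 8033 for modelmesh's gRPC endpoint), which may differ in your setup:

```
# Rough readiness probe for a local modelmesh debugging setup: wait until the
# locally started processes (etcd, modelmesh) accept TCP connections.
import socket
import time

ENDPOINTS = {
    "etcd": ("localhost", 2379),       # default etcd client port
    "modelmesh": ("localhost", 8033),  # assumed default ModelMesh gRPC port
}

def wait_for(name, host, port, attempts=30):
    for _ in range(attempts):
        try:
            with socket.create_connection((host, port), timeout=1):
                print(f"{name} is accepting connections on {host}:{port}")
                return True
        except OSError:
            time.sleep(1)
    print(f"{name} did not come up on {host}:{port}")
    return False

for name, (host, port) in ENDPOINTS.items():
    wait_for(name, host, port)
```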

ServingRuntime: `torchserve`

### Current behavior

* sent requests with client-side timeouts (load-testing our modelmesh)
* after some time, clients start to receive:

```
ERROR: Code: Internal Message: org.pytorch.serve.grpc.inference.InferenceAPIsService/Predictions: INTERNAL: Model...
```

bug

I am new to modelmesh but very interested in this project. Could we deploy modelmesh using Docker only, without a k8s cluster? Thanks

documentation
question