Sowmya Yellapragada

4 comments by Sowmya Yellapragada

I ran into a similar issue when deploying the torch server to Kubernetes. It turned out the pod was being OOM-killed; increasing the memory limits fixed the issue for me.
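As a rough illustration of the fix, here is a minimal sketch using the Kubernetes Python client to raise a deployment's memory requests and limits. The deployment/container name (`mlserver`), the namespace, and the sizes are hypothetical placeholders for your own setup:

```python
from kubernetes import client, config

config.load_kube_config()
apps = client.AppsV1Api()

# Strategic-merge patch raising the serving container's memory budget.
# Containers are matched by name; "mlserver", "default", and the sizes
# below are placeholders -- use whatever your deployment actually defines.
patch = {
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {
                        "name": "mlserver",
                        "resources": {
                            "requests": {"memory": "2Gi"},
                            "limits": {"memory": "4Gi"},
                        },
                    }
                ]
            }
        }
    }
}

apps.patch_namespaced_deployment(name="mlserver", namespace="default", body=patch)
```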

@ramonpzg I am running a load test with [locust](https://locust.io/). It spawns multiple workers to send requests concurrently, and every request carries the same input values.
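For context, the load test looks roughly like the sketch below. The model name (`my-model`), input shape, and payload values are made up for illustration; the endpoint follows the KServe V2 REST protocol that MLServer exposes:

```python
from locust import HttpUser, task, between

# Hypothetical V2 inference payload -- every simulated user sends
# the same input values, matching the test described above.
PAYLOAD = {
    "inputs": [
        {
            "name": "input-0",
            "shape": [1, 4],
            "datatype": "FP32",
            "data": [0.1, 0.2, 0.3, 0.4],
        }
    ]
}

class InferenceUser(HttpUser):
    # Short think time so the spawned workers keep the server saturated.
    wait_time = between(0.01, 0.1)

    @task
    def infer(self):
        # MLServer serves the KServe V2 REST API (port 8080 by default), e.g.:
        #   locust -f locustfile.py --host http://localhost:8080
        self.client.post("/v2/models/my-model/infer", json=PAYLOAD)
```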

> Hey @sowmyay,
>
> For Adaptive Batching to work, the model needs to return shapes compatible with the number of batches. In general, it will assume that the...

@adriangonz From the docs, I expected adaptive batching to work as follows for my use case (see the sketch after this list):

1. MLServer receives requests of dimensions `[1, ...]`.
2. Within the acceptable `MLSERVER_MODEL_MAX_BATCH_TIME`, ...
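To make the shape contract concrete, here is a minimal sketch of a custom MLServer runtime whose `predict` keeps the leading batch dimension intact, which is what adaptive batching needs in order to split the merged response back into the individual responses. It assumes a recent MLServer release (for `NumpyCodec.decode_input`/`encode_output`) and uses a placeholder model; batching itself is enabled via `max_batch_size`/`max_batch_time` in `model-settings.json`, or the `MLSERVER_MODEL_MAX_BATCH_SIZE`/`MLSERVER_MODEL_MAX_BATCH_TIME` environment variables:

```python
import numpy as np

from mlserver import MLModel
from mlserver.codecs import NumpyCodec
from mlserver.types import InferenceRequest, InferenceResponse


class BatchAwareRuntime(MLModel):
    async def load(self) -> bool:
        # Placeholder standing in for whatever model you actually serve.
        self._model = lambda x: x * 2.0
        return True

    async def predict(self, payload: InferenceRequest) -> InferenceResponse:
        # With adaptive batching on, N single [1, ...] requests received
        # within max_batch_time arrive here merged into one [N, ...] input.
        batch = NumpyCodec.decode_input(payload.inputs[0])

        # The output must preserve the leading N dimension so MLServer
        # can un-batch the response back to the original N requests.
        result = np.asarray(self._model(batch))

        return InferenceResponse(
            model_name=self.name,
            outputs=[NumpyCodec.encode_output(name="output-0", payload=result)],
        )
```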