"Already running a prediction" When hitting multiple requests
This is not exactly an issue, so let me describe the situation:
I am running a Cog container locally and want to process multiple requests at once. However, when I sent 100 requests at once, it returned output for 20 of them and "Already running a prediction" for the rest, even though my system utilisation was very low. How can I process these in parallel?
I am using an image similarity model with ViT, and it uses the GPU.
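For reference, I'm firing the requests against Cog's /predictions endpoint roughly like this (the input field name and image URLs are specific to my model, adjust for yours):

```python
from concurrent.futures import ThreadPoolExecutor

import requests

URL = "http://localhost:5000/predictions"  # cog's local HTTP endpoint

def predict(i):
    # "image" is my model's input field; yours may differ.
    resp = requests.post(URL, json={"input": {"image": f"https://example.com/{i}.jpg"}})
    return resp.status_code

with ThreadPoolExecutor(max_workers=100) as pool:
    statuses = list(pool.map(predict, range(100)))

# Most of the 100 requests come back rejected rather than queued.
print(sum(1 for s in statuses if s != 200), "rejected")
```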
Ditto, I'm trying to figure this out as well, using the latest beta version. I've tried setting threads:
docker run -d -p 5000:5000 <container> python -m cog.server.http --threads=8
But I still keep hitting the "Already running a prediction" error.
I thought it was only because of the GPU that we can make one prediction at a time, but it's the same for CPU-only models as well.
There's a new version https://github.com/replicate/cog/releases/tag/v0.9.0-beta9 that has support for async predictor functions. That might help?
cc @technillogue
We hope to roll out concurrent predictions in the coming months, but 0.9.0b9 only allows async def predict, not concurrent predictions.
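To be concrete, an async def predict in 0.9.0b9 looks something like this (a minimal sketch with a stand-in for the model; a real predictor would load weights in setup):

```python
import asyncio

from cog import BasePredictor, Input


class Predictor(BasePredictor):
    def setup(self):
        # A real predictor would load model weights here; a fixed delay
        # keeps this sketch self-contained.
        self.delay = 1.0

    async def predict(self, text: str = Input(description="Text to process")) -> str:
        # async def lets predict await I/O without blocking the server's
        # event loop, but the server still runs one prediction at a time.
        await asyncio.sleep(self.delay)  # stand-in for real model work
        return text.upper()
```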
The threads argument controls how many HTTP requests can be served concurrently, but right now, unless I'm mistaken, Predictor.predict can still only run one prediction at a time.
Even if that weren't the case, it's very hard to use torch and get true GPU concurrency without ultimately implementing something like batching or microbatching. For now, if you can implement batching yourself, that's best; the sketch below shows the general shape.
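Something like this (a generic microbatching pattern, not cog's API; the batch size, timeout, and doubling "model" are placeholders):

```python
import queue
import threading

BATCH_SIZE = 8          # max requests per forward pass (placeholder)
BATCH_TIMEOUT_S = 0.05  # how long to wait for stragglers (placeholder)

def model_forward(inputs):
    # Stand-in for one batched GPU forward pass,
    # e.g. model(torch.stack(inputs)).
    return [x * 2 for x in inputs]

def batch_worker(requests):
    while True:
        batch = [requests.get()]  # block until the first request arrives
        try:
            # Collect more requests until the batch is full or the timeout expires.
            while len(batch) < BATCH_SIZE:
                batch.append(requests.get(timeout=BATCH_TIMEOUT_S))
        except queue.Empty:
            pass
        inputs = [item for item, _ in batch]
        for (_, reply), output in zip(batch, model_forward(inputs)):
            reply.put(output)  # hand each caller its own result

requests = queue.Queue()
threading.Thread(target=batch_worker, args=(requests,), daemon=True).start()

# Each caller submits (input, reply_queue) and blocks on its own reply queue.
reply = queue.Queue()
requests.put((21, reply))
print(reply.get())  # -> 42
```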
Okay, thanks for the update. In my case, what I've done is set up ~5 Docker containers on separate ports and then used nginx to load-balance between them. This allows me to have up to 5 ongoing predictions at any given time.
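The relevant nginx fragment looks roughly like this (ports and the listen port are whatever your containers map to):

```nginx
# Illustrative upstream: one entry per cog container.
upstream cog_backends {
    least_conn;  # route to the least-busy container, since each runs one prediction at a time
    server 127.0.0.1:5000;
    server 127.0.0.1:5001;
    server 127.0.0.1:5002;
    server 127.0.0.1:5003;
    server 127.0.0.1:5004;
}

server {
    listen 8080;
    location / {
        proxy_pass http://cog_backends;
    }
}
```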
@technillogue +1 to concurrent predictions
+1 to concurrent predictions!