maximum response time
I have a POST function that currently takes up to 20 seconds. I see the request to the endpoint in my dashboard, but the response time seems to be 0. Is there some kind of upper limit here?
The response time measures the time between your server receiving the request and sending the response back; it doesn't capture any network latency. So it can be zero for a quick function, but it sounds like you should be seeing a much larger response time, so something isn't right there. There isn't any upper limit set. Which API framework are you using?
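To illustrate what "time between receiving the request and sending the response" means in practice, here is a minimal sketch of a server-side timer written as a plain ASGI middleware. It is not the dashboard's actual implementation: `TimingMiddleware`, its `record` callback, `slow_app`, and `measure_once` are hypothetical names invented for this example, and it uses only the standard library.

```python
import asyncio
import time


class TimingMiddleware:
    """Hypothetical ASGI middleware that records server-side response time in ms."""

    def __init__(self, app, record):
        self.app = app
        self.record = record  # assumed callback that receives the elapsed milliseconds

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            # Pass non-HTTP traffic (lifespan, websockets) through untimed.
            await self.app(scope, receive, send)
            return
        start = time.perf_counter()
        try:
            await self.app(scope, receive, send)
        finally:
            # Wall-clock time spent inside the server; network latency is not included.
            self.record((time.perf_counter() - start) * 1000)


async def slow_app(scope, receive, send):
    """Stand-in for a slow endpoint such as CPU inference."""
    await asyncio.sleep(0.05)
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"ok"})


def measure_once():
    """Drive one fake POST request through the middleware and return what was recorded."""
    timings, sent = [], []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    app = TimingMiddleware(slow_app, timings.append)
    asyncio.run(app({"type": "http", "method": "POST", "path": "/"}, receive, send))
    return timings, sent
```

Since this follows the plain ASGI interface, it should also be usable in a FastAPI app via `app.add_middleware(TimingMiddleware, record=...)`. Logging these values next to the ones in the dashboard is one way to check whether a reported 0 comes from the measurement or from the reporting.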
It is FastAPI. I am quite sure the function itself takes that long, as it's an AI inference task on CPU (which will take ~2 seconds when switching to a GPU later). I get a 200 OK response back from the server, so the function itself is working fine.
I just tried a few more calls and now it registers a median of ~30k milliseconds. So it does work now. Not sure why it showed 0 at first...
Interesting, so it's measuring it inconsistently. I'll look into that. How are you hosting your API?
My API is hosted in a Docker container on Hugging Face Spaces. Maybe that has something to do with it?