devika
Inference shouldn't have a timeout
Especially with larger local models on Ollama, inference can take quite some time, particularly while the model is first being loaded. Currently devika throws a timeout, basically rendering it useless for such a setup.
Seems like the timeout is hardcoded in src/llm/llm.py:
if int(elapsed_time) == 30:
    # warn at ~30 seconds
    emit_agent("inference", {"type": "warning", "message": "Inference is taking longer than expected"})
if elapsed_time > 60:
    # hard cutoff after 60 seconds
    raise concurrent.futures.TimeoutError
time.sleep(1)

response = future.result(timeout=60).strip()  # plus another 60s limit here
As a quick hack to make it work you can increase those values, or you can just comment out the first four lines (the two if checks) and drop the timeout on the last one, as shown below.
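That is, roughly this (just the edited snippet from above, nothing else in the file needs to change):

# if int(elapsed_time) == 30:
#     emit_agent("inference", {"type": "warning", "message": "Inference is taking longer than expected"})
# if elapsed_time > 60:
#     raise concurrent.futures.TimeoutError
time.sleep(1)

response = future.result().strip()  # no timeout argument: blocks until the model answers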
Anyway, I agree: it shouldn't have a timeout, or at least it should be easily configurable so it can be increased or disabled when needed.
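For instance, something along these lines could work. Note this is just a sketch: the DEVIKA_INFERENCE_TIMEOUT environment variable is made up here, and the surrounding while loop / start_time are assumed from the rest of llm.py:

import os

# Hypothetical setting: 0 disables the cutoff entirely, otherwise it's the limit in seconds
INFERENCE_TIMEOUT = float(os.environ.get("DEVIKA_INFERENCE_TIMEOUT", "60"))

while not future.done():
    elapsed_time = time.time() - start_time
    if int(elapsed_time) == 30:
        emit_agent("inference", {"type": "warning", "message": "Inference is taking longer than expected"})
    if INFERENCE_TIMEOUT > 0 and elapsed_time > INFERENCE_TIMEOUT:
        raise concurrent.futures.TimeoutError
    time.sleep(1)

# timeout=None means "wait forever" when the cutoff is disabled
response = future.result(timeout=INFERENCE_TIMEOUT or None).strip()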