llama-cpp-python
Performance Improvement and Error Handling for API
The server hits gateway timeouts very often, and it's unclear whether any error handling exists for that case. Maybe the timeout could be increased or made configurable.
Also, the example server only uses half the available CPU threads: https://github.com/abetlen/llama-cpp-python/blob/main/examples/high_level_api/fastapi_server.py#L31
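For illustration, here is a minimal sketch of both ideas: using the full CPU count instead of halving it, and wrapping the generation call in an explicit timeout so the client gets a handled error rather than a proxy-level gateway timeout. The `slow_completion` helper is hypothetical (a stand-in for the actual llama completion call), and the exact thread setting in the linked example may differ.

```python
import asyncio
import multiprocessing

# Use all logical cores instead of halving the CPU count as the example does.
n_threads = max(multiprocessing.cpu_count(), 1)

async def generate_with_timeout(prompt: str, timeout_s: float = 120.0) -> str:
    """Wrap a slow completion call so the caller gets a clear, handled
    error instead of an unhandled gateway timeout."""
    async def slow_completion() -> str:
        # Hypothetical stand-in for the real completion call.
        await asyncio.sleep(0.01)
        return f"completion for: {prompt}"
    try:
        return await asyncio.wait_for(slow_completion(), timeout=timeout_s)
    except asyncio.TimeoutError:
        # In the FastAPI server this could be raised as an HTTP 504 instead.
        return "error: generation timed out"

if __name__ == "__main__":
    print(asyncio.run(generate_with_timeout("hello")))
```

With a real model call in place of `slow_completion`, the timeout value would likely need to scale with the requested `max_tokens`, since long generations are exactly the requests that trip the gateway.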