QA Mode HTTP Timeout
Self Checks
- [X] This is only for bug reports; if you would like to ask a question, please head to Discussions.
- [X] I have searched for existing issues, including closed ones.
- [X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
- [X] Please be sure to submit the issue in English, otherwise it will be closed. Thank you! :)
- [X] Please do not modify this template :) and fill in all the required fields.
Dify version
0.6.14
Cloud or Self Hosted
Self Hosted (Docker)
Steps to reproduce
I use Xinference to provide the model. When using the knowledge base, everything works fine as long as I don't use QA mode, but with QA mode enabled even small files cause an HTTP timeout (my graphics card is an RTX 4060).
I have tried setting GUNICORN_TIMEOUT=3600, but the HTTP timeout still occurs after 3 to 4 minutes.
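For reference, this is roughly how I set it, assuming the standard Dify Docker Compose layout (exact file paths may differ):

```yaml
# docker/.env (assumed location): GUNICORN_TIMEOUT=3600
# docker-compose.yaml forwards the value to the api and worker containers
# via the shared environment anchor:
x-shared-env: &shared-api-worker-env
  GUNICORN_TIMEOUT: ${GUNICORN_TIMEOUT}
```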
The following is the error log:
✔️ Expected Behavior
No response
❌ Actual Behavior
2024-07-18 15:49:37 api-1 | 2024-07-18 07:49:37,557.557 ERROR [Dummy-1] [app.py:838] - Exception on /console/api/datasets/4ee1992d-50d9-4129-ad86-3536305a10c1/batch/20240718074530132385/indexing-estimate [GET]
2024-07-18 15:49:37 api-1 | Traceback (most recent call last):
2024-07-18 15:49:37 api-1 | File "/app/api/.venv/lib/python3.10/site-packages/httpx/_transports/default.py", line 69, in map_httpcore_exceptions
2024-07-18 15:49:37 api-1 | yield
2024-07-18 15:49:37 api-1 | File "/app/api/.venv/lib/python3.10/site-packages/httpx/_transports/default.py", line 233, in handle_request
2024-07-18 15:49:37 api-1 | resp = self._pool.handle_request(req)
2024-07-18 15:49:37 api-1 | File "/app/api/.venv/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py", line 216, in handle_request
2024-07-18 15:49:37 api-1 | raise exc from None
2024-07-18 15:49:37 api-1 | File "/app/api/.venv/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py", line 196, in handle_request
2024-07-18 15:49:37 api-1 | response = connection.handle_request(
2024-07-18 15:49:37 api-1 | File "/app/api/.venv/lib/python3.10/site-packages/httpcore/_sync/connection.py", line 101, in handle_request
2024-07-18 15:49:37 api-1 | return self._connection.handle_request(request)
2024-07-18 15:49:37 api-1 | File "/app/api/.venv/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 143, in handle_request
2024-07-18 15:49:37 api-1 | raise exc
2024-07-18 15:49:37 api-1 | File "/app/api/.venv/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 113, in handle_request
2024-07-18 15:49:37 api-1 | ) = self._receive_response_headers(**kwargs)
2024-07-18 15:49:37 api-1 | File "/app/api/.venv/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 186, in _receive_response_headers
2024-07-18 15:49:37 api-1 | event = self._receive_event(timeout=timeout)
2024-07-18 15:49:37 api-1 | File "/app/api/.venv/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 224, in _receive_event
2024-07-18 15:49:37 api-1 | data = self._network_stream.read(
2024-07-18 15:49:37 api-1 | File "/app/api/.venv/lib/python3.10/site-packages/httpcore/_backends/sync.py", line 124, in read
2024-07-18 15:49:37 api-1 | with map_exceptions(exc_map):
2024-07-18 15:49:37 api-1 | File "/usr/local/lib/python3.10/contextlib.py", line 153, in __exit__
2024-07-18 15:49:37 api-1 | self.gen.throw(typ, value, traceback)
2024-07-18 15:49:37 api-1 | File "/app/api/.venv/lib/python3.10/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
2024-07-18 15:49:37 api-1 | raise to_exc(exc) from exc
2024-07-18 15:49:37 api-1 | httpcore.ReadTimeout: timed out
2024-07-18 15:49:37 api-1 |
2024-07-18 15:49:37 api-1 | The above exception was the direct cause of the following exception:
2024-07-18 15:49:37 api-1 |
2024-07-18 15:49:37 api-1 | Traceback (most recent call last):
2024-07-18 15:49:37 api-1 | File "/app/api/.venv/lib/python3.10/site-packages/openai/_base_client.py", line 952, in _request
2024-07-18 15:49:37 api-1 | response = self._client.send(
2024-07-18 15:49:37 api-1 | File "/app/api/.venv/lib/python3.10/site-packages/httpx/_client.py", line 914, in send
2024-07-18 15:49:37 api-1 | response = self._send_handling_auth(
2024-07-18 15:49:37 api-1 | File "/app/api/.venv/lib/python3.10/site-packages/httpx/_client.py", line 942, in _send_handling_auth
2024-07-18 15:49:37 api-1 | response = self._send_handling_redirects(
2024-07-18 15:49:37 api-1 | File "/app/api/.venv/lib/python3.10/site-packages/httpx/_client.py", line 979, in _send_handling_redirects
2024-07-18 15:49:37 api-1 | response = self._send_single_request(request)
2024-07-18 15:49:37 api-1 | File "/app/api/.venv/lib/python3.10/site-packages/httpx/_client.py", line 1015, in _send_single_request
2024-07-18 15:49:37 api-1 | response = transport.handle_request(request)
2024-07-18 15:49:37 api-1 | File "/app/api/.venv/lib/python3.10/site-packages/httpx/_transports/default.py", line 232, in handle_request
2024-07-18 15:49:37 api-1 | with map_httpcore_exceptions():
2024-07-18 15:49:37 api-1 | File "/usr/local/lib/python3.10/contextlib.py", line 153, in __exit__
2024-07-18 15:49:37 api-1 | self.gen.throw(typ, value, traceback)
2024-07-18 15:49:37 api-1 | File "/app/api/.venv/lib/python3.10/site-packages/httpx/_transports/default.py", line 86, in map_httpcore_exceptions
2024-07-18 15:49:37 api-1 | raise mapped_exc(message) from exc
2024-07-18 15:49:37 api-1 | httpx.ReadTimeout: timed out
2024-07-18 15:49:37 api-1 |
2024-07-18 15:49:37 api-1 | The above exception was the direct cause of the following exception:
2024-07-18 15:49:37 api-1 |
2024-07-18 15:49:37 api-1 | Traceback (most recent call last):
2024-07-18 15:49:37 api-1 | File "/app/api/core/model_runtime/model_providers/__base/large_language_model.py", line 102, in invoke
2024-07-18 15:49:37 nginx-1 | 172.18.0.1 - - [18/Jul/2024:07:49:37 +0000] "GET /console/api/datasets/4ee1992d-50d9-4129-ad86-3536305a10c1/batch/20240718074530132385/indexing-estimate HTTP/1.1" 500 115 "http://localhost/datasets/4ee1992d-50d9-4129-ad86-3536305a10c1/documents/create" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36" "-"
2024-07-18 15:49:37 api-1 | result = self._invoke(model, credentials, prompt_messages, model_parameters, tools, stop, stream, user)
2024-07-18 15:49:37 api-1 | File "/app/api/core/model_runtime/model_providers/xinference/llm/llm.py", line 83, in _invoke
2024-07-18 15:49:37 api-1 | return self._generate(
2024-07-18 15:49:37 api-1 | File "/app/api/core/model_runtime/model_providers/xinference/llm/llm.py", line 489, in _generate
2024-07-18 15:49:37 api-1 | resp = client.chat.completions.create(
2024-07-18 15:49:37 api-1 | File "/app/api/.venv/lib/python3.10/site-packages/openai/_utils/_utils.py", line 277, in wrapper
2024-07-18 15:49:37 api-1 | return func(*args, **kwargs)
2024-07-18 15:49:37 api-1 | File "/app/api/.venv/lib/python3.10/site-packages/openai/resources/chat/completions.py", line 590, in create
2024-07-18 15:49:37 api-1 | return self._post(
2024-07-18 15:49:37 api-1 | File "/app/api/.venv/lib/python3.10/site-packages/openai/_base_client.py", line 1240, in post
2024-07-18 15:49:37 api-1 | return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
2024-07-18 15:49:37 api-1 | File "/app/api/.venv/lib/python3.10/site-packages/openai/_base_client.py", line 921, in request
2024-07-18 15:49:37 api-1 | return self._request(
2024-07-18 15:49:37 api-1 | File "/app/api/.venv/lib/python3.10/site-packages/openai/_base_client.py", line 961, in _request
2024-07-18 15:49:37 api-1 | return self._retry_request(
2024-07-18 15:49:37 api-1 | File "/app/api/.venv/lib/python3.10/site-packages/openai/_base_client.py", line 1053, in _retry_request
2024-07-18 15:49:37 api-1 | return self._request(
2024-07-18 15:49:37 api-1 | File "/app/api/.venv/lib/python3.10/site-packages/openai/_base_client.py", line 961, in _request
2024-07-18 15:49:37 api-1 | return self._retry_request(
2024-07-18 15:49:37 api-1 | File "/app/api/.venv/lib/python3.10/site-packages/openai/_base_client.py", line 1053, in _retry_request
2024-07-18 15:49:37 api-1 | return self._request(
2024-07-18 15:49:37 api-1 | File "/app/api/.venv/lib/python3.10/site-packages/openai/_base_client.py", line 961, in _request
2024-07-18 15:49:37 api-1 | return self._retry_request(
2024-07-18 15:49:37 api-1 | File "/app/api/.venv/lib/python3.10/site-packages/openai/_base_client.py", line 1053, in _retry_request
2024-07-18 15:49:37 api-1 | return self._request(
2024-07-18 15:49:37 api-1 | File "/app/api/.venv/lib/python3.10/site-packages/openai/_base_client.py", line 971, in _request
2024-07-18 15:49:37 api-1 | raise APITimeoutError(request=request) from err
2024-07-18 15:49:37 api-1 | openai.APITimeoutError: Request timed out.
2024-07-18 15:49:37 api-1 |
2024-07-18 15:49:37 api-1 | During handling of the above exception, another exception occurred:
2024-07-18 15:49:37 api-1 |
2024-07-18 15:49:37 api-1 | Traceback (most recent call last):
2024-07-18 15:49:37 api-1 | File "/app/api/controllers/console/datasets/datasets_document.py", line 495, in get
2024-07-18 15:49:37 api-1 | response = indexing_runner.indexing_estimate(current_user.current_tenant_id, extract_settings,
2024-07-18 15:49:37 api-1 | File "/app/api/core/indexing_runner.py", line 304, in indexing_estimate
2024-07-18 15:49:37 api-1 | response = LLMGenerator.generate_qa_document(current_user.current_tenant_id, preview_texts[0],
2024-07-18 15:49:37 api-1 | File "/app/api/core/llm_generator/llm_generator.py", line 185, in generate_qa_document
2024-07-18 15:49:37 api-1 | response = model_instance.invoke_llm(
2024-07-18 15:49:37 api-1 | File "/app/api/core/model_manager.py", line 123, in invoke_llm
2024-07-18 15:49:37 api-1 | return self._round_robin_invoke(
2024-07-18 15:49:37 api-1 | File "/app/api/core/model_manager.py", line 302, in _round_robin_invoke
2024-07-18 15:49:37 api-1 | return function(*args, **kwargs)
2024-07-18 15:49:37 api-1 | File "/app/api/core/model_runtime/model_providers/__base/large_language_model.py", line 117, in invoke
2024-07-18 15:49:37 api-1 | raise self._transform_invoke_error(e)
2024-07-18 15:49:37 api-1 | core.model_runtime.errors.invoke.InvokeConnectionError: [xinference] Connection Error, Request timed out.
2024-07-18 15:49:37 api-1 |
2024-07-18 15:49:37 api-1 | During handling of the above exception, another exception occurred:
2024-07-18 15:49:37 api-1 |
2024-07-18 15:49:37 api-1 | Traceback (most recent call last):
2024-07-18 15:49:37 api-1 | File "/app/api/.venv/lib/python3.10/site-packages/flask/app.py", line 880, in full_dispatch_request
2024-07-18 15:49:37 api-1 | rv = self.dispatch_request()
2024-07-18 15:49:37 api-1 | File "/app/api/.venv/lib/python3.10/site-packages/flask/app.py", line 865, in dispatch_request
2024-07-18 15:49:37 api-1 | return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args) # type: ignore[no-any-return]
2024-07-18 15:49:37 api-1 | File "/app/api/.venv/lib/python3.10/site-packages/flask_restful/__init__.py", line 489, in wrapper
2024-07-18 15:49:37 api-1 | resp = resource(*args, **kwargs)
2024-07-18 15:49:37 api-1 | File "/app/api/.venv/lib/python3.10/site-packages/flask/views.py", line 110, in view
2024-07-18 15:49:37 api-1 | return current_app.ensure_sync(self.dispatch_request)(**kwargs) # type: ignore[no-any-return]
2024-07-18 15:49:37 api-1 | File "/app/api/.venv/lib/python3.10/site-packages/flask_restful/__init__.py", line 604, in dispatch_request
2024-07-18 15:49:37 api-1 | resp = meth(*args, **kwargs)
2024-07-18 15:49:37 api-1 | File "/app/api/controllers/console/setup.py", line 74, in decorated
2024-07-18 15:49:37 api-1 | return view(*args, **kwargs)
2024-07-18 15:49:37 api-1 | File "/app/api/libs/login.py", line 91, in decorated_view
2024-07-18 15:49:37 api-1 | return current_app.ensure_sync(func)(*args, **kwargs)
2024-07-18 15:49:37 api-1 | File "/app/api/controllers/console/wraps.py", line 21, in decorated
2024-07-18 15:49:37 api-1 | return view(*args, **kwargs)
2024-07-18 15:49:37 api-1 | File "/app/api/controllers/console/datasets/datasets_document.py", line 505, in get
2024-07-18 15:49:37 api-1 | raise IndexingEstimateError(str(e))
2024-07-18 15:49:37 api-1 | controllers.console.datasets.error.IndexingEstimateError: 500 Internal Server Error: [xinference] Connection Error, Request timed out.
2024-07-18 15:49:38 worker-1 | [2024-07-18 07:49:38,114: ERROR/MainProcess] [xinference] Connection Error, Request timed out.
2024-07-18 15:49:38 worker-1 | Traceback (most recent call last):
2024-07-18 15:49:38 worker-1 | File "/app/api/.venv/lib/python3.10/site-packages/httpx/_transports/default.py", line 69, in map_httpcore_exceptions
2024-07-18 15:49:38 worker-1 | yield
2024-07-18 15:49:38 worker-1 | File "/app/api/.venv/lib/python3.10/site-packages/httpx/_transports/default.py", line 233, in handle_request
2024-07-18 15:49:38 worker-1 | resp = self._pool.handle_request(req)
2024-07-18 15:49:38 worker-1 | File "/app/api/.venv/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py", line 216, in handle_request
2024-07-18 15:49:38 worker-1 | raise exc from None
2024-07-18 15:49:38 worker-1 | File "/app/api/.venv/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py", line 196, in handle_request
2024-07-18 15:49:38 worker-1 | response = connection.handle_request(
2024-07-18 15:49:38 worker-1 | File "/app/api/.venv/lib/python3.10/site-packages/httpcore/_sync/connection.py", line 101, in handle_request
2024-07-18 15:49:38 worker-1 | return self._connection.handle_request(request)
2024-07-18 15:49:38 worker-1 | File "/app/api/.venv/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 143, in handle_request
2024-07-18 15:49:38 worker-1 | raise exc
2024-07-18 15:49:38 worker-1 | File "/app/api/.venv/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 113, in handle_request
2024-07-18 15:49:38 worker-1 | ) = self._receive_response_headers(**kwargs)
2024-07-18 15:49:38 worker-1 | File "/app/api/.venv/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 186, in _receive_response_headers
2024-07-18 15:49:38 worker-1 | event = self._receive_event(timeout=timeout)
2024-07-18 15:49:38 worker-1 | File "/app/api/.venv/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 224, in _receive_event
2024-07-18 15:49:38 worker-1 | data = self._network_stream.read(
2024-07-18 15:49:38 worker-1 | File "/app/api/.venv/lib/python3.10/site-packages/httpcore/_backends/sync.py", line 124, in read
2024-07-18 15:49:38 worker-1 | with map_exceptions(exc_map):
2024-07-18 15:49:38 worker-1 | File "/usr/local/lib/python3.10/contextlib.py", line 153, in __exit__
2024-07-18 15:49:38 worker-1 | self.gen.throw(typ, value, traceback)
2024-07-18 15:49:38 worker-1 | File "/app/api/.venv/lib/python3.10/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
2024-07-18 15:49:38 worker-1 | raise to_exc(exc) from exc
2024-07-18 15:49:38 worker-1 | httpcore.ReadTimeout: timed out
2024-07-18 15:49:38 worker-1 |
2024-07-18 15:49:38 worker-1 | The above exception was the direct cause of the following exception:
2024-07-18 15:49:38 worker-1 |
2024-07-18 15:49:38 worker-1 | Traceback (most recent call last):
2024-07-18 15:49:38 worker-1 | File "/app/api/.venv/lib/python3.10/site-packages/openai/_base_client.py", line 952, in _request
2024-07-18 15:49:38 worker-1 | response = self._client.send(
2024-07-18 15:49:38 worker-1 | File "/app/api/.venv/lib/python3.10/site-packages/httpx/_client.py", line 914, in send
2024-07-18 15:49:38 worker-1 | response = self._send_handling_auth(
2024-07-18 15:49:38 worker-1 | File "/app/api/.venv/lib/python3.10/site-packages/httpx/_client.py", line 942, in _send_handling_auth
2024-07-18 15:49:38 worker-1 | response = self._send_handling_redirects(
2024-07-18 15:49:38 worker-1 | File "/app/api/.venv/lib/python3.10/site-packages/httpx/_client.py", line 979, in _send_handling_redirects
2024-07-18 15:49:38 worker-1 | response = self._send_single_request(request)
2024-07-18 15:49:38 worker-1 | File "/app/api/.venv/lib/python3.10/site-packages/httpx/_client.py", line 1015, in _send_single_request
2024-07-18 15:49:38 worker-1 | response = transport.handle_request(request)
2024-07-18 15:49:38 worker-1 | File "/app/api/.venv/lib/python3.10/site-packages/httpx/_transports/default.py", line 232, in handle_request
2024-07-18 15:49:38 worker-1 | with map_httpcore_exceptions():
2024-07-18 15:49:38 worker-1 | File "/usr/local/lib/python3.10/contextlib.py", line 153, in __exit__
2024-07-18 15:49:38 worker-1 | self.gen.throw(typ, value, traceback)
2024-07-18 15:49:38 worker-1 | File "/app/api/.venv/lib/python3.10/site-packages/httpx/_transports/default.py", line 86, in map_httpcore_exceptions
2024-07-18 15:49:38 worker-1 | raise mapped_exc(message) from exc
2024-07-18 15:49:38 worker-1 | httpx.ReadTimeout: timed out
2024-07-18 15:49:38 worker-1 |
2024-07-18 15:49:38 worker-1 | The above exception was the direct cause of the following exception:
2024-07-18 15:49:38 worker-1 |
2024-07-18 15:49:38 worker-1 | Traceback (most recent call last):
2024-07-18 15:49:38 worker-1 | File "/app/api/core/model_runtime/model_providers/__base/large_language_model.py", line 102, in invoke
2024-07-18 15:49:38 worker-1 | result = self._invoke(model, credentials, prompt_messages, model_parameters, tools, stop, stream, user)
2024-07-18 15:49:38 worker-1 | File "/app/api/core/model_runtime/model_providers/xinference/llm/llm.py", line 83, in _invoke
2024-07-18 15:49:38 worker-1 | return self._generate(
2024-07-18 15:49:38 worker-1 | File "/app/api/core/model_runtime/model_providers/xinference/llm/llm.py", line 489, in _generate
2024-07-18 15:49:38 worker-1 | resp = client.chat.completions.create(
2024-07-18 15:49:38 worker-1 | File "/app/api/.venv/lib/python3.10/site-packages/openai/_utils/_utils.py", line 277, in wrapper
2024-07-18 15:49:38 worker-1 | return func(*args, **kwargs)
2024-07-18 15:49:38 worker-1 | File "/app/api/.venv/lib/python3.10/site-packages/openai/resources/chat/completions.py", line 590, in create
2024-07-18 15:49:38 worker-1 | return self._post(
2024-07-18 15:49:38 worker-1 | File "/app/api/.venv/lib/python3.10/site-packages/openai/_base_client.py", line 1240, in post
2024-07-18 15:49:38 worker-1 | return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
2024-07-18 15:49:38 worker-1 | File "/app/api/.venv/lib/python3.10/site-packages/openai/_base_client.py", line 921, in request
2024-07-18 15:49:38 worker-1 | return self._request(
2024-07-18 15:49:38 worker-1 | File "/app/api/.venv/lib/python3.10/site-packages/openai/_base_client.py", line 961, in _request
2024-07-18 15:49:38 worker-1 | return self._retry_request(
2024-07-18 15:49:38 worker-1 | File "/app/api/.venv/lib/python3.10/site-packages/openai/_base_client.py", line 1053, in _retry_request
2024-07-18 15:49:38 worker-1 | return self._request(
2024-07-18 15:49:38 worker-1 | File "/app/api/.venv/lib/python3.10/site-packages/openai/_base_client.py", line 961, in _request
2024-07-18 15:49:38 worker-1 | return self._retry_request(
2024-07-18 15:49:38 worker-1 | File "/app/api/.venv/lib/python3.10/site-packages/openai/_base_client.py", line 1053, in _retry_request
2024-07-18 15:49:38 worker-1 | return self._request(
2024-07-18 15:49:38 worker-1 | File "/app/api/.venv/lib/python3.10/site-packages/openai/_base_client.py", line 961, in _request
2024-07-18 15:49:38 worker-1 | return self._retry_request(
2024-07-18 15:49:38 worker-1 | File "/app/api/.venv/lib/python3.10/site-packages/openai/_base_client.py", line 1053, in _retry_request
2024-07-18 15:49:38 worker-1 | return self._request(
2024-07-18 15:49:38 worker-1 | File "/app/api/.venv/lib/python3.10/site-packages/openai/_base_client.py", line 971, in _request
2024-07-18 15:49:38 worker-1 | raise APITimeoutError(request=request) from err
2024-07-18 15:49:38 worker-1 | openai.APITimeoutError: Request timed out.
2024-07-18 15:49:38 worker-1 |
2024-07-18 15:49:38 worker-1 | During handling of the above exception, another exception occurred:
2024-07-18 15:49:38 worker-1 |
2024-07-18 15:49:38 worker-1 | Traceback (most recent call last):
2024-07-18 15:49:38 worker-1 | File "/app/api/core/rag/index_processor/processor/qa_index_processor.py", line 133, in _format_qa_document
2024-07-18 15:49:38 worker-1 | response = LLMGenerator.generate_qa_document(tenant_id, document_node.page_content, document_language)
2024-07-18 15:49:38 worker-1 | File "/app/api/core/llm_generator/llm_generator.py", line 185, in generate_qa_document
2024-07-18 15:49:38 worker-1 | response = model_instance.invoke_llm(
2024-07-18 15:49:38 worker-1 | File "/app/api/core/model_manager.py", line 123, in invoke_llm
2024-07-18 15:49:38 worker-1 | return self._round_robin_invoke(
2024-07-18 15:49:38 worker-1 | File "/app/api/core/model_manager.py", line 302, in _round_robin_invoke
2024-07-18 15:49:38 worker-1 | return function(*args, **kwargs)
2024-07-18 15:49:38 worker-1 | File "/app/api/core/model_runtime/model_providers/__base/large_language_model.py", line 117, in invoke
2024-07-18 15:49:38 worker-1 | raise self._transform_invoke_error(e)
2024-07-18 15:49:38 worker-1 | core.model_runtime.errors.invoke.InvokeConnectionError: [xinference] Connection Error, Request timed out.
2024-07-18 15:49:38 worker-1 | [2024-07-18 07:49:38,185: INFO/MainProcess] Processed dataset: 4ee1992d-50d9-4129-ad86-3536305a10c1 latency: 247.22300854599962
Here are the model settings:
Hello, @Surrin1999! I'm here to help you with any bugs, questions, and contributions. Let me know how I can assist you.
To resolve the HTTP timeout issues in Dify version 0.6.14 when using QA mode with Xinference, despite setting GUNICORN_TIMEOUT=3600, follow these steps:
- **Verify `GUNICORN_TIMEOUT` configuration**: Ensure that `GUNICORN_TIMEOUT` is correctly set to 3600 in your `.env` file and referenced in your `docker-compose.yaml` file:

  ```
  GUNICORN_TIMEOUT=3600
  ```

  ```yaml
  x-shared-env: &shared-api-worker-env
    GUNICORN_TIMEOUT: ${GUNICORN_TIMEOUT}
  ```

- **Adjust the HTTP request timeout in `xinference_helper.py`**: The timeout for HTTP requests in `xinference_helper.py` is set to 10 seconds. Increase this timeout to better suit your needs:

  ```python
  response = session.get(url, timeout=60)  # Increase timeout to 60 seconds
  ```

  This change should be made in the `_get_xinference_extra_parameter` method of the `xinference_helper.py` file [1].

By ensuring both `GUNICORN_TIMEOUT` and the HTTP request timeout in `xinference_helper.py` are appropriately configured, you should be able to resolve the timeout issues.
To continue talking to Dosu, mention @dosu.
I have tried everything you said, and it doesn't work. @dosubot @dosu
If you've already tried setting GUNICORN_TIMEOUT to 3600 and increasing the HTTP request timeout in xinference_helper.py to 60 seconds without success, consider the following additional steps:
- **Increase timeout in `httpx` configuration**: Ensure that the timeout settings are applied globally in your `httpx` client configuration. You can set a default timeout for all requests:

  ```python
  import httpx

  client = httpx.Client(timeout=httpx.Timeout(60.0))
  ```

- **Check for other timeout settings**: Verify whether there are other timeout settings in your application that might be causing the issue. For example, check whether there are any timeout settings in your web server configuration (e.g., Nginx) or other parts of your application.

- **Review Docker resource limits**: Ensure that your Docker environment has sufficient resources allocated. You can specify resource limits in your `docker-compose.yaml` file:

  ```yaml
  services:
    api:
      deploy:
        resources:
          limits:
            cpus: '2'
            memory: '8g'
  ```

- **Increase timeout in `openai` configuration**: If you are using the `openai` library, ensure that the timeout is set appropriately:

  ```python
  import openai

  openai.api_request_timeout = 60
  ```

- **Network configuration**: Ensure that there are no network issues or firewalls that might be causing the timeouts. Check the connectivity between your Docker containers and external services.

- **Logs and monitoring**: Enable detailed logging and monitoring to identify where the timeout is occurring. This can help pinpoint the exact cause of the issue.
By applying these additional steps, you should be able to further diagnose and resolve the HTTP timeout issues in your Dify setup.
To continue talking to Dosu, mention @dosu.
I don't think these are the right solutions. I have used the same model settings on our company's servers without any problems; the only difference is that the company's server has an RTX 4090.
The timeout issue persists despite increasing the HTTP request timeout in xinference_helper.py to 60 seconds because the timeout value in the code is still set to 10 seconds. You need to update the timeout value in the session.get call to 60 seconds.
Here is the relevant part of the code with the updated timeout value:
```python
try:
    response = session.get(url, timeout=60)  # Update timeout to 60 seconds
except (MissingSchema, ConnectionError, Timeout) as e:
    raise RuntimeError(f'get xinference model extra parameter failed, url: {url}, error: {e}')
```
Make sure to replace the existing timeout value in the xinference_helper.py file with the updated value as shown above [1].
To continue talking to Dosu, mention @dosu.
Can someone help me?
Could you try the latest version, v0.6.15?
How do I use the Q&A mode? I cannot find the Q&A mode on the page where I create a new knowledge base.
Perhaps you also need to set your nginx timeout value, and an RTX 4060 may not be enough to generate QA with an LLM; it will likely be very slow...
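For reference, a minimal sketch of that kind of nginx override, using stock nginx directives (the location path, upstream name, and values are assumptions; the exact proxy config file in the Dify Docker setup may differ):

```nginx
# Raise proxy timeouts for the console API route that triggers the long-running
# indexing-estimate call; proxy_read_timeout is the one that matters for a slow LLM.
location /console/api {
    proxy_pass http://api:5001;   # assumed upstream service name and port
    proxy_send_timeout 3600s;     # time allowed between successive writes to the upstream
    proxy_read_timeout 3600s;     # time allowed between successive reads from the upstream
}
```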