LLM node stays in the Running state for a long time and cannot be terminated
Self Checks
- [X] This is only for bug report, if you would like to ask a question, please head to Discussions.
- [X] I have searched for existing issues, including closed ones.
- [X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
- [X] [FOR CHINESE USERS] Please submit issues in English, or they will be closed. Thank you! :)
- [X] Please do not modify this template :) and fill in all the required fields.
Dify version
0.8.3
Cloud or Self Hosted
Cloud, Self Hosted (Docker)
Steps to reproduce
The LLM node stays in the Running state for a long time and cannot be terminated.
Sometimes the LLM node gets no response and remains in the Running state for a long time, with no exceptions or errors in the backend logs.
Sometimes the LLM node has already returned a response, but its status is still Running.
✔️ Expected Behavior
The LLM node should either finish normally or report an error.
❌ Actual Behavior
The LLM node remains in the Running state for a long time.
The LLM node in Dify version 0.8.3 may remain in the "Running" state for a long time and fail to terminate for several reasons:

- **Context fetching**: the `_fetch_context` method is a generator that yields `RunRetrieverResourceEvent` events. If the context fetching process is slow, or there is an issue with the context data, the node can stay in the Running state.
- **Model invocation**: the `_invoke_llm` method invokes the large language model and handles the result through a generator. If the model invocation takes a long time, or the streaming of results stalls, the node can stay in the Running state.
- **Event handling**: the `_handle_invoke_result` method processes the results of the model invocation. If there is a delay or issue in processing the streamed chunks or the final result, the node can stay in the Running state.
Here is the relevant code that shows these processes:
```python
def _fetch_context(self, node_data: LLMNodeData, variable_pool: VariablePool) -> Generator[RunEvent, None, None]:
    if not node_data.context.enabled:
        return

    if not node_data.context.variable_selector:
        return

    context_value = variable_pool.get_any(node_data.context.variable_selector)
    if context_value:
        if isinstance(context_value, str):
            yield RunRetrieverResourceEvent(retriever_resources=[], context=context_value)
        elif isinstance(context_value, list):
            context_str = ""
            original_retriever_resource = []
            for item in context_value:
                if isinstance(item, str):
                    context_str += item + "\n"
                else:
                    if "content" not in item:
                        raise ValueError(f"Invalid context structure: {item}")

                    context_str += item["content"] + "\n"

                    retriever_resource = self._convert_to_original_retriever_resource(item)
                    if retriever_resource:
                        original_retriever_resource.append(retriever_resource)

            yield RunRetrieverResourceEvent(
                retriever_resources=original_retriever_resource, context=context_str.strip()
            )
```
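Reading the validation above, each context item is apparently either a plain string or a mapping carrying at least a `content` key. Here is a minimal sketch of a `context_value` that this code path would accept; this shape is inferred from the validation, not an official schema:

```python
# Hypothetical context_value shapes inferred from _fetch_context above;
# only "content" is required by this code path.
context_value = [
    "a plain text chunk",  # str items are appended to the context directly
    {
        # "content" is required: dict items missing it raise ValueError
        "content": "a retrieved passage",
        # any other keys would be read by _convert_to_original_retriever_resource
    },
]
```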
```python
def _invoke_llm(
    self,
    node_data_model: ModelConfig,
    model_instance: ModelInstance,
    prompt_messages: list[PromptMessage],
    stop: Optional[list[str]] = None,
) -> Generator[RunEvent | ModelInvokeCompleted, None, None]:
    db.session.close()

    invoke_result = model_instance.invoke_llm(
        prompt_messages=prompt_messages,
        model_parameters=node_data_model.completion_params,
        stop=stop,
        stream=True,
        user=self.user_id,
    )

    generator = self._handle_invoke_result(invoke_result=invoke_result)

    usage = LLMUsage.empty_usage()
    for event in generator:
        yield event
        if isinstance(event, ModelInvokeCompleted):
            usage = event.usage

    self.deduct_llm_quota(tenant_id=self.tenant_id, model_instance=model_instance, usage=usage)
```
```python
def _handle_invoke_result(
    self, invoke_result: LLMResult | Generator
) -> Generator[RunEvent | ModelInvokeCompleted, None, None]:
    if isinstance(invoke_result, LLMResult):
        return

    model = None
    prompt_messages: list[PromptMessage] = []
    full_text = ""
    usage = None
    finish_reason = None
    for result in invoke_result:
        text = result.delta.message.content
        full_text += text

        yield RunStreamChunkEvent(chunk_content=text, from_variable_selector=[self.node_id, "text"])

        if not model:
            model = result.model

        if not prompt_messages:
            prompt_messages = result.prompt_messages

        if not usage and result.delta.usage:
            usage = result.delta.usage

        if not finish_reason and result.delta.finish_reason:
            finish_reason = result.delta.finish_reason

    if not usage:
        usage = LLMUsage.empty_usage()

    yield ModelInvokeCompleted(text=full_text, usage=usage, finish_reason=finish_reason)
```
These methods involve generators and event handling that could potentially cause delays or issues leading to the node remaining in the "Running" state [1].
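Note that if the provider stream simply stops sending chunks, the `for result in invoke_result:` loop in `_handle_invoke_result` blocks indefinitely without raising, which matches the "no errors in the backend logs" symptom. Below is a minimal mitigation sketch, not Dify's actual code: `iter_with_timeout` is a hypothetical helper, and the timeout value is an assumption. It wraps the stream so that a stalled iterator raises instead of hanging forever:

```python
import queue
import threading
from collections.abc import Iterator
from typing import TypeVar

T = TypeVar("T")
_SENTINEL = object()

def iter_with_timeout(source: Iterator[T], timeout: float) -> Iterator[T]:
    """Yield items from `source`, raising TimeoutError if no item arrives
    within `timeout` seconds (e.g. a silently stalled LLM stream)."""
    q: queue.Queue = queue.Queue()

    def pump() -> None:
        try:
            for item in source:
                q.put(item)
        except Exception as exc:
            q.put(exc)  # surface provider errors to the consumer
        finally:
            q.put(_SENTINEL)  # always signal end-of-stream

    threading.Thread(target=pump, daemon=True).start()
    while True:
        try:
            item = q.get(timeout=timeout)
        except queue.Empty:
            raise TimeoutError(f"no chunk received for {timeout}s; treating stream as stalled")
        if item is _SENTINEL:
            return
        if isinstance(item, Exception):
            raise item
        yield item
```

With such a guard, the loop in `_handle_invoke_result` could iterate `iter_with_timeout(invoke_result, timeout=60.0)` instead of `invoke_result` directly, so a stalled stream would surface as a node error rather than an indefinite Running state.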
To continue talking to Dosu, mention @dosu.
The problem still exists. How should it be handled?