dify icon indicating copy to clipboard operation
dify copied to clipboard

`get_paginate_workflow_app_logs` experiencing slow query

Open horochx opened this issue 11 months ago • 1 comments

Self Checks

  • [x] I have searched for existing issues search for existing issues, including closed ones.
  • [x] I confirm that I am using English to submit this report (我已阅读并同意 Language Policy).
  • [x] [FOR CHINESE USERS] 请务必使用英文提交 Issue,否则会被关闭。谢谢!:)
  • [x] Please do not modify this template :) and fill in all the required fields.

1. Is this request related to a challenge you're experiencing? Tell me about your story.

Currently, the primary slow queries in our database are originating from the get_paginate_workflow_app_logs method of the WorkflowAppService. I've noticed that there might be room for optimization in the index of the WorkflowAppLog table.

SQL:

SELECT count(*) AS count_1 
FROM (SELECT workflow_app_logs.id AS id, workflow_app_logs.tenant_id AS tenant_id, workflow_app_logs.app_id AS app_id, workflow_app_logs.workflow_id AS workflow_id, workflow_app_logs.workflow_run_id AS workflow_run_id, workflow_app_logs.created_from AS created_from, workflow_app_logs.created_by_role AS created_by_role, workflow_app_logs.created_by AS created_by, workflow_app_logs.created_at AS created_at 
FROM workflow_app_logs JOIN workflow_runs ON workflow_runs.id = workflow_app_logs.workflow_run_id LEFT OUTER JOIN end_users ON workflow_runs.created_by = end_users.id AND workflow_runs.created_by_role = 'end_user'

2. Additional context or comments

No response

3. Can you help us with this feature?

  • [x] I am interested in contributing to this feature.

horochx avatar Mar 03 '25 07:03 horochx

Hey @horochx! 👋 I'm here to help you with bug fixes, answer your questions, and guide you as a potential contributor. Let me know how I can assist you while you wait for a human maintainer.

To address the slow queries in the get_paginate_workflow_app_logs method, you can optimize the indexing strategy on the WorkflowAppLog table. Here are some suggestions:

  1. Composite Index on tenant_id and app_id: This index already exists and helps with filtering by these columns [1].

  2. Index on workflow_run_id: Ensure there is an index on workflow_run_id to optimize the join with the WorkflowRun table [2].

  3. Index on created_at: Adding an index on created_at can improve the performance of ordering operations [2].

  4. Indexes on inputs and outputs: If these columns are frequently used in ilike conditions, consider creating indexes to speed up text searches [2].

  5. Index on created_by and created_by_role: Consider indexing these columns to optimize the join with the EndUser table [2].

Implementing these indexes should help reduce the time taken to filter, join, and order the data, thus improving the performance of the get_paginate_workflow_app_logs method.

To continue talking to Dosu, mention @dosu.


Help Dosu learn! Give it feedback: Great Response | Irrelevant Answer | Incorrect Sources | Too Verbose | Hallucination | Bug Report | Other

dosubot[bot] avatar Mar 03 '25 07:03 dosubot[bot]