After workflow repaired task is executed two times

Open astelmashenko opened this issue 3 years ago • 3 comments

Describe the bug We noticed that a task is sometimes executed twice. After enabling debug logs, we found that after WorkflowRepairService re-queued the task, for some reason the task was executed two times:

INFO  2022-07-04T07:56:38,583 147034  com.netflix.conductor.core.reconciliation.WorkflowRepairService [sweeper-thread-1]  Task 425d9c94-dc30-441b-b21b-73ccc5118829 in workflow d6e20f06-c884-4c25-81a4-4a7c0eb3827e re-queued for repairs

DEBUG 2022-07-04T07:56:42,994 151445  com.netflix.conductor.contribs.tasks.http.HttpTask  [system-task-worker-1]  Response: 200, {bills={partyAUTHOR={biId=5200737, status=OPEN}, partyUNIVERSITY={biId=5200740, status=OPEN}}}, task:425d9c94-dc30-441b-b21b-73ccc5118829

DEBUG 2022-07-04T07:56:42,994 151445  com.netflix.conductor.contribs.tasks.http.HttpTask  [system-task-worker-0]  Response: 200, {bills={partyAUTHOR={biId=5200738, status=OPEN}, partyUNIVERSITY={biId=5200739, status=OPEN}}}, task:425d9c94-dc30-441b-b21b-73ccc5118829

What does WorkflowRepairService do, and do we need it at all? Why does this happen even though we have a lock service? Thanks.

Details
Conductor version: 3.7.2
Persistence implementation: Postgres
Queue implementation: Postgres
Lock: Redis

To Reproduce This happens from time to time; we have not found reliable steps to reproduce it.

Expected behavior HTTP task must be executed only once.

astelmashenko avatar Jul 04 '22 09:07 astelmashenko

I have observed the same issue. I tried adding redis-lock and disabling the repair service. It happens when a workflow stays in the scheduled state for too long because no workers are available.

sziraqui avatar Sep 07 '22 18:09 sziraqui

To reproduce, you can run a load test heavy enough that your instance becomes slow to pick up the workflows.
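
To illustrate the kind of race this suggests, here is a minimal toy model in Python. It is not Conductor code: `ToyQueue`, `ClaimRegistry`, `drain`, and `run` are hypothetical names, and the assumption that a re-queue adds a second entry for a task that is still waiting is mine, not something confirmed in this thread. It only shows that two deliveries of the same task id lead to two executions unless something claims the task first.

```python
import threading
from collections import deque


class ToyQueue:
    """Hypothetical in-memory queue that does NOT deduplicate task ids."""

    def __init__(self):
        self._items = deque()
        self._lock = threading.Lock()

    def push(self, task_id):
        with self._lock:
            self._items.append(task_id)

    def pop(self):
        with self._lock:
            return self._items.popleft() if self._items else None


class ClaimRegistry:
    """Hypothetical guard: only the first caller to claim a task id wins."""

    def __init__(self):
        self._claimed = set()
        self._lock = threading.Lock()

    def try_claim(self, task_id):
        with self._lock:
            if task_id in self._claimed:
                return False
            self._claimed.add(task_id)
            return True


def drain(queue, execute, claims=None):
    """Pop until the queue is empty; execute each delivery, optionally guarded."""
    while (task_id := queue.pop()) is not None:
        if claims is None or claims.try_claim(task_id):
            execute(task_id)


def run(guarded):
    """Simulate one scheduled task that is re-queued before any worker picks it up."""
    queue = ToyQueue()
    executions = []
    task_id = "task-1"   # illustrative id only
    queue.push(task_id)  # original scheduling
    queue.push(task_id)  # assumed re-queue while the task is still waiting
    claims = ClaimRegistry() if guarded else None
    workers = [
        threading.Thread(target=drain, args=(queue, executions.append, claims))
        for _ in range(2)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return executions
```

In this model `run(False)` executes the task once per delivery, while `run(True)` executes it once in total. Whether Conductor's Postgres queue or Redis lock behaves like either variant is exactly the open question in this issue.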

sziraqui avatar Sep 07 '22 18:09 sziraqui

Hello - are there any updates on this issue? We are running into this error with the system under load, using version 3.7.3.70.

benkiser avatar Mar 02 '23 13:03 benkiser