
[FIX] handle race conditions that could lead to job running twice

Open sbidoul opened this issue 2 months ago • 3 comments

Do not wait for locks, and do not start jobs that are not in the expected state.

In a nutshell, this PR replaces two SELECT FOR UPDATE queries with SELECT FOR UPDATE SKIP LOCKED. If the job to run is already locked or not in the expected state, there is no need to wait: it means another worker is already executing the job.
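To illustrate the idea, here is a minimal sketch of a non-blocking lock attempt. It is not the code from this PR: the `try_lock_job` helper, the `queue_job` table name, and the `uuid` and `state` columns are assumptions made for illustration, and `cursor` is assumed to be any DB-API cursor connected to PostgreSQL 9.5 or later (where SKIP LOCKED is available).

```python
# Illustrative only: table and column names are assumptions, not the PR's code.
LOCK_JOB_SQL = """
    SELECT uuid
      FROM queue_job
     WHERE uuid = %s
       AND state = %s
       FOR UPDATE SKIP LOCKED
"""


def try_lock_job(cursor, job_uuid, expected_state="enqueued"):
    """Try to lock the job row without waiting.

    Returns True if the row exists, is in the expected state, and was
    not already locked by another transaction. Returns False otherwise,
    in which case the caller should simply skip the job.
    """
    cursor.execute(LOCK_JOB_SQL, (job_uuid, expected_state))
    # With SKIP LOCKED, a row locked by another transaction is silently
    # omitted from the result instead of blocking this query.
    return cursor.fetchone() is not None
```

A worker would call `try_lock_job` inside the transaction that goes on to handle the job, and give up immediately when it returns False, since an empty result means either another worker holds the lock or the job has already left the expected state.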

Also, since there is a commit between the check that the job is in the enqueued state (followed by setting it to started) and the actual start of execution, there was a window in which two workers could start the same job in some rare situations. This PR should avoid this case.

Maybe fixes #858

sbidoul avatar Dec 11 '25 12:12 sbidoul

Hi @guewen, some modules you are maintaining are being modified, check this out!

OCA-git-bot avatar Dec 11 '25 12:12 OCA-git-bot

@thomaspaulb if you have a reproducer for #858, you may want to test this, possibly in combination with #853.

sbidoul avatar Dec 11 '25 13:12 sbidoul

BTW, for readability, I think the first part of _try_perform_job (up to job.lock()) should be extracted from that function. That would make the flow easier to understand.

sbidoul avatar Dec 11 '25 13:12 sbidoul