Self-hosted runner stuck on "Waiting for a runner to pick up this job..." in multi-step jobs
Describe the bug
Using a self-hosted runner. When a GitHub Action has multiple steps, the first runs successfully, but the subsequent step is then stuck in queued status with the log message "Waiting for a runner to pick up this job..."
I am able to get the stuck jobs to start either by cancelling and re-running the job via the GitHub UI or by restarting the GitHub Runner service on our EC2 instance. In both cases the job is picked up immediately and runs successfully.
To Reproduce
Steps to reproduce the behavior:
- Use self-hosted GitHub Runner
- Use a multi-step action
- Initiate the action
- Observe that the 2nd step is stuck in queued status.
Expected behavior
The runner should pick up the 2nd (and following) steps.
Runner Version and Platform
v2.321.0, Windows x64
We noticed that our machine auto-updated to this version on November 27, and our CI runs the following week started to have this problem.
OS of the machine running the runner?
Windows
What's not working?
"Waiting for a runner to pick up this job..." for up to hours.
Job Log Output
N/A. The runner never gets to job output; the job is stuck in the queue.
I am also affected by this. I've tried everything, including adding disableUpdate, to no avail.
Same here. Manually stopping and starting the service makes it move on to the next job. Obviously this is not ideal.
v2.321.0, Linux ARM64
Also affected by this issue on multiple runners running v2.321.0 on Debian 12 / amd64. We are able to work around this issue by rebooting the runners.
I tested downgrading to v2.320.0 and encountered the same issue.
Same. I've tried everything I can. Nothing changed.
Same here. Is there any workaround for this issue?
I've tried almost everything I could do... but it's not working out well.
I'm also facing the same issue; I have to stop and start the service, and only then does the next stage start.
Windows and Linux both have the same issue.
Same issue here, and my workaround is to rerun & immediately cancel some old action. This "revives" stuck jobs, but new jobs end up with the same problem again.
Same issue; only fixed by restarting the runner after each step :( systemctl restart actions.runner......
For those using the action machulav/ec2-github-runner, which does not use a systemd service: you need to kill the /actions-runner/run-helper.sh process and start it again from /actions-runner/bin with ./run-helper.sh run.
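A minimal sketch of both restart workarounds, assuming the default /actions-runner install path mentioned above; the systemd unit name (actions.runner.<scope>.<runner-name>.service) varies per runner, so replace the placeholder with whatever the list command shows for your machine:

```sh
# Systemd-managed runner: list the runner units, then restart yours.
sudo systemctl list-units 'actions.runner.*'
UNIT='actions.runner.<scope>.<runner-name>.service'   # replace with the unit listed above
sudo systemctl restart "$UNIT"

# machulav/ec2-github-runner (no systemd unit): kill the helper process
# and relaunch it from the runner's bin directory.
pkill -f run-helper.sh
cd /actions-runner/bin && nohup ./run-helper.sh run &
```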
Same issue here. Tried reinstalling and downgrading the runner without success.
Could you provide the run URL that got stuck waiting for a runner? That'll help us debug.
Unfortunately this is on a private repo; is there another way I can get you additional information? I could possibly inquire with my organization about giving you temporary read-only access.
A private repo run URL is fine too.
I'm having the same problem; this run got stuck for a whole day: https://github.com/Dasharo/meta-dts/actions/runs/12163220462. I had to restart the runner so the workflow would continue.
Weirdly, this one https://github.com/Dasharo/meta-dts/actions/runs/12180581391 got stuck waiting on "Run DTS tests", but cleanup started normally after the job failed.
We've been seeing the same thing here for 3 days now. First 2 jobs run, then waiting...
We are seeing this in multiple repositories, though some of them work fine. Nothing in our workflows has changed in 6 months.
Ubuntu 22.04 EC2 instances in AWS. Using machulav/ec2-github-runner@v2; also tried machulav/[email protected], but no change to the issue.
@lokesh755 same here for self-hosted private repos.
Edit: the run URL has been deleted from this post.
Thanks for reporting the issue. We've identified the root cause, which appears to be linked to a feature flag that was enabled two days ago. We've temporarily disabled the feature flag, which should resolve the issue. If you continue to experience similar problems, please let us know.
Perfect, thank you. It's working fine for now.
Has anyone tried it and got it working normally again? I had worked around this by changing multiple jobs into a single job in my workflow.
It's fixed in my workflows. No issues at all for me since @lokesh755's last message.
Can confirm! 👍🏿
I'm still facing this sometimes
Fixed for me as well
@SoftradixAD can you try stopping and starting the service again? That worked for me.
Okay let me try, @rohitkhatri sir
Thank you, it is working correctly