action-build

Action never completes after successfully snapping

Open nicolasbock opened this issue 3 years ago • 13 comments

Hi,

After successfully completing and snapping, the action gets stuck (for lack of a better word) and eventually times out. It looks like the snapcraft process is waiting on something it is not seeing.

The latest build that exhibits this behavior can be found here:

https://github.com/nicolasbock/rabbitmq-server-snap/actions/runs/3177885875/jobs/5178819513#step:3:15178

Nick

nicolasbock, Oct 09 '22 00:10

Does it build fine without the action?

sergiusens, Oct 10 '22 18:10

Yes, locally I can build the snap in a VM or using LXD without such issues.

nicolasbock, Oct 10 '22 18:10

I'm having a similar problem (the logs stop in the middle of the build) at https://github.com/telegramdesktop/tdesktop since the GitHub runners updated from 20230206.1 to 20230217.1. The repo has an action that calls snapcraft manually; I tried switching to action-build, but sadly it didn't help.

ilya-fedin, Mar 04 '23 13:03

Hi @ilya-fedin, I took a quick look and this looks more like running out of memory or the build taking too long.

The original issue was because of docker being provisioned on 22.04 which caused snapcraft to stall and do nothing as it did not detect any network. This was fixed in the action with some iptables rules.

sergiusens, Mar 06 '23 22:03
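
For context on the iptables fix mentioned above: Docker is known to set the `FORWARD` chain policy to `DROP`, which can cut LXD containers off from the network. The thread does not show the exact rules the action applies, so the sketch below only assumes the simplest form of that workaround, resetting the policy to `ACCEPT`. The helper names `forward_accept_command` and `apply_forward_accept` are hypothetical and not part of action-build or snapcraft.

```python
import subprocess


def forward_accept_command(use_sudo=True):
    """Build the iptables command that resets the FORWARD chain policy to ACCEPT.

    Assumption: this is one common Docker/LXD workaround, not necessarily
    the exact rule set used by the action.
    """
    cmd = ["iptables", "-P", "FORWARD", "ACCEPT"]
    return ["sudo", *cmd] if use_sudo else cmd


def apply_forward_accept(dry_run=True):
    """Print the command; run it only when dry_run is False."""
    cmd = forward_accept_command()
    print(" ".join(cmd))
    if not dry_run:
        # Needs root privileges; raises CalledProcessError on failure.
        subprocess.run(cmd, check=True)
    return cmd
```

Calling `apply_forward_accept()` with the default `dry_run=True` only prints the command, which is a safe way to check what would run before touching the firewall on a real runner.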

@sergiusens if it runs out of memory, shouldn't the action stop? As for the build taking too long: it took around 3 hours with the 20230206.1 runner, and since 20230217.1 it times out after 6 hours. I don't really believe something could slow it down that much; it looks more like the connection to the LXD container hangs after some time.

ilya-fedin, Mar 06 '23 22:03
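
One way to tell a slow build from a hung one, rather than waiting for the six-hour job limit, is to wrap the build command in a watchdog that kills it when its output goes silent for too long. The following is a minimal Python sketch of that idea. The helper `run_with_stall_timeout` and its parameters are hypothetical, and nothing here is part of snapcraft or action-build.

```python
import queue
import subprocess
import sys
import threading


def run_with_stall_timeout(cmd, stall_seconds):
    """Run cmd, echoing its output; kill it if it prints nothing for stall_seconds.

    Returns (exit_code, stalled).
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    lines = queue.Queue()

    def pump():
        # Forward each output line to the queue; None marks end of stream.
        for line in proc.stdout:
            lines.put(line)
        lines.put(None)

    threading.Thread(target=pump, daemon=True).start()

    stalled = False
    while True:
        try:
            line = lines.get(timeout=stall_seconds)
        except queue.Empty:
            # No output for stall_seconds: treat the process as hung.
            stalled = True
            proc.kill()
            break
        if line is None:
            break
        sys.stdout.write(line)

    return proc.wait(), stalled
```

A wrapper like this could be called as `run_with_stall_timeout(["snapcraft"], 1800)` (the 30-minute threshold is an arbitrary assumption). A `stalled` result of `True` would point to a hang rather than a slow build.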

@ilya-fedin I found that explained on the runner-images repo https://github.com/actions/runner-images/issues/1918

There's a way to get a login shell into the runner to figure out what's happening. I don't have instructions handy, but I am certain @mr-cal does.

sergiusens, Mar 06 '23 23:03

I also have to walk back my comments on what this issue was about after reading the title and I cannot see the original linked log anymore :-(

sergiusens, Mar 07 '23 00:03

> The original issue was because of docker being provisioned on 22.04 which caused snapcraft to stall and do nothing as it did not detect any network. This was fixed in the action with some iptables rules.

Could you point me to where this is fixed, @sergiusens? My workflows are running on 20.04, not 22.04.

Since the old logs are gone, I restarted that exact workflow here:

https://github.com/nicolasbock/rabbitmq-server-snap/actions/runs/4348662023

nicolasbock, Mar 07 '23 00:03

> I found that explained on the runner-images repo actions/runner-images#1918

The description and comments there make me think that in that case the problem was the action not hitting its timeout, whereas in my case the action does seem to hit the timeout (because snapcraft hangs).

[image]

ilya-fedin, Mar 07 '23 00:03

> After successfully completing and snapping, the action gets stuck (for lack of a better word) and eventually times out. It looks like the snapcraft process is waiting on something it is not seeing.

@ilya-fedin no, the action times out in my case.

nicolasbock, Mar 07 '23 00:03

@nicolasbock I was talking about the linked issue in the quote.

ilya-fedin, Mar 07 '23 00:03

Ah, sorry, I misunderstood.

nicolasbock, Mar 07 '23 00:03

Downgrading the action to Ubuntu 20.04 helped in my case.

ilya-fedin, Apr 01 '23 13:04