Can't start previously stopped container after vm stop then vm start due to network issue

Open weikequ opened this issue 2 years ago • 2 comments

Describe the bug After running a container that publishes a port to the host, stopping and then starting the vm leaves the container unable to start. Removing the vm and reinitializing it does work as intended.

Steps to reproduce

  • Initialize the vm
  • Create a container with its port published
  • Start container
  • Stop container
  • Stop finch vm
  • Start finch vm
  • Start container - fails

Expected behavior Restarting the vm should have no impact on whether the user is able to stop/start a container.

Screenshots or logs

❯ finch vm init
INFO[0000] Initializing and starting Finch virtual machine...
INFO[0068] Finch virtual machine started successfully
❯ finch container create -i -t -p 8080:80 --name testcon nginx:latest
docker.io/library/nginx:latest:                                                   resolved       |++++++++++++++++++++++++++++++++++++++|
index-sha256:c54fb26749e49dc2df77c6155e8b5f0f78b781b7f0eadd96ecfabdcdfa5b1ec4:    done           |++++++++++++++++++++++++++++++++++++++|
manifest-sha256:694a280d88505d2662458145387c3b3639d3fdd85a4918edd08ede400b5f1c9a: done           |++++++++++++++++++++++++++++++++++++++|
config-sha256:299014af8ec8c127a0813ec40de7fdc304123eeb0088bc75e26515d23366b398:   done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:5be7ead2e4fb379a009fc66b62f5f0b15e64e45fefb2ab077c6d7f8778d56206:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:f79f8cc5c20d534298dd6317333f38b7691da6d66e063ff10699727982c852be:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:00e99c191d16010bb5c7cbd7b0b6be859adede10210fda5de5b6cab34796a461:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:8147133c71f5a66ef72cfec3e0ba755a6b9d7d6beddb441d97adb118f4f5bac2:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:e452a617a230564bc4bb844656409202fb82890d43f28f4a9258a8629a54071f:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:7c86bc24f55af41f49826418e9bb437013298934a32e5df6afa58ba0acb7205d:    done           |++++++++++++++++++++++++++++++++++++++|
elapsed: 4.6 s                                                                    total:  52.9 M (11.5 MiB/s)
a3f0a3dd83c2e527d18e159cf92600c9201a03f7758febd430f0d9972d55c00a
❯ finch container start testcon
testcon
❯ # checked browser, is successful
❯ finch container stop testcon
testcon
❯ finch vm stop
INFO[0000] Stopping existing Finch virtual machine...
INFO[0002] Finch virtual machine stopped successfully
❯ finch vm start
INFO[0000] Starting existing Finch virtual machine...
INFO[0022] Finch virtual machine started successfully
❯ finch container start testcon
FATA[0000] failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: time="2023-02-08T21:46:01Z" level=fatal msg="failed to call cni.Setup: plugin type=\"bridge\" failed (add): failed to allocate for range 0: 10.4.0.2 has been allocated to finch-a3f0a3dd83c2e527d18e159cf92600c9201a03f7758febd430f0d9972d55c00a, duplicate allocation is not allowed"
Failed to write to log, write /home/weikequ.linux/.local/share/nerdctl/1935db59/containers/finch/a3f0a3dd83c2e527d18e159cf92600c9201a03f7758febd430f0d9972d55c00a/oci-hook.createRuntime.log: file already closed: unknown
FATA[0000] exit status 1

❯ # Conversely, if we stop/remove/init the vm, it does work

❯ finch vm stop
INFO[0000] Stopping existing Finch virtual machine...
INFO[0002] Finch virtual machine stopped successfully
❯ finch vm remove
INFO[0000] Removing existing Finch virtual machine...
INFO[0000] Finch virtual machine removed successfully
❯ finch vm init
INFO[0000] Initializing and starting Finch virtual machine...
INFO[0068] Finch virtual machine started successfully
❯ finch container start testcon
testcon
❯ # checked browser, is successful
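
The failing `finch container start` above names both the address that is already reserved (10.4.0.2) and the ID holding the reservation, which is the same ID that `finch container create` printed, prefixed with `finch-`. In other words, the address still appears to be reserved for the very container that is being restarted. A minimal sketch for pulling those two fields out of such an error follows. The helper name `parse_duplicate_allocation` is hypothetical, and the pattern is copied from the wording in this log, so it may not match other CNI plugin versions.

```shell
#!/bin/sh
# Hypothetical helper: read log text on stdin and print "<ip> <owner-id>"
# for every duplicate-allocation error found.
# Assumption: the error wording matches the message shown in this issue.
parse_duplicate_allocation() {
    sed -n 's/.*failed to allocate for range [0-9]*: \([0-9.]*\) has been allocated to \([^,]*\), duplicate allocation is not allowed.*/\1 \2/p'
}
```

It can be fed the output of the failing command, for example `finch container start testcon 2>&1 | parse_duplicate_allocation`. It prints nothing when no matching error is present.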

Additional context

Issues that track this upstream in nerdctl:

  • https://github.com/containerd/nerdctl/issues/665
  • https://github.com/containerd/nerdctl/issues/1259
  • https://github.com/containerd/nerdctl/issues/458

weikequ avatar Feb 14 '23 00:02 weikequ

Can also be reproduced in @vsiravar's rootful mode finch.

weikequ avatar Feb 23 '23 17:02 weikequ

This issue can also be reproduced by creating a volume. Steps to reproduce:

  1. finch run -d --name test-container alpine sleep infinity
  2. finch volume create test
  3. finch vm remove, then finch vm init
  4. finch start test-container

The issue goes away after starting the container more than twice.

azhouwd avatar Mar 07 '23 00:03 azhouwd

shubhum@147ddaa42911 ~ % finch container create -i -t -p 8080:80 --name testcon nginx:latest
0e4ac22e31cf509df9e6ee431a51bbfc90ccd13c0b89581090a7282173f5c003
shubhum@147ddaa42911 ~ % finch container start testcon
testcon
shubhum@147ddaa42911 ~ % finch container stop testcon
testcon
shubhum@147ddaa42911 ~ % finch vm stop
INFO[0000] Stopping existing Finch virtual machine...
INFO[0005] Finch virtual machine stopped successfully
shubhum@147ddaa42911 ~ % finch vm start
INFO[0000] Starting existing Finch virtual machine...
INFO[0032] Finch virtual machine started successfully
shubhum@147ddaa42911 ~ % finch container start testcon
testcon

The issue has been fixed.

Shubhranshu153 avatar Aug 20 '24 00:08 Shubhranshu153