Can't start previously stopped container after vm stop then vm start due to network issue
Describe the bug After running a container that publishes a port to the host, a vm stop/start leaves the container unable to start. Removing the vm and reinitializing it does work as intended.
Steps to reproduce
- Initialize the vm
- Create a container with its port published
- Start container
- Stop container
- Stop finch vm
- Start finch vm
- Start container - fails
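The steps above can be sketched as a small script. This is only a convenience wrapper: every finch command is copied from the logs in this report, while the function name `repro_port_publish` and its optional name argument are made up for illustration. It assumes the vm has already been initialized with `finch vm init`.

```shell
#!/usr/bin/env bash
# Sketch of the reproduction steps. The finch commands come from this report;
# the function name and its argument are illustrative, not part of Finch.

repro_port_publish() {
  local name="${1:-testcon}"

  # Create a container with a published port, start it, then stop it.
  finch container create -i -t -p 8080:80 --name "$name" nginx:latest || return 1
  finch container start "$name" || return 1
  finch container stop "$name" || return 1

  # Restart the vm without removing it.
  finch vm stop || return 1
  finch vm start || return 1

  # On an affected setup this second start is the step that fails.
  if finch container start "$name"; then
    echo "container started: not reproduced"
  else
    echo "container failed to start: reproduced"
    return 2
  fi
}

# Usage (uncomment to run against a real Finch installation):
# repro_port_publish testcon
```

The function returns 2 when the final start fails, so it can be reused as a quick regression check once a fix is in place.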
Expected behavior Restarting the vm should have no impact on whether the user is able to stop/start a container.
Screenshots or logs
❯ finch vm init
INFO[0000] Initializing and starting Finch virtual machine...
INFO[0068] Finch virtual machine started successfully
❯ finch container create -i -t -p 8080:80 --name testcon nginx:latest
docker.io/library/nginx:latest: resolved |++++++++++++++++++++++++++++++++++++++|
index-sha256:c54fb26749e49dc2df77c6155e8b5f0f78b781b7f0eadd96ecfabdcdfa5b1ec4: done |++++++++++++++++++++++++++++++++++++++|
manifest-sha256:694a280d88505d2662458145387c3b3639d3fdd85a4918edd08ede400b5f1c9a: done |++++++++++++++++++++++++++++++++++++++|
config-sha256:299014af8ec8c127a0813ec40de7fdc304123eeb0088bc75e26515d23366b398: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:5be7ead2e4fb379a009fc66b62f5f0b15e64e45fefb2ab077c6d7f8778d56206: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:f79f8cc5c20d534298dd6317333f38b7691da6d66e063ff10699727982c852be: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:00e99c191d16010bb5c7cbd7b0b6be859adede10210fda5de5b6cab34796a461: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:8147133c71f5a66ef72cfec3e0ba755a6b9d7d6beddb441d97adb118f4f5bac2: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:e452a617a230564bc4bb844656409202fb82890d43f28f4a9258a8629a54071f: done |++++++++++++++++++++++++++++++++++++++|
layer-sha256:7c86bc24f55af41f49826418e9bb437013298934a32e5df6afa58ba0acb7205d: done |++++++++++++++++++++++++++++++++++++++|
elapsed: 4.6 s total: 52.9 M (11.5 MiB/s)
a3f0a3dd83c2e527d18e159cf92600c9201a03f7758febd430f0d9972d55c00a
❯ finch container start testcon
testcon
❯ # checked browser, is successful
❯ finch container stop testcon
testcon
❯ finch vm stop
INFO[0000] Stopping existing Finch virtual machine...
INFO[0002] Finch virtual machine stopped successfully
❯ finch vm start
INFO[0000] Starting existing Finch virtual machine...
INFO[0022] Finch virtual machine started successfully
❯ finch container start testcon
FATA[0000] failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: time="2023-02-08T21:46:01Z" level=fatal msg="failed to call cni.Setup: plugin type=\"bridge\" failed (add): failed to allocate for range 0: 10.4.0.2 has been allocated to finch-a3f0a3dd83c2e527d18e159cf92600c9201a03f7758febd430f0d9972d55c00a, duplicate allocation is not allowed"
Failed to write to log, write /home/weikequ.linux/.local/share/nerdctl/1935db59/containers/finch/a3f0a3dd83c2e527d18e159cf92600c9201a03f7758febd430f0d9972d55c00a/oci-hook.createRuntime.log: file already closed: unknown
FATA[0000] exit status 1
❯ # Conversely, if we stop/remove/init the vm, it does work
❯ finch vm stop
INFO[0000] Stopping existing Finch virtual machine...
INFO[0002] Finch virtual machine stopped successfully
❯ finch vm remove
INFO[0000] Removing existing Finch virtual machine...
INFO[0000] Finch virtual machine removed successfully
❯ finch vm init
INFO[0000] Initializing and starting Finch virtual machine...
INFO[0068] Finch virtual machine started successfully
❯ finch container start testcon
testcon
❯ # checked browser, is successful
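The fatal error from the failed `finch container start` in the log above names both the conflicting IP and the holder of the existing reservation. A minimal sketch for pulling those two fields out of the message follows; `parse_duplicate_allocation` is a hypothetical helper written for this report, not something shipped with Finch or nerdctl, and it assumes the error text keeps the wording shown in the log.

```shell
#!/usr/bin/env bash
# Illustrative helper (not part of Finch or nerdctl): read an error message on
# stdin and print "<ip> <holder>" if it is a CNI duplicate-allocation failure.
parse_duplicate_allocation() {
  sed -n -E 's/.*failed to allocate for range [0-9]+: ([0-9.]+) has been allocated to ([^,]+), duplicate allocation is not allowed.*/\1 \2/p'
}

# Usage (against a real Finch installation):
# finch container start testcon 2>&1 | parse_duplicate_allocation
```

Applied to the error logged above, the holder is `finch-` followed by the same ID that `finch container create` printed for testcon, which suggests the container is colliding with a reservation left over from its own earlier run rather than with another container.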
Additional context
Issues that track this upstream in nerdctl:
- https://github.com/containerd/nerdctl/issues/665
- https://github.com/containerd/nerdctl/issues/1259
- https://github.com/containerd/nerdctl/issues/458
This can also be reproduced in @vsiravar's rootful mode finch.
This issue can also be reproduced by creating a volume. Steps to reproduce:
- finch run -d --name test-container alpine sleep infinity
- finch volume create test
- finch vm remove, then finch vm init
- finch start test-container
The issue goes away after starting the container more than two times.
shubhum@147ddaa42911 ~ % finch container create -i -t -p 8080:80 --name testcon nginx:latest
0e4ac22e31cf509df9e6ee431a51bbfc90ccd13c0b89581090a7282173f5c003
shubhum@147ddaa42911 ~ % finch container start testcon
testcon
shubhum@147ddaa42911 ~ % finch container stop testcon
testcon
shubhum@147ddaa42911 ~ % finch vm stop
INFO[0000] Stopping existing Finch virtual machine...
INFO[0005] Finch virtual machine stopped successfully
shubhum@147ddaa42911 ~ % finch vm start
INFO[0000] Starting existing Finch virtual machine...
INFO[0032] Finch virtual machine started successfully
shubhum@147ddaa42911 ~ % finch container start testcon
testcon
The issue has been fixed.