
Retrying failed/interrupted commit gives error

Open 2000yeshu opened this issue 3 years ago • 5 comments

Description

When we commit a container and the commit is interrupted, retrying the commit fails with the error: failed to export layer: snapshot "sha256:5ca359d74c4d65ad7dc2fc1013b3cccfa921f9000395ac846fd06a37c9a1a67e-parent-view": already exists. I think this happens because the bolt KVs left behind by the interrupted commit are never garbage collected, and the error is thrown here in containerd's bolt module.

Steps to reproduce the issue

  1. ctr container create docker.io/library/ubuntu:20.04 my-ubuntu
  2. sudo ctr task start my-ubuntu
  3. sudo nerdctl container exec -it my-ubuntu bash
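  4. fallocate -l 50000000K test.txt (inside the container, so that the commit takes long enough to interrupt)
  5. sudo nerdctl commit my-ubuntu my-ubuntu-commited
  6. Interrupt the commit with SIGINT (Ctrl+C)
  7. sudo nerdctl commit my-ubuntu my-ubuntu-commited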
yakul@yeshu:~$ sudo nerdctl commit my-ubuntu my-ubuntu-commited
WARN[0000] Image lacks label "nerdctl/platform", assuming the platform to be "linux/amd64" 
^C
yakul@yeshu:~$ sudo nerdctl commit my-ubuntu my-ubuntu-commited
WARN[0000] Image lacks label "nerdctl/platform", assuming the platform to be "linux/amd64" 
FATA[0000] failed to export layer: snapshot "sha256:cdca8156a203b9719f985c3114336529115cdc392f89d45cfcd37c968ddd3645-parent-view": already exists

Describe the results you received and expected

Received: The container is stuck in the PAUSED state and cannot be committed on the second attempt.

Expected: Either the container is committed successfully on the second attempt or, if not, it falls back to the RUNNING state.

What version of nerdctl are you using?

Client: Version: v0.20.0 OS/Arch: linux/amd64 Git commit: e77e05b5fd252274e3727e0439e9a2d45622ccb9

Server: containerd: Version: 1.6.6 GitCommit: 10c12954828e7c7c9b6e0ea9b0c02b01407d3ae1

What version of ctr are you using?

Client: Version: 1.6.6 Revision: 10c12954828e7c7c9b6e0ea9b0c02b01407d3ae1 Go version: go1.17.11

Are you using a variant of nerdctl? (e.g., Rancher Desktop)

No response

Host information

No response

2000yeshu avatar Oct 09 '22 17:10 2000yeshu

I cannot reproduce the bug. Would you mind giving us more specific reproduction steps?

Zheaoli avatar Oct 10 '22 10:10 Zheaoli

I have updated the issue description with steps I followed to reproduce the bug.

2000yeshu avatar Oct 11 '22 11:10 2000yeshu

I feel the commit context should have signal handlers that delete any garbage keys created in bolt when a commit is interrupted or fails before completion.
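
To make the idea concrete, here is a minimal sketch in Go. It is not containerd's or nerdctl's actual code: snapshotStore, commit, and the key "example-parent-view" are hypothetical stand-ins, and it assumes the cleanup call honours context cancellation, which would be one plausible way for the temporary key to survive a Ctrl+C. The sketch ties the commit to SIGINT/SIGTERM and removes the temporary key with a fresh context, so a retry does not hit "already exists".

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os"
	"os/signal"
	"sync"
	"syscall"
	"time"
)

// snapshotStore is a hypothetical in-memory stand-in for a snapshot metadata
// store. It only mimics the "already exists" behaviour seen in the issue.
type snapshotStore struct {
	mu   sync.Mutex
	keys map[string]bool
}

var errAlreadyExists = errors.New("already exists")

func newSnapshotStore() *snapshotStore { return &snapshotStore{keys: map[string]bool{}} }

func (s *snapshotStore) create(key string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.keys[key] {
		return fmt.Errorf("snapshot %q: %w", key, errAlreadyExists)
	}
	s.keys[key] = true
	return nil
}

// remove refuses to run on a cancelled context (an assumption of this sketch).
func (s *snapshotStore) remove(ctx context.Context, key string) error {
	if err := ctx.Err(); err != nil {
		return err
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.keys, key)
	return nil
}

func (s *snapshotStore) has(key string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.keys[key]
}

// commit creates a temporary key, runs export, and always removes the key.
func commit(ctx context.Context, s *snapshotStore, key string, export func(context.Context) error) (err error) {
	if createErr := s.create(key); createErr != nil {
		return fmt.Errorf("failed to export layer: %w", createErr)
	}
	defer func() {
		// ctx may already be cancelled by the signal, so cleanup gets its
		// own short-lived context instead of reusing ctx.
		cleanupCtx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
		defer cancel()
		if rmErr := s.remove(cleanupCtx, key); rmErr != nil && err == nil {
			err = rmErr
		}
	}()
	return export(ctx)
}

// simulateInterruptedThenRetry interrupts a first commit, retries it, and
// reports whether the temporary key leaked and what the retry returned.
func simulateInterruptedThenRetry() (leaked bool, retryErr error) {
	store := newSnapshotStore()
	const key = "example-parent-view" // hypothetical key

	ctx, cancel := context.WithCancel(context.Background())
	cancel() // stands in for Ctrl+C arriving during the export
	_ = commit(ctx, store, key, func(ctx context.Context) error {
		<-ctx.Done()
		return ctx.Err()
	})

	retryErr = commit(context.Background(), store, key, func(context.Context) error { return nil })
	return store.has(key), retryErr
}

func main() {
	// In a real CLI the commit context would be cancelled by SIGINT/SIGTERM.
	ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
	defer stop()

	store := newSnapshotStore()
	err := commit(ctx, store, "example-parent-view", func(ctx context.Context) error {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(100 * time.Millisecond): // pretend export work
			return nil
		}
	})
	fmt.Println("commit error:", err, "leftover key:", store.has("example-parent-view"))
}
```

If the deferred cleanup reused ctx instead of a fresh context, remove would fail after the interrupt and the key would stay behind, which matches the symptom described above.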

2000yeshu avatar Oct 11 '22 11:10 2000yeshu

I opened a PR in containerd related to this.

2000yeshu avatar Oct 28 '22 08:10 2000yeshu