Checkpoint / Restore Functionality with CRIU

Open christosnc opened this issue 4 years ago • 6 comments

Hello,

I would like to say this is a great project. I managed to install macOS on Ubuntu 20.04. As others pointed out, booting is very slow (especially for CI / CD) and I would like to have a suspend functionality, in order to start right into the booted system.

I found Docker's experimental checkpoint feature. To set it up, do the following:

sudo systemctl stop docker
sudo -i
sudo echo '{"experimental": true}' >> /etc/docker/daemon.json
sudo add-apt-repository ppa:criu/ppa
sudo apt-get update
sudo apt-get -y install criu
sudo service docker restart
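
One caveat with the `echo ... >>` line above: if /etc/docker/daemon.json already exists, appending a second JSON object leaves the file invalid, and the daemon may then refuse to start. Below is a minimal sketch of merging the key instead. It assumes `jq` is installed for the case where the file already has content, and the function name `enable_experimental` is made up for illustration.

```shell
# Hypothetical helper: set "experimental": true in a Docker daemon.json
# without discarding keys that are already present.
# Assumption: jq is available when the file already has content.
enable_experimental() {
    file=$1
    if [ -s "$file" ]; then
        # Merge the key into the existing JSON object via a temp file.
        tmp=$(mktemp) || return 1
        if jq '. + {experimental: true}' "$file" > "$tmp"; then
            # cat > keeps the original file's owner and permissions.
            cat "$tmp" > "$file"
            rm -f "$tmp"
        else
            rm -f "$tmp"
            return 1
        fi
    else
        # No file yet (or an empty one): write a fresh object.
        printf '{"experimental": true}\n' > "$file"
    fi
}

# Usage (as root):
#   enable_experimental /etc/docker/daemon.json
```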

Then I should be able to boot into macOS, and create a checkpoint with:

docker checkpoint create --leave-running=true <container-id> checkpoint-1
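
For reference, the full cycle this is aiming for would look roughly like the sketch below: create a checkpoint, then later start the stopped container from it with `docker start --checkpoint`. The wrapper functions and the `DOCKER` override variable are assumptions of this sketch (the override only exists to allow a dry run with `DOCKER=echo`); it does not make the checkpoint itself succeed.

```shell
# DOCKER can be overridden (e.g. DOCKER=echo) for a dry run. This variable
# is an assumption of this sketch, not something Docker itself reads.
DOCKER=${DOCKER:-docker}

# Create a checkpoint while leaving the container running.
#   $1 = container ID or name, $2 = checkpoint name
checkpoint_create() {
    "$DOCKER" checkpoint create --leave-running=true "$1" "$2"
}

# Start a stopped container from a previously created checkpoint.
#   $1 = container ID or name, $2 = checkpoint name
checkpoint_restore() {
    "$DOCKER" start --checkpoint "$2" "$1"
}

# Usage:
#   checkpoint_create  <container-id> checkpoint-1
#   docker stop <container-id>
#   checkpoint_restore <container-id> checkpoint-1
```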

(I tried starting the container with and without -ai)

But the checkpoint command always fails with:

Error response from daemon: Cannot checkpoint container: runc did not terminate successfully: criu failed: type NOTIFY errno 0

Am I missing something, or is it impossible to create checkpoints for this project? (And is there any other solution that I am missing? I couldn't find anything else that suits my needs.)

Thanks!

christosnc avatar Jun 11 '21 10:06 christosnc

PS:

I also tried with and without --leave-running=true, with and without sudo, and with the short and full container ID.

Also sudo criu check --all returns "Looks good."

Looking at the CRIU logs, the following errors are at the bottom:

(00.564986) Error (criu/proc_parse.c:453): Unknown shit 600 (anon_inode:kvm-vcpu:3)
(00.565000) Error (criu/proc_parse.c:661): Can't open 2730's mapfile link 7f5567a43000: No such device or address
(00.565009) Error (criu/cr-dump.c:1250): Collect mappings (pid: 2730) failed with -1
(00.565137) Unlock network
(00.565141) Running network-unlock scripts
(00.565144)     RPC
(00.569669) Unfreezing tasks into 1
(00.569708)     Unseizing 2568 into 1
(00.569722)     Unseizing 2729 into 1
(00.569734)     Unseizing 2738 into 1
(00.569753)     Unseizing 2730 into 1
(00.569971) Error (criu/cr-dump.c:1768): Dumping FAILED.
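
To pull just the relevant lines out of a long CRIU log, something like the following sketch may help. The function names are made up for illustration, and the log path is passed in as an argument because its location depends on the setup.

```shell
# Print only the "Error (...)" lines of a CRIU log, with timestamps removed.
#   $1 = path to the CRIU log file
criu_errors() {
    grep -E '^\([0-9.]+\) +Error \(' "$1" | sed -E 's/^\([0-9.]+\) +//'
}

# Print the distinct anon_inode objects mentioned anywhere in the log,
# e.g. "anon_inode:kvm-vcpu:3".
#   $1 = path to the CRIU log file
unknown_anon_inodes() {
    grep -oE 'anon_inode:[^)]+' "$1" | sort -u
}

# Usage:
#   criu_errors /path/to/dump.log
#   unknown_anon_inodes /path/to/dump.log
```

On the log above, the second function would report anon_inode:kvm-vcpu:3, which points at the KVM vCPU mapping of the QEMU process as the object CRIU does not know how to dump.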

christosnc avatar Jun 11 '21 10:06 christosnc

Update

I also tried creating a snapshot directly in qemu with the savevm command, but this also fails with:

Error: State blocked by non-migratable CPU device (invtsc flag)
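
The message names the invtsc CPU flag as the thing blocking the snapshot, so one thing to try is launching QEMU without it. Below is a minimal sketch that strips the flag from a `-cpu` argument string. It assumes the launch script passes the flag as `+invtsc` or `invtsc=on` in a comma-separated `-cpu` value; the function name and the example CPU string are illustrative only. Whether the macOS guest still boots and keeps time correctly without the flag has not been verified here.

```shell
# Hypothetical helper: remove the invtsc flag from a QEMU -cpu value.
#   $1 = comma-separated -cpu string
strip_invtsc() {
    printf '%s\n' "$1" | sed -E 's/,(\+invtsc|invtsc=on)//g'
}

# Usage (the CPU string is an illustrative example):
#   strip_invtsc 'Penryn,vendor=GenuineIntel,+invtsc,vmware-cpuid-freq=on'
```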

christosnc avatar Jun 12 '21 23:06 christosnc

Hey, this is very interesting, and I would also like to save the state of the machine. I'll take a look at this during the week.

sickcodes avatar Jun 16 '21 10:06 sickcodes

That's awesome! I hope we can get something working.

christosnc avatar Jun 19 '21 19:06 christosnc

I also get a similar error when dumping processes that use GPU acceleration. Here are the logs:

(00.019628) Error (criu/proc_parse.c:467): Unknown shit 600 (anon_inode:i915.gem)
(00.019645) Error (criu/proc_parse.c:694): Can't open 74009's mapfile link 7f2e13200000: No such device or address
(00.019655) Error (criu/cr-dump.c:1558): Collect mappings (pid: 74009) failed with -1
(00.019768) net: Unlock network
(00.019776) Unfreezing tasks into 1
(00.019779) Unseizing 74009 into 1
(00.020261) Error (criu/cr-dump.c:2093): Dumping FAILED.

Edit: I think it's because of the GPU, since the log says "anon_inode:i915.gem" and i915 is a GPU driver.
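
Both failures in this thread involve a memory mapping backed by an anon_inode object (kvm-vcpu earlier, i915.gem here). A quick way to see whether a process has such mappings before attempting a dump is to look at its maps file. The sketch below assumes the standard /proc/<pid>/maps layout, where the sixth field is the mapping's name. The function name is made up, and an empty result does not guarantee that a dump will succeed.

```shell
# List the distinct anon_inode-backed mappings in a maps file.
#   $1 = path to a maps file, normally /proc/<pid>/maps
anon_inode_mappings() {
    awk '$6 ~ /^anon_inode:/ { print $6 }' "$1" | sort -u
}

# Usage:
#   anon_inode_mappings /proc/74009/maps
```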

KernelDash avatar Jan 11 '24 17:01 KernelDash