multipass icon indicating copy to clipboard operation
multipass copied to clipboard

Cant create/start vms after upgrade to 1.12.0 on mac

Open upngo opened this issue 2 years ago • 8 comments

After upgrading to multipass 1.12.0 I could no longer start or create vms. It would timeout and the last thing in the logs was Waiting for SSH to be up After reading about all the firewall issues I tried adding everything in multipass/bin to the firewall 'accept incoming' list, killing ssh sudo launchctl stop com.openssh.sshd, reinstalling and restarting. After all the above I still got timeout however the logs progressed a little further, Reporting

[2023-07-03T08:24:10.952] [debug] [new-sunbird] QMP: {"timestamp": {"seconds": 1688329450, "microseconds": 952046}, "event": "NIC_RX_FILTER_CHANGED", "data": {"path": "/machine/unattached/device[6]/virtio-backend"}}

and then hanging. I'm on a managed mac so i can't disable the firewall completely.

I uninstalled, reverted back to 1.11.1 and restarted computer and things are working.

To Reproduce

Install multipass 1.12.0 on arm mac and multipass launch

Expected behavior

Successful VM creation.

Logs

user@computer ~ % multipass launch -vvvv
[2023-07-03T08:23:53.863] [trace] [url downloader] Found https://codeload.github.com/canonical/multipass-blueprints/zip/refs/heads/main in cache: true
[2023-07-03T08:23:53.865] [debug] [blueprint provider] Loading "anbox-cloud-appliance" v1
[2023-07-03T08:23:53.865] [debug] [blueprint provider] Loading "charm-dev" v1
[2023-07-03T08:23:53.866] [debug] [blueprint provider] Loading "docker" v1
[2023-07-03T08:23:53.866] [debug] [blueprint provider] Loading "jellyfin" v1
[2023-07-03T08:23:53.867] [debug] [blueprint provider] Loading "minikube" v1
[2023-07-03T08:23:53.867] [debug] [blueprint provider] Loading "ros-noetic" v1
[2023-07-03T08:23:53.868] [debug] [blueprint provider] Loading "ros2-humble" v1
[2023-07-03T08:23:55.136] [trace] [url downloader] Found https://cloud-images.ubuntu.com/releases/streams/v1/index.json in cache: false
[2023-07-03T08:23:55.175] [trace] [url downloader] Found https://cloud-images.ubuntu.com/releases/streams/v1/com.ubuntu.cloud:released:download.json in cache: true
[2023-07-03T08:23:56.469] [trace] [url downloader] Found https://cloud-images.ubuntu.com/buildd/daily/streams/v1/index.json in cache: false
[2023-07-03T08:23:58.165] [trace] [url downloader] Found https://cloud-images.ubuntu.com/buildd/daily/streams/v1/com.ubuntu.cloud:daily:download.json in cache: false
[2023-07-03T08:23:58.177] [trace] [url downloader] Found https://cdimage.ubuntu.com/ubuntu-core/appliances/streams/v1/index.json in cache: true
[2023-07-03T08:23:58.178] [trace] [url downloader] Found https://cdimage.ubuntu.com/ubuntu-core/appliances/streams/v1/com.ubuntu.appliances:released:download.json in cache: true
[2023-07-03T08:23:58.178] [info] [VMImageHost] Did not find any supported products in "appliance"
[2023-07-03T08:23:58.185] [debug] [qemu-system-aarch64] [3724] started: qemu-system-aarch64 --version
[2023-07-03T08:23:58.212] [debug] [qemu-img] [3725] started: qemu-img info /var/root/Library/Caches/multipassd/qemu/vault/images/jammy-20230616/ubuntu-22.04-server-cloudimg-arm64.img
[2023-07-03T08:23:58.229] [debug] [qemu-img] [3726] started: qemu-img resize /var/root/Library/Application Support/multipassd/qemu/vault/instances/new-sunbird/ubuntu-22.04-server-cloudimg-arm64.img 5368709120
[2023-07-03T08:23:58.240] [debug] [qemu-img] [3727] started: qemu-img snapshot -l /var/root/Library/Application Support/multipassd/qemu/vault/instances/new-sunbird/ubuntu-22.04-server-cloudimg-arm64.img
[2023-07-03T08:23:58.246] [debug] [new-sunbird] process working dir ''
[2023-07-03T08:23:58.246] [info] [new-sunbird] process program 'qemu-system-aarch64'
[2023-07-03T08:23:58.246] [info] [new-sunbird] process arguments '-machine, virt,gic-version=3, -accel, hvf, -drive, file=/Library/Application Support/com.canonical.multipass/bin/../Resources/qemu/edk2-aarch64-code.fd,if=pflash,format=raw,readonly=on, -cpu, host, -nic, vmnet-shared,model=virtio-net-pci,mac=52:54:00:91:69:c5, -device, virtio-scsi-pci,id=scsi0, -drive, file=/var/root/Library/Application Support/multipassd/qemu/vault/instances/new-sunbird/ubuntu-22.04-server-cloudimg-arm64.img,if=none,format=qcow2,discard=unmap,id=hda, -device, scsi-hd,drive=hda,bus=scsi0.0, -smp, 1, -m, 1024M, -qmp, stdio, -chardev, null,id=char0, -serial, chardev:char0, -nographic, -cdrom, /var/root/Library/Application Support/multipassd/qemu/vault/instances/new-sunbird/cloud-init-config.iso'
[2023-07-03T08:23:58.250] [debug] [qemu-system-aarch64] [3728] started: qemu-system-aarch64 -machine virt,gic-version=3 -nographic -dump-vmstate /private/var/folders/zz/zyxvpxvq6csfxvn_n0000000000000/T/multipassd.dKtMlA
[2023-07-03T08:23:58.282] [info] [new-sunbird] process state changed to Starting
[2023-07-03T08:23:58.284] [info] [new-sunbird] process state changed to Running
[2023-07-03T08:23:58.284] [debug] [qemu-system-aarch64] [3729] started: qemu-system-aarch64 -machine virt,gic-version=3 -accel hvf -drive file=/Library/Application Support/com.canonical.multipass/bin/../Resources/qemu/edk2-aarch64-code.fd,if=pflash,format=raw,readonly=on -cpu host -nic vmnet-shared,model=virtio-net-pci,mac=52:54:00:91:69:c5 -device virtio-scsi-pci,id=scsi0 -drive file=/var/root/Library/Application Support/multipassd/qemu/vault/instances/new-sunbird/ubuntu-22.04-server-cloudimg-arm64.img,if=none,format=qcow2,discard=unmap,id=hda -device scsi-hd,drive=hda,bus=scsi0.0 -smp 1 -m 1024M -qmp stdio -chardev null,id=char0 -serial chardev:char0 -nographic -cdrom /var/root/Library/Application Support/multipassd/qemu/vault/instances/new-sunbird/cloud-init-config.iso
[2023-07-03T08:23:58.284] [info] [new-sunbird] process started
launch failed: The following errors occurred:
new-sunbird: timed out waiting for response
user@computer ~ % tail /Library/logs/Multipass/multipassd.log 
[2023-07-03T08:23:58.284] [info] [new-sunbird] process state changed to Running
[2023-07-03T08:23:58.284] [debug] [qemu-system-aarch64] [3729] started: qemu-system-aarch64 -machine virt,gic-version=3 -accel hvf -drive file=/Library/Application Support/com.canonical.multipass/bin/../Resources/qemu/edk2-aarch64-code.fd,if=pflash,format=raw,readonly=on -cpu host -nic vmnet-shared,model=virtio-net-pci,mac=52:54:00:91:69:c5 -device virtio-scsi-pci,id=scsi0 -drive file=/var/root/Library/Application Support/multipassd/qemu/vault/instances/new-sunbird/ubuntu-22.04-server-cloudimg-arm64.img,if=none,format=qcow2,discard=unmap,id=hda -device scsi-hd,drive=hda,bus=scsi0.0 -smp 1 -m 1024M -qmp stdio -chardev null,id=char0 -serial chardev:char0 -nographic -cdrom /var/root/Library/Application Support/multipassd/qemu/vault/instances/new-sunbird/cloud-init-config.iso
[2023-07-03T08:23:58.284] [info] [new-sunbird] process started
[2023-07-03T08:23:58.284] [debug] [new-sunbird] Waiting for SSH to be up
[2023-07-03T08:23:58.310] [debug] [new-sunbird] QMP: {"QMP": {"version": {"qemu": {"micro": 0, "minor": 0, "major": 8}, "package": ""}, "capabilities": ["oob"]}}

[2023-07-03T08:23:58.328] [debug] [new-sunbird] QMP: {"return": {}}

[2023-07-03T08:24:10.952] [debug] [new-sunbird] QMP: {"timestamp": {"seconds": 1688329450, "microseconds": 952046}, "event": "NIC_RX_FILTER_CHANGED", "data": {"path": "/machine/unattached/device[6]/virtio-backend"}}

Additional info

  • Machine: M2 Macbook
  • OS: macos 13.4
  • Multipass 1.12.0
  • QEMU driver

Additional context

I've run through this process twice now and get the same result. The restart after downgrade is required otherwise it gets the same error. Unloading and reloading the service has no impact.

upngo avatar Jul 03 '23 00:07 upngo

Hi @upngo!

This looks like a firewall issue for which there isn't much we can do to help. But, one thing you can try is to use this package from a PR that will be in the next release. The PR includes the following changes:

Two main changes to improve the step of waiting for SSH, when starting/launching instances:

  1. Increase the time allotted for each iteration of the loop to wait for SSH, to 1 second. Log each such iteration of the loop.
  2. Retry only on specific exceptions, logging them (with trace verbosity, since this happens every second), and letting the rest through, in the hope that this may shed some light into cases where instances end up in unknown state.

so try launching an instance again with trace level verbosity and if the issue persists send us the logs.

sharder996 avatar Jul 03 '23 18:07 sharder996

Upgraded to 1.12 on macos 13.4.1 (22F82). Same issue.

UI hangs after 30s

image

mcybz avatar Jul 10 '23 16:07 mcybz

Hi @mcybz, I think this issue is a little different (timeouts when starting VMs). Yours sounds more like #3142, which causes the Multipass GUI to appear unresponsive because it cannot connect to the daemon.

ricab avatar Jul 10 '23 18:07 ricab

Hi @upngo!

This looks like a firewall issue for which there isn't much we can do to help. But, one thing you can try is to use this package from a PR that will be in the next release. The PR includes the following changes:

Two main changes to improve the step of waiting for SSH, when starting/launching instances:

  1. Increase the time allotted for each iteration of the loop to wait for SSH, to 1 second. Log each such iteration of the loop.
  2. Retry only on specific exceptions, logging them (with trace verbosity, since this happens every second), and letting the rest through, in the hope that this may shed some light into cases where instances end up in unknown state.

so try launching an instance again with trace level verbosity and if the issue persists send us the logs.

Thanks! Using the build from that PR works successfully for me and can run VMs :) so presumably 1.13 official release will too when that comes. Thanks

upngo avatar Jul 12 '23 00:07 upngo

Glad it is working for you now @upngo! And yes, those changes will be in our next release indeed.

If you get into the same situation again, feel free to reopen this issue.

ricab avatar Jul 12 '23 10:07 ricab

@ricab Unfortunately after a macos security update to 13.5.1 the same problem has resurfaced :( I also note that the build from that release is now "Access Denied" though i still have a local copy of it. Have there been any more recent builds that I could try?

Will probably have to try downgrade macos which is always a pain....

[2023-09-04T16:13:39.382] [debug] [victorious-sawfish] Waiting for SSH to be up
[2023-09-04T16:13:39.930] [debug] [victorious-sawfish] QMP: {"QMP": {"version": {"qemu": {"micro": 0, "minor": 0, "major": 8}, "package": ""}, "capabilities": ["oob"]}}

[2023-09-04T16:13:40.028] [debug] [victorious-sawfish] QMP: {"return": {}}

[2023-09-04T16:13:52.923] [debug] [victorious-sawfish] QMP: {"timestamp": {"seconds": 1693800832, "microseconds": 923051}, "event": "NIC_RX_FILTER_CHANGED", "data": {"path": "/machine/unattached/device[6]/virtio-backend"}}

upngo avatar Sep 04 '23 04:09 upngo

Hey @upngo, bummer :confused: People reported similar problems in Sonoma beta builds, but it would be strange for the security update to be causing the issue.

This package has the same improvements that were mentioned earlier. Does the situation improve with it? Would you be able to post the output of launch -vvvv with it?

ricab avatar Sep 04 '23 11:09 ricab

The link returns 404 now. I'm back to having this issue :( On a positive note when I upgraded to Sonoma I didn't have any issues. Every few weeks though my setups get upset and I have to blow everything away. Always the same error, even with firewall off.

023-10-16T14:44:23.825] [debug] [promoting-bedbug] Waiting for SSH to be up
[2023-10-16T14:44:23.854] [debug] [promoting-bedbug] QMP: {"QMP": {"version": {"qemu": {"micro": 0, "minor": 0, "major": 8}, "package": ""}, "capabilities": ["oob"]}}

[2023-10-16T14:44:23.870] [debug] [promoting-bedbug] QMP: {"return": {}}

[2023-10-16T14:44:37.065] [debug] [promoting-bedbug] QMP: {"timestamp": {"seconds": 1697420677, "microseconds": 65787}, "event": "NIC_RX_FILTER_CHANGED", "data": {"path": "/machine/unattached/device[6]/virtio-backend"}}

upngo avatar Oct 16 '23 02:10 upngo