Unable to boot ubuntu cloud images newer than jammy
Issue Description
When booting recent Ubuntu cloud images (newer than focal) with hypervisor-fw, the following error can be observed on serial:
error: could not set EFI variable `LoaderDevicePartUUID'.
error: failed to install protocols.
This error seems to come from Grub.
Depending on the image we're using, the boot process either stops here, or the kernel starts to boot and then dies with the following error:
[ 1.516877] VFS: Cannot open root device "LABEL=cloudimg-rootfs" or unknown-block(0,0): error -6
[ 1.517975] Please append a correct "root=" boot option; here are the available partitions:
[ 1.519029] fd00 3670016 vda
[ 1.519035] driver: virtio_blk
[ 1.519897] fd01 2507759 vda1 3c7403b7-1401-4d43-aead-ac9d8e794513
[ 1.519902]
[ 1.520923] fd0d 1047552 vda13 1c6623b4-4ca8-449e-94ef-488f8290dc5e
[ 1.520927]
[ 1.523314] fd0e 4096 vda14 70ff585e-91cb-421a-bc4f-8af4661b5e89
[ 1.523319]
[ 1.524615] fd0f 108544 vda15 93cdc7a9-f6d5-49fa-8bb6-30d34f7136d6
[ 1.524620]
[ 1.525686] fd10 8192 vdb
[ 1.525692] driver: virtio_blk
[ 1.527492] List of all bdev filesystems:
[ 1.527899] ext3
[ 1.527902] ext2
[ 1.528130] ext4
[ 1.528636] squashfs
[ 1.528848] vfat
[ 1.529102] fuseblk
This happens in cloud-hypervisor and also in qemu when hypervisor-fw is used. The current workaround is to use EDK2 (CLOUDHV.fd) instead of hypervisor-fw.
Steps to reproduce
CHV
cloud-hypervisor \
--kernel ./hypervisor-fw \
--disk path=oracular-server-cloudimg-amd64.raw path=/tmp/ubuntu-cloudinit.img \
--cpus boot=4 \
--memory size=1024M \
--console off \
--serial tty
Qemu
qemu-system-x86_64 -m 2048 \
-cpu host \
-enable-kvm \
-M q35 \
-drive file=oracular-server-cloudimg-amd64.raw,if=none,id=hd0,format=raw \
-device virtio-blk-pci,drive=hd0,disable-legacy=on \
-hda /tmp/ubuntu-cloudinit.img \
-smp 4 \
-kernel hypervisor-fw \
-serial stdio
Download links
- oracular-server-cloudimg-amd64.raw
- Script to generate ubuntu-cloudinit.img
It actually affects anything newer than Jammy (not Focal).
As far as I understand, the issue is that the initramfs is not being loaded by Grub and the kernel doesn't know how to handle the root=LABEL flag without an initramfs to mount the rootfs. Modifying the disk image to set root=/dev/vda1 in the grub config solves the issue.
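To illustrate the distinction above, here is a minimal C sketch of which `root=` forms a kernel can resolve without an initramfs. It is a simplified model, not kernel code: `root_needs_initramfs` is a hypothetical helper, and it assumes that only the filesystem-level `LABEL=` and `UUID=` forms need userspace, while device names such as `/dev/vda1` and partition-table identifiers such as `PARTUUID=` are handled by the kernel itself.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the kernel or GRUB): returns true when
 * the root= value names a filesystem-level identifier, which is normally
 * resolved by userspace running from an initramfs.
 */
bool root_needs_initramfs(const char *spec)
{
    /* Assumption: only these two prefixes require userspace resolution. */
    static const char *const fs_prefixes[] = { "LABEL=", "UUID=" };

    for (size_t i = 0; i < sizeof fs_prefixes / sizeof fs_prefixes[0]; i++) {
        if (strncmp(spec, fs_prefixes[i], strlen(fs_prefixes[i])) == 0)
            return true;
    }
    /* /dev/<name>, PARTUUID=..., PARTLABEL=... and major:minor fall here. */
    return false;
}
```

With this model, `root_needs_initramfs("LABEL=cloudimg-rootfs")` is true and `root_needs_initramfs("/dev/vda1")` is false, which matches the behaviour observed above. If the assumption holds, `root=PARTUUID=3c7403b7-1401-4d43-aead-ac9d8e794513` (the value printed next to vda1 in the log) should work as well, without depending on device naming.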
Between Jammy (22.04) and Noble (24.04), the two main differences that might be related are:
- the update from grub 2.06 to grub 2.12
- the introduction of a /boot (xbootldr) partition
Thanks for the update.
I guess we have to implement some more UEFI service features in order to make Grub 2.12+ work with rust-hypervisor-firmware.
By any chance, do you already know what is missing?
Please see #333
> By any chance, do you already know what is missing?
I am certainly not a grub expert but I think this commit might be a good candidate:
commit cfbfae1aef0694b416aa199291cfef7596cdfc20
Author: Ard Biesheuvel <[email protected]>
Date: Tue May 23 17:31:45 2023 +0200
efi: Use generic EFI loader for x86_64 and i386
Switch the x86 based EFI platform builds to the generic EFI loader,
which exposes the initrd via the LoadFile2 protocol instead of the
x86-specific setup header. This will launch the Linux kernel via its EFI
stub, which performs its own initialization in the EFI boot services
context before calling ExitBootServices() and performing the bare metal
Linux boot.
Given that only Linux kernel versions v5.8 and later support this initrd
loading method, the existing x86 loader is retained as a fallback, which
will also be used for Linux kernels built without the EFI stub. In this
case, GRUB calls ExitBootServices() before entering the Linux kernel,
and all EFI related information is provided to the kernel via struct
boot_params in the setup header, as before.
Note that this means that booting EFI stub kernels older than v5.8 is
not supported even when not using an initrd at all. Also, the EFI
handover protocol, which has no basis in the UEFI specification, is not
implemented.
Signed-off-by: Ard Biesheuvel <[email protected]>
Reviewed-by: Daniel Kiper <[email protected]>
I was told that, for security reasons related to Secure Boot, the fallback mechanism mentioned here has mostly been disabled in Ubuntu.
I investigated a bit more and found that GRUB 2.12 uses EFI_BOOT_SERVICES.InstallMultipleProtocolInterfaces() to install the protocols needed to load the initramfs. This is the failing API call that triggers the message "error: failed to install protocols.":
status = b->install_multiple_protocol_interfaces (&initrd_lf2_handle,
                                                  &load_file2_guid,
                                                  &initrd_lf2,
                                                  &device_path_guid,
                                                  &initrd_lf2_device_path,
                                                  NULL);
Which raises two issues:
- install_multiple_protocol_interfaces is not implemented by this project and seems tricky to implement, as it requires Rust's unstable c_variadic feature
- lf2 (EFI_LOAD_FILE2_PROTOCOL) is not implemented either
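For reference, the semantics the firmware would need to provide can be sketched in plain C, where variadic functions are native. This is a simplified model under stated assumptions, not the firmware's or EDK2's implementation: `handle_t`, `guid_t`, `install_one`, `install_multiple`, `find_interface`, and the `ST_*` status codes are all hypothetical stand-ins. The real service also takes a pointer to the handle and creates one when needed, which is omitted here. The sketch walks (GUID, interface) pairs until a NULL GUID and rolls back everything it installed if any pair fails.

```c
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the UEFI types and status codes. */
typedef struct { unsigned char bytes[16]; } guid_t;
typedef int status_t;
enum { ST_SUCCESS = 0, ST_OUT_OF_RESOURCES = 1, ST_INVALID_PARAMETER = 2 };

#define MAX_PROTOCOLS 8

/* A handle modelled as a small fixed-size table of (GUID, interface). */
typedef struct {
    guid_t *guid[MAX_PROTOCOLS];
    void   *iface[MAX_PROTOCOLS];
    size_t  count;
} handle_t;

/* Install a single protocol; reject duplicates on the same handle. */
static status_t install_one(handle_t *h, guid_t *g, void *iface)
{
    for (size_t i = 0; i < h->count; i++)
        if (memcmp(h->guid[i], g, sizeof *g) == 0)
            return ST_INVALID_PARAMETER;
    if (h->count == MAX_PROTOCOLS)
        return ST_OUT_OF_RESOURCES;
    h->guid[h->count]  = g;
    h->iface[h->count] = iface;
    h->count++;
    return ST_SUCCESS;
}

/* Variadic install: pairs of (guid_t *, void *), terminated by a NULL GUID. */
status_t install_multiple(handle_t *h, ...)
{
    size_t   start = h->count;   /* remember state for rollback */
    status_t st = ST_SUCCESS;
    va_list  ap;

    va_start(ap, h);
    for (;;) {
        guid_t *g = va_arg(ap, guid_t *);
        if (g == NULL)
            break;
        void *iface = va_arg(ap, void *);
        st = install_one(h, g, iface);
        if (st != ST_SUCCESS) {
            h->count = start;    /* drop everything this call installed */
            break;
        }
    }
    va_end(ap);
    return st;
}

/* Look up an installed interface by GUID; NULL when absent. */
void *find_interface(const handle_t *h, const guid_t *g)
{
    for (size_t i = 0; i < h->count; i++)
        if (memcmp(h->guid[i], g, sizeof *g) == 0)
            return h->iface[i];
    return NULL;
}
```

A call shaped like GRUB's would then look like `install_multiple(&h, &load_file2_guid, (void *)&initrd_lf2, &device_path_guid, (void *)&initrd_lf2_device_path, (guid_t *)NULL)`. Exposing the same entry point from Rust is where the variadic-argument support mentioned above comes in.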
I'm trying to make the UEFI runtime services relocatable (https://github.com/retrage/rust-hypervisor-firmware/tree/relocatable-efi-runtime) so that a UEFI variable service is still available after ExitBootServices() is called. This should fix the "rootfs not found" issue.
> - lf2 (EFI_LOAD_FILE2_PROTOCOL) is not implemented either
This is not needed: the protocol is provided by GRUB. You just need to provide the standard UEFI protocol handler, and it will work fine.
@retrage How are things looking on this? Can I do anything to help?
Any updates on this?
Hello, I am getting the following error when booting the noble cloud image:
error: could not set EFI variable `LoaderDevicePartUUID'.
error: failed to install protocols.
Press any key to continue...
The equivalent jammy image boots fine. I also booted the same noble image with Linux KVM's own VM and it booted fine too.