Latest GRUB update breaks booting
This is a different bug compared to what is described in https://github.com/oracle/oracle-linux/issues/147
When the latest updates are applied and the server is then rebooted, GRUB will not start and appears to be stuck in a busy loop displaying the following message: "error: ../../grub-core/commands/efi/tpm.c:150:unknown TPM error"
Secure Boot is disabled and there were no previous problems.
Steps to reproduce:
Download and install OL 9.4 x86_64; the first boot is OK. Apply updates. Reboot, and GRUB will then fail to load with the above error message.
As a cross check a fresh install was done and grub updates were excluded with
exclude=grub*
in the /etc/dnf/dnf.conf file.
The non-grub updates were installed and the server rebooted OK.
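For anyone who wants to repeat this cross-check, the exclude can be added with a small script. This is only a minimal sketch: the function name add_grub_exclude is made up for illustration, and it assumes a dnf.conf that contains only a [main] section, so that appending a line at the end is safe. The path is passed in as an argument so it can be tried on a copy first.

```shell
# Hypothetical helper: append "exclude=grub*" to a dnf config file
# unless an exclude= line mentioning grub* is already present.
# Assumption: the file has only a [main] section, so appending is safe.
add_grub_exclude() {
    conf=$1
    if grep -q '^exclude=.*grub\*' "$conf"; then
        return 0    # already excluded, nothing to do
    fi
    printf 'exclude=grub*\n' >> "$conf"
}
```

Usage would be something like `add_grub_exclude /etc/dnf/dnf.conf` followed by a normal `dnf upgrade`. For a one-off run, `dnf upgrade --exclude='grub*'` should have the same effect without editing the file.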
Hello! Thanks for the report. In fact, the last update issued for the linked issue has zero code changes, though it MIGHT have regenerated a grub config for you; maybe that is triggering the issue. Are you seeing any errors other than the unknown TPM error? Are you using a BTRFS filesystem and/or BTRFS snapshots?
Never mind, reproduced it. We are going to pull this update and issue a proper one shortly.
Thank you. For info, the filesystem is XFS. A minor change is the name of the LVM volume group, from "ol" to "olb", so as not to clash with the volume group name of the previous installation on the original drive when I copy files across. I wondered if this could be relevant given the questions about the filesystem, but from your last reply probably not. The installation is on a separate SATA drive and all other drives are disconnected.
@robertm98 once again thank you very much! I see that it is not related to filesystems, just broken grub config.
same issue here, is there any way to fix broken grub / grub.cfg from within UEFI interactive shell?
The only way I think this could be repaired is to do a recovery boot from the installation media. chroot to /mnt/sysroot (I think) then possibly use dnf to do a roll back or edit the config. @aburmash Would it be possible to get the details of the errors in the config and what needs to be done to make things good, please? What needs editing and then running to apply the config changes.
@robertm98 @m45733r I will provide recovery instructions from UEFI shell shortly.
@m45733r
- if you have already installed bad update, but did not reboot:
grub2-mkconfig > /boot/grub2/grub.cfg
OR
grub2-mkconfig > /boot/efi/EFI/redhat/grub.cfg
- if you can only do stuff from UEFI shell:
- identify which FS is your ESP partition. To do that, just check all displayed partitions one by one; ESP is usually FS0
FS0: Alias(s):HD0a1b:;BLK1:
PciRoot(0x0)/Pci(0x4,0x0)/Scsi(0x0,0x1)/HD(1,GPT,3AF7074E-C0BB-400D-8FC7-E9EC738AA53F,0x800,0x32000)
BLK0: Alias(s):
PciRoot(0x0)/Pci(0x4,0x0)/Scsi(0x0,0x1)
BLK2: Alias(s):
PciRoot(0x0)/Pci(0x4,0x0)/Scsi(0x0,0x1)/HD(2,GPT,14BE7023-6C02-4573-8891-9F639B9D936A,0x32800,0x400000)
BLK3: Alias(s):
PciRoot(0x0)/Pci(0x4,0x0)/Scsi(0x0,0x1)/HD(3,GPT,E700F071-90A5-40BB-8132-52AF688193B7,0x432800,0x5900800)
fs0:
ls
if you see EFI dir, you are where you need to be
cd EFI/redhat
rm grub.cfg
grubx64.efi
you will be dropped to grub cmdline
ls
It will display the list of available disks. There you need to find the disk that has the /boot dir, or identify the /boot partition.
run
ls <disk>/ to see which one it is
for example:
ls (hd0,gpt2)/
when you have found the /boot you will see something like
grub> ls (hd0,gpt2)/
./ ../ efi/ grub2/ loader/ vmlinuz-5.14.0-427.16.1.el9_4.x86_64
System.map-5.14.0-427.16.1.el9_4.x86_64 config-5.14.0-427.16.1.el9_4.x86_64
.vmlinuz-5.14.0-427.16.1.el9_4.x86_64.hmac
symvers-5.14.0-427.16.1.el9_4.x86_64.gz
initramfs-5.14.0-427.16.1.el9_4.x86_64.img
vmlinuz-5.15.0-206.153.7.el9uek.x86_64
System.map-5.15.0-206.153.7.el9uek.x86_64 config-5.15.0-206.153.7.el9uek.x86_64
.vmlinuz-5.15.0-206.153.7.el9uek.x86_64.hmac
symvers-5.15.0-206.153.7.el9uek.x86_64.gz
initramfs-5.15.0-206.153.7.el9uek.x86_64.img
initramfs-0-rescue-36703c3cdc50ff74e863e867384f6a8a.img
vmlinuz-0-rescue-36703c3cdc50ff74e863e867384f6a8a
initramfs-5.15.0-206.153.7.el9uek.x86_64kdump.img
Now you need to check the boot info for your kernel
ls (hd0,gpt2)/loader/entries/
grub> ls (hd0,gpt2)/loader/entries/
./ ../ 8c622b7d13354f7fbe5eee50d3f340bd-5.14.0-427.16.1.el9_4.x86_64.conf
8c622b7d13354f7fbe5eee50d3f340bd-5.15.0-206.153.7.el9uek.x86_64.conf
36703c3cdc50ff74e863e867384f6a8a-0-rescue.conf
cat (hd0,gpt2)/loader/entries/8c622b7d13354f7fbe5eee50d3f340bd-5.15.0-206.153.7.el9uek.x86_64.conf
You will see something like:
title Oracle Linux Server (5.15.0-206.153.7.el9uek.x86_64 with Unbreakable Enterprise Kernel) 9.4
version 5.15.0-206.153.7.el9uek.x86_64
linux /vmlinuz-5.15.0-206.153.7.el9uek.x86_64
initrd /initramfs-5.15.0-206.153.7.el9uek.x86_64.img $tuned_initrd
options root=/dev/mapper/ocivolume-root ro crashkernel=1G-4G:192M,4G-64G:256M,64G-:512M LANG=en_US.UTF-8 console=tty0 console=ttyS0,115200 rd.luks=0 rd.md=0 rd.dm=0 rd.lvm.vg=ocivolume rd.lvm.lv=ocivolume/root rd.net.timeout.dhcp=10 rd.net.timeout.carrier=5 netroot=iscsi:169.254.0.2:::1:iqn.2015-02.oracle.boot:uefi rd.iscsi.param=node.session.timeo.replacement_timeout=6000 net.ifnames=1 nvme_core.shutdown_timeout=10 ipmi_si.tryacpi=0 ipmi_si.trydmi=0 libiscsi.debug_libiscsi_eh=1 loglevel=4 crash_kexec_post_notifiers
grub_users $grub_users
grub_arg --unrestricted
grub_class ol
Now still in grub cmdline run:
linux (hd0,gpt2)/vmlinuz-5.15.0-206.153.7.el9uek.x86_64 root=/dev/mapper/ocivolume-root ro crashkernel=1G-4G:192M,4G-64G:256M,64G-:512M LANG=en_US.UTF-8 console=tty0 console=ttyS0,115200 rd.luks=0 rd.md=0 rd.dm=0 rd.lvm.vg=ocivolume rd.lvm.lv=ocivolume/root rd.net.timeout.dhcp=10 rd.net.timeout.carrier=5 netroot=iscsi:169.254.0.2:::1:iqn.2015-02.oracle.boot:uefi rd.iscsi.param=node.session.timeo.replacement_timeout=6000 net.ifnames=1 nvme_core.shutdown_timeout=10 ipmi_si.tryacpi=0 ipmi_si.trydmi=0 libiscsi.debug_libiscsi_eh=1 loglevel=4 crash_kexec_post_notifiers
initrd (hd0,gpt2)/initramfs-5.15.0-206.153.7.el9uek.x86_64.img
boot
where kernel = kernel from config
options for kernel = options from config
initrd = initrd from config
IMPORTANT: when copying and pasting, VERIFY that
the linux string is a single line. If you have newlines or returns in the buffer, the rest of the line will NOT be applied.
So once you have the full linux string copied, paste it into some file to verify that it is a single line.
Do not forget that the path is relative to the partition that holds /boot, or to the /boot partition itself.
If your /boot is on the root partition, you will need to find the disk with the root partition, and your paths will be something like
(lvm/volume-root)/boot/
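The mapping from a loader entry to the three GRUB commands can also be sketched as a small script, for example to prepare the exact lines on another machine or from rescue media before typing them at the grub prompt. This is a minimal sketch, not a tool shipped with grub2: the function name bls_to_grub and its arguments are made up for illustration, and it assumes the entry has one linux, one initrd, and one options line, as in the example above.

```shell
# Hypothetical helper: turn a loader entry (.conf file) into the
# linux / initrd / boot commands to type at the grub prompt.
#   $1 = path to the entry file
#   $2 = GRUB device that holds the kernel, e.g. '(hd0,gpt2)'
# Only the first path on the initrd line is kept, so a trailing
# variable such as $tuned_initrd is dropped.
bls_to_grub() {
    entry=$1
    dev=$2
    awk -v dev="$dev" '
        $1 == "linux"   { kernel = $2 }
        $1 == "initrd"  { initrd = $2 }
        $1 == "options" { sub(/^options[ \t]+/, ""); options = $0 }
        END {
            # The whole linux command is printed on a single line.
            printf "linux %s%s %s\n", dev, kernel, options
            printf "initrd %s%s\n", dev, initrd
            print "boot"
        }' "$entry"
}
```

It would be called as `bls_to_grub /path/to/entry.conf '(hd0,gpt2)'`. Any variables inside the options line are printed as-is and may need to be replaced by hand.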
When the system is booted, run:
grub2-mkconfig > /boot/grub2/grub.cfg
grub2-mkconfig > /boot/efi/EFI/redhat/grub.cfg
@robertm98 the problem is that on OL9 the config file for grub2 was switched to a parent config in /boot/efi/EFI/redhat/grub.cfg, which in turn loads the proper /boot/grub2/grub.cfg config.
For CERTAIN contents of /boot/efi/EFI/redhat/grub.cfg, the fix that was applied for leapp in-place upgrades, instead of correctly updating the configs (or not touching them), writes /boot/efi/EFI/redhat/grub.cfg into /boot/grub2/grub.cfg, and the system chain-loops: the config keeps loading itself.
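A rough way to spot this state on a system that has not been rebooted yet is to check whether /boot/grub2/grub.cfg looks like the small parent config rather than a generated one. The sketch below is only a heuristic under two assumptions that are not stated in this thread: that the parent config hands over with GRUB's `configfile` command, and that a config generated by grub2-mkconfig carries "### BEGIN /etc/grub.d/" section markers. The function name looks_like_stub is made up for illustration.

```shell
# Hypothetical heuristic: succeed if the given grub.cfg looks like the
# small parent (stub) config instead of a full generated config.
# Assumptions: the stub uses "configfile", and a generated config
# contains "### BEGIN /etc/grub.d/..." markers.
looks_like_stub() {
    cfg=$1
    grep -q '^[[:space:]]*configfile[[:space:]]' "$cfg" &&
        ! grep -q '^### BEGIN /etc/grub.d/' "$cfg"
}
```

For example, `looks_like_stub /boot/grub2/grub.cfg && echo "grub.cfg looks like the parent config"` would flag a system worth regenerating the config on before rebooting.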
Thanks for the instructions. Some remarks from my experience: running grubx64.efi after grub.cfg was deleted did not automatically put me into the grub cmdline; it got stuck and I needed to power-cycle the machine. ls (hd0,gpt1) only shows "Filesystems is fat" or "Filesystem is xfs", not the actual contents. However, ls (hd0,gpt2)/loader/entries would only succeed on the right disk and list its contents, and show not found on all others.
boot was successful, but after login + grub2-mkconfig + reboot it would return to grub cmdline again :/ Reading your latest comment I tried mkconfig to /boot/efi/EFI/redhat/grub.cfg and it seems to work now!
> ls (hd0,gpt1)
yeah, you need a slash at the end to display the contents:
ls (hd0,gpt1)/
> boot was successful, but after login + grub2-mkconfig + reboot it would return to grub cmdline again :/
OH! yes, that is because /boot/efi/EFI/redhat/grub.cfg was removed from UEFI shell during recovery. I've updated my post to reflect that.
Thank you.
I'm not sure if that is related to the original issue, but the only thing that is a bit weird now is that grubby shows:
[root@ol9-machine ~]# grubby --default-kernel
/boot/vmlinuz-5.15.0-207.156.6.el9uek.x86_64
[root@ol9-machine ~]# grubby --default-index
3
[root@ol9-machine ~]# grubby --info DEFAULT
index=3
kernel="/boot/vmlinuz-5.15.0-207.156.6.el9uek.x86_64"
args="ro rd.lvm.lv=ol/root rhgb quiet crashkernel=1G-64G:448M,64G-:512M $tuned_params"
root="/dev/mapper/ol-root"
initrd="/boot/initramfs-5.15.0-207.156.6.el9uek.x86_64.img $tuned_initrd"
title="Oracle Linux Server (5.15.0-207.156.6.el9uek.x86_64 with Unbreakable Enterprise Kernel) 9.4"
id="bda9a182a36740ada28baaa218d5c09d-5.15.0-207.156.6.el9uek.x86_64"
And yet, when I reboot, it automatically selects index 0 with a kernel that is no longer present in /boot. So the system is usable but wouldn't survive an automated reboot. See screenshot attached.
[root@ol9-machine ~]# uname -r
5.15.0-207.156.6.el9uek.x86_64
[root@ol9-machine ~]# dnf list installed | grep kernel
kernel.x86_64 5.14.0-427.22.1.el9_4 @ol9_baseos_latest
kernel-core.x86_64 5.14.0-427.22.1.el9_4 @ol9_baseos_latest
kernel-modules.x86_64 5.14.0-427.22.1.el9_4 @ol9_baseos_latest
kernel-modules-core.x86_64 5.14.0-427.22.1.el9_4 @ol9_baseos_latest
kernel-tools.x86_64 5.14.0-427.22.1.el9_4 @ol9_baseos_latest
kernel-tools-libs.x86_64 5.14.0-427.22.1.el9_4 @ol9_baseos_latest
kernel-uek.x86_64 5.15.0-207.156.6.el9uek @ol9_UEKR7
kernel-uek-core.x86_64 5.15.0-207.156.6.el9uek @ol9_UEKR7
kernel-uek-modules.x86_64 5.15.0-207.156.6.el9uek @ol9_UEKR7
Any help appreciated.
Can you please show the output of:
for x in $(find /boot |grep grubenv); do echo $x; cat $x; done
cat /boot/efi/EFI/redhat/grub.cfg |grep grubenv
cat /boot/grub2/grub.cfg |grep grubenv
Sure, here you go:
/boot/grub2/grubenv
# GRUB Environment Block
# WARNING: Do not edit this file by tools other than grub-editenv!!!
saved_entry=bda9a182a36740ada28baaa218d5c09d-5.15.0-207.156.6.el9uek.x86_64
boot_success=1
boot_indeterminate=0
##################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################
/boot/efi/EFI/redhat/grub.cfg
if [ -f ${config_directory}/grubenv ]; then
load_env -f ${config_directory}/grubenv
elif [ -s $prefix/grubenv ]; then
# The kernelopts variable should be defined in the grubenv file. But to ensure that menu
# without a grubenv file, define a fallback kernelopts variable if this has not been set.
# The kernelopts variable in the grubenv file can be modified using the grubby tool or by
# the kernelopts variable in the grubenv file and the fallback kernelopts variable.
/boot/grub2/grub.cfg
if [ -f ${config_directory}/grubenv ]; then
load_env -f ${config_directory}/grubenv
elif [ -s $prefix/grubenv ]; then
# The kernelopts variable should be defined in the grubenv file. But to ensure that menu
# without a grubenv file, define a fallback kernelopts variable if this has not been set.
# The kernelopts variable in the grubenv file can be modified using the grubby tool or by
# the kernelopts variable in the grubenv file and the fallback kernelopts variable.
OK, everything above looks correct. Now ls /boot/loader/entries/
It seems you have some redundant entries there.
[root@ol9-machine grub2]# ls -al /boot/loader/entries/
total 28
drwx------. 2 root root 4096 Jun 25 13:34 .
drwxr-xr-x. 3 root root 21 Oct 17 2022 ..
-rw-r--r--. 1 root root 440 May 22 13:59 495620e0609f491080cb4e769e86283d-0-rescue.conf
-rw-r--r--. 1 root root 381 May 22 13:59 495620e0609f491080cb4e769e86283d-5.14.0-284.30.1.el9_2.x86_64.conf
-rw-r--r--. 1 root root 428 May 22 13:59 495620e0609f491080cb4e769e86283d-5.15.0-200.131.27.el9uek.x86_64.conf
-rw-r--r--. 1 root root 405 May 22 13:59 bda9a182a36740ada28baaa218d5c09d-0-rescue.conf
-rw-r--r--. 1 root root 381 Jun 25 10:18 bda9a182a36740ada28baaa218d5c09d-5.14.0-427.22.1.el9_4.x86_64.conf
-rw-r--r--. 1 root root 424 Jun 25 10:19 bda9a182a36740ada28baaa218d5c09d-5.15.0-207.156.6.el9uek.x86_64.conf
Oh, here's the problem. Sorry for bothering you, but thanks for pointing me in the right direction. It looks like some script or person regenerated the machine-id a few weeks ago...
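In the listing above the entry files carry two different prefixes, and only one of them matches the saved_entry in grubenv. A check like this can be scripted. The following is a minimal sketch that only lists candidates and deletes nothing; the function name stale_bls_entries is made up for illustration, and it assumes that entry file names start with the machine-id followed by a dash.

```shell
# Hypothetical helper: list loader entries whose file name does not
# start with the given machine-id.
#   $1 = machine-id, e.g. the contents of /etc/machine-id
#   $2 = entries directory, e.g. /boot/loader/entries
stale_bls_entries() {
    machine_id=$1
    entries_dir=$2
    for f in "$entries_dir"/*.conf; do
        [ -e "$f" ] || continue          # directory has no .conf files
        name=${f##*/}
        case "$name" in
            "$machine_id"-*) ;;          # belongs to the current machine-id
            *) printf '%s\n' "$name" ;;  # left over from another machine-id
        esac
    done
}
```

On a live system it would be called as `stale_bls_entries "$(cat /etc/machine-id)" /boot/loader/entries`. Whether the listed entries are safe to remove still needs a manual look.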
For everyone tracking this issue: a grub2 update that does NOT contain the scriptlet bug and, at the same time, resolves the issue for people who had installed the broken package but did not reboot, has been published to the public repositories.
The version is 2.06-80.0.3.el9_4
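To check whether a machine already has at least this version, the installed version string can be compared against it. This is a rough sketch under stated assumptions: it relies on GNU `sort -V`, which only approximates rpm's own version comparison and ignores epochs, and the function name version_ge is made up for illustration.

```shell
# Hypothetical helper: succeed if version $1 is greater than or equal
# to version $2, using GNU sort's version ordering.
# Assumption: sort -V is close enough to rpm's comparison for strings
# of the form 2.06-80.0.3.el9_4; epochs are not handled.
version_ge() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}
```

On OL9 it could be used as `version_ge "$(rpm -q --qf '%{VERSION}-%{RELEASE}' grub2-common)" 2.06-80.0.3.el9_4 && echo "fixed grub2 installed"`.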
I'm running Oracle Linux Server 8.9 and am experiencing the same GRUB boot issue discussed here. After updating, my system gets stuck at the GRUB CLI on reboot.
Current package versions offered in my OL8 repos:
grub2-common.noarch 1:2.02-167.0.1.el8_10
grub2-pc.x86_64 1:2.02-167.0.1.el8_10
grub2-pc-modules.noarch 1:2.02-167.0.1.el8_10
grub2-tools.x86_64 1:2.02-167.0.1.el8_10
grub2-tools-efi.x86_64 1:2.02-167.0.1.el8_10
grub2-tools-extra.x86_64 1:2.02-167.0.1.el8_10
grub2-tools-minimal.x86_64 1:2.02-167.0.1.el8_10
# for x in $(find /boot |grep grubenv); do echo $x; cat $x; done
/boot/grub2/grubenv
# GRUB Environment Block
kernelopts=root=UUID=246acc24-9a5e-4f74-96c2-5a0496303213 ro crashkernel=auto LANG=en_US.UTF-8 console=tty0 console=ttyS0,115200n8 rd.luks=0 rd.lvm=0 rd.md=0 rd.dm=0 net.ifnames=1 nvme_core.shutdown_timeout=10 nvme_core.io_timeout=4294967295 ipmi_si.tryacpi=0 ipmi_si.trydmi=0 ipmi_si.trydefaults=0 libiscsi.debug_libiscsi_eh=1 loglevel=4
boot_success=0

# cat /boot/efi/EFI/redhat/grub.cfg |grep grubenv
cat: /boot/efi/EFI/redhat/grub.cfg: No such file or directory
# cat /boot/grub2/grub.cfg |grep grubenv
if [ -f ${config_directory}/grubenv ]; then
load_env -f ${config_directory}/grubenv
elif [ -s $prefix/grubenv ]; then
After update, have not performed reboot yet
# for x in $(find /boot |grep grubenv); do echo $x; cat $x; done
/boot/grub2/grubenv
# GRUB Environment Block
kernelopts=root=UUID=246acc24-9a5e-4f74-96c2-5a0496303213 ro audit=1
boot_success=0

# cat /boot/efi/EFI/redhat/grub.cfg |grep grubenv
cat: /boot/efi/EFI/redhat/grub.cfg: No such file or directory
# cat /boot/grub2/grub.cfg |grep grubenv
if [ -f ${config_directory}/grubenv ]; then
load_env -f ${config_directory}/grubenv
elif [ -s $prefix/grubenv ]; then
# The kernelopts variable should be defined in the grubenv file. But to ensure that menu
# without a grubenv file, define a fallback kernelopts variable if this has not been set.
# The kernelopts variable in the grubenv file can be modified using the grubby tool or by
# the kernelopts variable in the grubenv file and the fallback kernelopts variable.
Regenerating the grub config did not fix the issue, and after reboot the system gets stuck at the GRUB CLI
grub2-mkconfig > /boot/grub2/grub.cfg
grub2-mkconfig > /boot/efi/EFI/redhat/grub.cfg
Any guidance or updates for OL8 users would be appreciated!
It is very unlikely that it's the same issue.
When you say stuck at the grub CLI, do you mean you are dropped to the grub command line?
Please run
grub2-mkconfig > /boot/efi/EFI/redhat/grub.cfg
and attach grub.cfg here.
Also verify that you have entries in the /boot/loader/entries dir and that they are not empty.
Is your system UEFI or legacy BIOS? Run
ls /sys/firmware/efi/
to verify
Thanks for your response.
Yes, by "stuck at the grub CLI," I mean that after rebooting, the system drops to the grub> command prompt rather than showing the normal boot menu.
This is one of the systems on which I ran the update and have not rebooted yet, and it is a BIOS system.
# grub2-mkconfig > /boot/efi/EFI/redhat/grub.cfg
Generating grub configuration file ...
done
# ls -la /boot/loader/entries/
total 28
drwx------. 2 root root 4096 Jun 11 19:53 .
drwxr-xr-x. 3 root root 21 Feb 2 2024 ..
-rw-r--r-- 1 root root 333 Jun 11 19:51 ec23b0f92923f5903d19560197e45e85-4.18.0-513.5.1.el8_9.x86_64.conf
-rw-r--r-- 1 root root 344 Jun 11 19:51 ec23b0f92923f5903d19560197e45e85-4.18.0-553.22.1.el8_10.x86_64.conf
-rw-r--r-- 1 root root 344 Jun 11 19:52 ec23b0f92923f5903d19560197e45e85-4.18.0-553.56.1.el8_10.x86_64.conf
-rw-r--r-- 1 root root 356 Jun 11 19:51 ec23b0f92923f5903d19560197e45e85-5.15.0-200.131.27.el8uek.x86_64.conf
-rw-r--r-- 1 root root 356 Jun 11 19:51 ec23b0f92923f5903d19560197e45e85-5.15.0-300.163.18.el8uek.x86_64.conf
-rw-r--r-- 1 root root 384 Jun 11 19:53 ec23b0f92923f5903d19560197e45e85-5.15.0-309.180.4.el8uek.x86_64.conf
# ls -la /sys/firmware/efi
ls: cannot access '/sys/firmware/efi': No such file or directory
# [[ -d /sys/firmware/efi ]] && echo UEFI || echo BIOS
BIOS
Here is the grub.cfg
I think I know what might be the problem. Did you run grub2-install after the update? If not, you should.
I did run grub2-install; do I need to regenerate the grub config again after the grub2-install?
# grub2-install
Installing for i386-pc platform.
grub2-install: error: install device isn't specified.
# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
nvme0n1 259:0 0 16G 0 disk
└─nvme0n1p1 259:1 0 16G 0 part /
# grub2-install /dev/nvme0n1
Installing for i386-pc platform.
Installation finished. No error reported.
FYI. The system booted without any issues after grub2-install.
You should not need to regenerate the grub config after a grub update. On legacy (BIOS/non-UEFI) systems it is mandatory to rerun grub2-install after grub2 updates. This is needed so that the code in core.img matches the actual modules. NOT rerunning grub2-install might be fine for a while if the code is not changing much, but from time to time you might face issues, like you did.
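Since grub2-install needs the whole disk rather than a partition (as the "install device isn't specified" error above shows), here is a small sketch for deriving the disk name from a partition name. The function name disk_of_partition is made up for illustration, and it only handles plain partition names such as sda1, nvme0n1p1 or mmcblk0p1. The input must be a partition, and LVM or RAID devices are out of scope.

```shell
# Hypothetical helper: given a partition device, print the whole-disk
# device that grub2-install expects on a legacy BIOS system.
# Assumption: the argument is a plain partition name, not a whole
# disk and not an LVM or RAID device.
disk_of_partition() {
    part=$1
    case "$part" in
        *nvme*p[0-9]*|*mmcblk*p[0-9]*)
            # Names like nvme0n1p1: strip the trailing "p<number>".
            printf '%s\n' "${part%p[0-9]*}" ;;
        *)
            # Names like sda1: strip the trailing digits.
            printf '%s\n' "$part" | sed 's/[0-9]*$//' ;;
    esac
}
```

With the layout shown by lsblk above, `grub2-install "$(disk_of_partition /dev/nvme0n1p1)"` would expand to `grub2-install /dev/nvme0n1`. On a live system, `lsblk -no PKNAME /dev/nvme0n1p1` is probably a more reliable way to get the parent disk name.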
Oracle Linux customers, please file your issue at https://support.oracle.com
Thanks for filing an issue with Oracle Linux.
GitHub Issues is not an official support channel and we don't offer product support here. If you're not yet an Oracle Linux customer, consider signing up at https://linux.oracle.com.
Even if you're not a customer, if we can confirm that an issue is a bug we will do our best to fix it and to update this issue once it has been fixed. We don't guarantee a fix or feedback and for now, we will close this issue. If you have Oracle Linux support, please use support.oracle.com to report issues.