Initial reboot succeeds, leaves console hung. Successive reboots fail and hang system in unusable state.
This is an awesome idea and is likely to be a very desirable feature, especially for platform engineers, DevOps, and IT admins. Thank you for publishing.
Environment: Ubuntu 24 Server (Noble), minimal install, on physical Intel-based hardware.
Ran the following to install (no issues reported):
sudo apt install --no-install-recommends cryptsetup-initramfs kexec-tools ruby strace systemd
sudo gem install crypt_reboot
First pass:
sudo cryptreboot
[sudo] password for name:
Extracting initramfs... To speed things up, future versions will employ cache.
Please unlock disk pvpart-crypt:
Broadcast message from [email protected] on pts/1 (Thu 2024-12-19 18:40:50 EST):
The system will kexec now!
The system reboots as expected, but the console (a directly attached KVM) shows nothing. The system is still accessible over SSH, and uptime confirms that it was rebooted.
Checking dmesg over SSH then shows a page fault:
[Thu Dec 19 18:40:59 2024] BUG: unable to handle page fault for address: ffffa07500900000
[Thu Dec 19 18:40:59 2024] #PF: supervisor read access in kernel mode
[Thu Dec 19 18:40:59 2024] #PF: error_code(0x0000) - not-present page
[Thu Dec 19 18:40:59 2024] PGD 100000067 P4D 100000067 PUD 10029f067 PMD 10a650067 PTE 0
[Thu Dec 19 18:40:59 2024] Oops: 0000 [#1] PREEMPT SMP PTI
[Thu Dec 19 18:40:59 2024] CPU: 7 PID: 602 Comm: (udev-worker) Not tainted 6.8.0-49-generic #49-Ubuntu
[Thu Dec 19 18:40:59 2024] RIP: 0010:ioread32+0x3a/0x80
[Thu Dec 19 18:40:59 2024] Code: 76 0e 89 fa ed 31 d2 31 f6 31 ff c3 cc cc cc cc 8b 05 da e2 e4 01 85 c0 75 1d b8 ff ff ff ff 31 d2 31 f6 31 ff c3 cc cc cc cc <8b> 07 31 d2 31 f6 31 ff c3 cc cc cc cc 55 83 e8 01 48 89 fe 48 c7
[Thu Dec 19 18:40:59 2024] RSP: 0018:ffffa07501e63178 EFLAGS: 00010292
[Thu Dec 19 18:40:59 2024] RAX: ffffffffc14e6150 RBX: ffff88e5c5fb4600 RCX: ffff88e5c5fb46b0
[Thu Dec 19 18:40:59 2024] RDX: 0000000000000000 RSI: ffffa07500900000 RDI: ffffa07500900000
[Thu Dec 19 18:40:59 2024] RBP: ffffa07501e63180 R08: 0000000000000000 R09: 0000000000000000
[Thu Dec 19 18:40:59 2024] R10: 0000000000000000 R11: 0000000000000000 R12: ffffa07501e631c4
[Thu Dec 19 18:40:59 2024] R13: ffffa07501e632c0 R14: ffff88e5c5fb4608 R15: ffffa07501e631d0
[Thu Dec 19 18:40:59 2024] FS: 00007bd5006628c0(0000) GS:ffff88ecfe380000(0000) knlGS:0000000000000000
[Thu Dec 19 18:40:59 2024] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Thu Dec 19 18:40:59 2024] CR2: ffffa07500900000 CR3: 0000000103fc6004 CR4: 00000000003706f0
[Thu Dec 19 18:40:59 2024] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[Thu Dec 19 18:40:59 2024] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[Thu Dec 19 18:40:59 2024] Call Trace:
[Thu Dec 19 18:40:59 2024] <TASK>
[Thu Dec 19 18:40:59 2024] ? show_regs+0x6d/0x80
[Thu Dec 19 18:40:59 2024] ? __die+0x24/0x80
[Thu Dec 19 18:40:59 2024] ? page_fault_oops+0x99/0x1b0
[Thu Dec 19 18:40:59 2024] ? kernelmode_fixup_or_oops.isra.0+0x69/0x90
[Thu Dec 19 18:40:59 2024] ? __bad_area_nosemaphore+0x19d/0x2c0
[Thu Dec 19 18:40:59 2024] ? bad_area_nosemaphore+0x16/0x30
[Thu Dec 19 18:40:59 2024] ? do_kern_addr_fault+0x7b/0xa0
[Thu Dec 19 18:40:59 2024] ? exc_page_fault+0x1a4/0x1b0
[Thu Dec 19 18:40:59 2024] ? asm_exc_page_fault+0x27/0x30
[Thu Dec 19 18:40:59 2024] ? __pfx_nv50_instobj_rd32+0x10/0x10 [nouveau]
[Thu Dec 19 18:40:59 2024] ? ioread32+0x3a/0x80
[Thu Dec 19 18:40:59 2024] ? nv50_instobj_rd32+0x15/0x20 [nouveau]
[Thu Dec 19 18:40:59 2024] gp102_acr_wpr_patch+0xc3/0x1f0 [nouveau]
[Thu Dec 19 18:40:59 2024] nvkm_acr_oneinit+0x41f/0x6c0 [nouveau]
[Thu Dec 19 18:40:59 2024] nvkm_subdev_oneinit_+0x53/0x130 [nouveau]
[Thu Dec 19 18:40:59 2024] nvkm_subdev_init_+0x40/0x150 [nouveau]
[Thu Dec 19 18:40:59 2024] nvkm_subdev_init+0x50/0x70 [nouveau]
[Thu Dec 19 18:40:59 2024] nvkm_device_init+0x17c/0x310 [nouveau]
[Thu Dec 19 18:40:59 2024] nvkm_udevice_init+0x50/0x60 [nouveau]
[Thu Dec 19 18:40:59 2024] nvkm_object_init+0x3f/0x1e0 [nouveau]
[Thu Dec 19 18:40:59 2024] nvkm_ioctl_new+0x192/0x2e0 [nouveau]
[Thu Dec 19 18:40:59 2024] ? __pfx_nvkm_client_child_new+0x10/0x10 [nouveau]
[Thu Dec 19 18:40:59 2024] ? __pfx_nvkm_udevice_new+0x10/0x10 [nouveau]
[Thu Dec 19 18:40:59 2024] nvkm_ioctl+0x132/0x2b0 [nouveau]
[Thu Dec 19 18:40:59 2024] nvkm_client_ioctl+0xe/0x20 [nouveau]
[Thu Dec 19 18:40:59 2024] nvif_object_ctor+0x10a/0x1a0 [nouveau]
[Thu Dec 19 18:40:59 2024] nvif_device_ctor+0x22/0x90 [nouveau]
[Thu Dec 19 18:40:59 2024] nouveau_cli_init+0x163/0x650 [nouveau]
[Thu Dec 19 18:40:59 2024] ? nouveau_drm_device_init+0x5e/0x370 [nouveau]
[Thu Dec 19 18:40:59 2024] nouveau_drm_device_init+0xba/0x370 [nouveau]
[Thu Dec 19 18:40:59 2024] nouveau_drm_probe+0x137/0x280 [nouveau]
[Thu Dec 19 18:40:59 2024] local_pci_probe+0x44/0xb0
[Thu Dec 19 18:40:59 2024] pci_call_probe+0x55/0x1a0
[Thu Dec 19 18:40:59 2024] pci_device_probe+0x84/0x120
[Thu Dec 19 18:40:59 2024] really_probe+0x1c4/0x410
[Thu Dec 19 18:40:59 2024] __driver_probe_device+0x8c/0x180
[Thu Dec 19 18:40:59 2024] driver_probe_device+0x24/0xd0
[Thu Dec 19 18:40:59 2024] __driver_attach+0x10b/0x210
[Thu Dec 19 18:40:59 2024] ? __pfx___driver_attach+0x10/0x10
[Thu Dec 19 18:40:59 2024] bus_for_each_dev+0x8a/0xf0
[Thu Dec 19 18:40:59 2024] driver_attach+0x1e/0x30
[Thu Dec 19 18:40:59 2024] bus_add_driver+0x14e/0x290
[Thu Dec 19 18:40:59 2024] driver_register+0x5e/0x130
[Thu Dec 19 18:40:59 2024] ? __pfx_nouveau_drm_init+0x10/0x10 [nouveau]
[Thu Dec 19 18:40:59 2024] __pci_register_driver+0x5e/0x70
[Thu Dec 19 18:40:59 2024] nouveau_drm_init+0x177/0xff0 [nouveau]
[Thu Dec 19 18:40:59 2024] do_one_initcall+0x5b/0x340
[Thu Dec 19 18:40:59 2024] do_init_module+0x97/0x290
[Thu Dec 19 18:40:59 2024] load_module+0xba1/0xcf0
[Thu Dec 19 18:40:59 2024] init_module_from_file+0x96/0x100
[Thu Dec 19 18:40:59 2024] ? init_module_from_file+0x96/0x100
[Thu Dec 19 18:40:59 2024] idempotent_init_module+0x11c/0x2b0
[Thu Dec 19 18:40:59 2024] __x64_sys_finit_module+0x64/0xd0
[Thu Dec 19 18:40:59 2024] x64_sys_call+0x1d6e/0x25c0
[Thu Dec 19 18:40:59 2024] do_syscall_64+0x7f/0x180
[Thu Dec 19 18:40:59 2024] ? vfs_read+0x2c7/0x390
[Thu Dec 19 18:40:59 2024] ? vfs_read+0x2c7/0x390
[Thu Dec 19 18:40:59 2024] ? rseq_get_rseq_cs+0x22/0x280
[Thu Dec 19 18:40:59 2024] ? rseq_ip_fixup+0x90/0x1f0
[Thu Dec 19 18:40:59 2024] ? syscall_exit_to_user_mode+0x86/0x260
[Thu Dec 19 18:40:59 2024] ? do_syscall_64+0x8c/0x180
[Thu Dec 19 18:40:59 2024] ? vfs_read+0x2c7/0x390
[Thu Dec 19 18:40:59 2024] ? vfs_read+0x2c7/0x390
[Thu Dec 19 18:40:59 2024] ? rseq_get_rseq_cs+0x22/0x280
[Thu Dec 19 18:40:59 2024] ? rseq_ip_fixup+0x90/0x1f0
[Thu Dec 19 18:40:59 2024] ? syscall_exit_to_user_mode+0x86/0x260
[Thu Dec 19 18:40:59 2024] ? do_syscall_64+0x8c/0x180
[Thu Dec 19 18:40:59 2024] ? irqentry_exit+0x43/0x50
[Thu Dec 19 18:40:59 2024] entry_SYSCALL_64_after_hwframe+0x78/0x80
[Thu Dec 19 18:40:59 2024] RIP: 0033:0x7bd50052725d
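To see at a glance which loadable modules a trace like the one above passes through, the bracketed module tags can be counted. This is only a rough sketch: `trace_modules` is a hypothetical helper name, and it assumes dmesg-style text on stdin in which module tags look like `[nouveau]`.

```shell
#!/bin/sh
# Sketch: count the kernel modules tagged in an oops call trace.
# trace_modules is a hypothetical helper, not part of any tool.
# It reads dmesg-style text on stdin. Timestamps such as
# "[Thu Dec 19 18:40:59 2024]" and the "[#1]" oops counter are not
# counted, because they contain characters outside [A-Za-z0-9_].
trace_modules() {
    grep -oE '\[[A-Za-z0-9_]+\]' | tr -d '[]' | sort | uniq -c | sort -rn
}
```

For example, `sudo dmesg | trace_modules` prints one line per module with the number of lines that mention it. Frames without a tag belong to the core kernel and are not listed.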
The system otherwise seems functional, apart from the hung/unresponsive console. After a second reboot attempt, the system becomes entirely inaccessible (both console and SSH).
A hard reboot (power button) returns the system to normal.
I was hoping to use this for automated reboots as part of an Ansible playbook, in which I already have an install task defined and an async reboot/poll task, which appears to work only once.
This occurs whether I type the reboot manually over SSH or have Ansible do it.
Hi @haroules, thank you for the detailed bug report and the kind words, I really appreciate that!
I'm busy with other things right now, so I don't have much time to debug this issue at the moment. However, what you described suggests a kernel-level issue. Cryptreboot doesn't do any fancy kernel-level stuff. It just appends a cpio archive with 2 or 3 files to the initramfs, which is a standard way of extending it (most initramfs images are composed of at least 2 cpio archives).
Therefore I suspect a general kexec failure. That is, I expect that performing a raw kexec (without patching the initramfs):
kexec -al /boot/vmlinuz --initrd /boot/initrd.img --reuse-cmdline
will lead to the same issues with the exception that the system will require you to provide the passphrase during boot.
If that's the case, the task is to find the kexec/kernel bug report and check what can be done, or if there is no report - create one. As I said, I can't do it right now, but I would be grateful for any info on this.
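The raw kexec test suggested above could be wrapped as follows. This is a sketch under assumptions, not a tested procedure: `raw_kexec_test` and the `DRY_RUN` variable are hypothetical names, and the `systemctl kexec` step is an addition that does not appear in the thread. It is included because `kexec -l` only loads the kernel, so a separate step is needed to actually boot into it.

```shell
#!/bin/sh
# Sketch: reproduce the reboot with plain kexec, bypassing cryptreboot.
# raw_kexec_test and DRY_RUN are hypothetical names for illustration.
# With DRY_RUN=1 (the default) the commands are only printed.
# With DRY_RUN=0 they are executed, which requires root and REBOOTS
# the machine; the disk passphrase is then requested during boot.
raw_kexec_test() {
    kernel=${1:-/boot/vmlinuz}
    initrd=${2:-/boot/initrd.img}
    run_mode=${DRY_RUN:-1}

    # Load command exactly as quoted in the thread.
    set -- kexec -al "$kernel" --initrd "$initrd" --reuse-cmdline

    if [ "$run_mode" = 1 ]; then
        echo "$*"
        echo "systemctl kexec"
    else
        # Load the kernel, then shut down services and jump into it.
        "$@" && systemctl kexec
    fi
}
```

Calling `raw_kexec_test` with no arguments prints the two commands for review. If the console hang and the nouveau page fault also appear after running them for real, the problem is likely independent of cryptreboot.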
Hi Pawel,
My Ubuntu host doesn't have the kexec tools loaded. I'll get to that, then try what you asked and report back.
I did some googling on the errors and noticed some concerns about SGX being enabled. I'll disable it to see if that has any effect.
Thanks for getting back to me.
-Tony