Bug 12892

Summary: CU169 Testing - Kernel panic then reboots in a loop
Product: IPFire
Reporter: Adolf Belka <adolf.belka>
Component: ---
Assignee: Michael Tremer <michael.tremer>
Status: CLOSED FIXED
QA Contact:
Severity: Crash
Priority: - Unknown -
CC: Christof.Misch, michael.tremer, peter.mueller, sven.friedrich
Version: 2
Hardware: unspecified
OS: Unspecified
Attachments: Picture after kernel panic has occurred
attachment-1105335-0.html

Description Adolf Belka 2022-06-29 07:00:46 UTC
Created attachment 1061
Picture after kernel panic has occurred

CU169 installed without any errors, but on reboot the kernel panic shown in the attached picture appears and the system reboots after 10 seconds. Each subsequent reboot ends in the same panic.

The Testing release was installed on a VirtualBox VM testbed with red, green, blue and orange networks and two virtual Arch Linux machines on each of the internal networks. Red gets its IP from my production IPFire machine.
Comment 1 Adolf Belka 2022-06-29 07:03:49 UTC
A couple of people on the Community Forum have also reported the same issue, but on Xen Server:

https://community.ipfire.org/t/dev-169-on-xen-server-not-booting-up/8154
Comment 2 Adolf Belka 2022-06-29 14:39:55 UTC
I have noticed that with CU169 there is no initramfs in the boot directory, while in CU168 there was. Could this be playing any part in this problem?

CU168
ls -hal /boot/
total 39M
drwxr-xr-x  5 root root 4.0K Apr 26 15:42 .
drwxr-xr-x 21 root root 4.0K May 16 20:46 ..
-rw-r--r--  1 root root 179K Apr 26 13:00 config-5.15.35-ipfire
drwxr-xr-x  3 root root  16K Jan  1  1970 efi
drwxr-xr-x  6 root root 4.0K Jun 14 13:19 grub
-rw-------  1 root root  27M Jun 14 13:19 initramfs-5.15.35-ipfire.img
drwx------  2 root root  16K May 16 20:46 lost+found
-rw-r--r--  1 root root 4.7M Apr 26 13:00 System.map-5.15.35-ipfire
-rw-r--r--  1 root root 6.7M Apr 26 13:00 vmlinuz-5.15.35-ipfire


CU169 after upgrade from CU168 but before reboot
ls -hal /boot/
total 12M
drwxr-xr-x  5 root root 4.0K Jun 27 23:50 .
drwxr-xr-x 21 root root 4.0K May 16 20:46 ..
-rw-r--r--  1 root root 179K Jun 27 21:01 config-5.15.49-ipfire
drwxr-xr-x  3 root root  16K Jan  1  1970 efi
drwxr-xr-x  6 root root 4.0K Jun 29 16:22 grub
drwx------  2 root root  16K May 16 20:46 lost+found
-rw-r--r--  1 root root 4.7M Jun 27 21:01 System.map-5.15.49-ipfire
-rw-r--r--  1 root root 6.8M Jun 27 21:01 vmlinuz-5.15.49-ipfire


Looking through the CU169 log, there is no dracut command creating the initramfs image file, but there is a remove line which deletes the initramfs image file that was present from CU168.
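
The comparison above can be sketched as a small shell check that reports any kernel in the boot directory without a matching initramfs image. It is only an illustration: the helper name check_initramfs is made up here, and it assumes the vmlinuz-<version> / initramfs-<version>.img naming visible in the two listings.

```shell
#!/bin/sh
# List kernels in a boot directory that have no matching initramfs image.
# check_initramfs is a hypothetical helper; the file naming is assumed to
# follow the vmlinuz-<version> / initramfs-<version>.img pattern.
check_initramfs() {
    boot_dir="${1:-/boot}"
    missing=0
    for kernel in "$boot_dir"/vmlinuz-*; do
        # Skip the unexpanded pattern when no kernel is present at all.
        [ -e "$kernel" ] || continue
        # Strip everything up to and including "vmlinuz-" to get the version.
        version="${kernel##*/vmlinuz-}"
        if [ ! -e "$boot_dir/initramfs-$version.img" ]; then
            echo "missing: initramfs-$version.img"
            missing=1
        fi
    done
    return "$missing"
}
```

Called as check_initramfs /boot after an upgrade and before the reboot, it prints one line per missing image and returns a non-zero status if any image is missing.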
Comment 3 Adolf Belka 2022-06-29 15:15:42 UTC
I downloaded CU169 master/f5117ab5 which was the same as the version upgraded by pakfire.

I installed this onto the same vm and this installed and rebooted without any problems.

After rebooting I checked the /boot/ directory and initramfs-5.15.49-ipfire.img was present in the directory.

So it looks like this image file is being created with a fresh install of CU169 Testing but not with a pakfire upgrade on virtual systems.
Comment 4 Michael Tremer 2022-06-29 16:44:27 UTC
(In reply to Adolf Belka from comment #0)
> The Testing release was on a Virtualbox vm testbed system with red, green,
> blue and orange networks with two virtual arch linux machines on each of the
> internal networks. Red connects to my production IPFire machine for its IP.

A fresh installation boots just fine.

However, I do not know why we don't ship or generate the initramfs.

Could you try running this after the update before rebooting:

> dracut --regenerate-all --force
Comment 5 Adolf Belka 2022-06-29 18:03:53 UTC
(In reply to Michael Tremer from comment #4)
> 
> Could you try running this after the update before rebooting:
> 
> > dracut --regenerate-all --force

Running that command created the initramfs file in the boot directory, but the reboot still ended in the kernel panic, followed by another reboot after 10 seconds and the same kernel panic again, in a loop.

I was really hopeful that it would solve the problem, as a full fresh install works fine and the missing initramfs was a difference I had found, but it is obviously not the cause of this issue.

Let me know about any other commands I should try out or data that needs to be checked. I will create several clones of my CU168 vm to test things out on.
Comment 6 Michael Tremer 2022-06-29 18:04:50 UTC
Is it possible that GRUB doesn’t try to load it any more because it wasn’t present when grub.cfg was generated?
Comment 7 Adolf Belka 2022-06-29 18:09:48 UTC
Ah, good point. So I should try regenerating grub.cfg again after running the dracut command.
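
Whether grub.cfg loads an initramfs at all can be checked before rebooting. A rough sketch, assuming the default path /boot/grub/grub.cfg and that menu entries load the ramdisk with a line starting with initrd (grub_loads_initrd is a hypothetical helper name):

```shell
#!/bin/sh
# Succeed if the given grub.cfg contains at least one "initrd" line.
# grub_loads_initrd is a hypothetical helper; the default path is an
# assumption and may differ on other setups.
grub_loads_initrd() {
    cfg="${1:-/boot/grub/grub.cfg}"
    # Menu entries are indented, so allow leading whitespace.
    grep -Eq '^[[:space:]]*initrd' "$cfg"
}
```

If grub_loads_initrd fails while an initramfs image exists in /boot, that would suggest grub.cfg was generated before the image was present.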
Comment 9 Michael Tremer 2022-06-29 18:36:00 UTC
Copying Peter for reference.
Comment 10 Adolf Belka 2022-06-29 18:40:25 UTC
Success.

Running grub-mkconfig after the dracut command and then rebooting resulted in a normal boot startup.
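
The working order of the two steps, as a hedged sketch. The dracut and grub-mkconfig invocations are the ones used in this report; the wrapper functions and the DRY_RUN variable are assumptions added for illustration, so the commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Print a command instead of running it unless DRY_RUN=0 is set.
# run_or_print and DRY_RUN are illustrative names, not part of IPFire.
run_or_print() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Order matters: the initramfs is regenerated first so that it already
# exists when grub.cfg is generated.
regenerate_boot_files() {
    run_or_print dracut --regenerate-all --force &&
        run_or_print grub-mkconfig -o /boot/grub/grub.cfg
}
```

Running regenerate_boot_files as is only shows the two commands in order; with DRY_RUN=0 it would execute them as root on the affected system.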
Comment 11 Michael Tremer 2022-06-29 18:42:38 UTC
Yay! Thanks for the feedback.
Comment 12 Adolf Belka 2022-06-29 18:44:30 UTC
The bit I don't understand is why are the complaints about boot failures only from people running virtual machine setups?
Comment 13 Michael Tremer 2022-06-29 18:46:09 UTC
(In reply to Adolf Belka from comment #12)
> The bit I don't understand is why are the complaints about boot failures
> only from people running virtual machine setups?

The generic driver for AHCI devices (which probably covers 99% of the SSDs and HDDs out there) is compiled into the kernel. The drivers for the Xen frontend and for VirtIO are not compiled into the kernel; they are modules, which have to be loaded before the kernel can talk to the disks. Because there was no initramfs, there were no modules available at boot and therefore no storage.
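
The built-in versus module distinction can be read from the kernel config file in /boot. A minimal sketch, assuming the usual CONFIG_NAME=y (built-in) and CONFIG_NAME=m (module) notation; driver_linkage is a hypothetical helper, and the symbol names CONFIG_SATA_AHCI, CONFIG_XEN_BLKDEV_FRONTEND and CONFIG_VIRTIO_BLK are the upstream kernel options for the three drivers mentioned:

```shell
#!/bin/sh
# Report whether a kernel config symbol is built-in, a module, or not set.
# driver_linkage is a hypothetical helper name.
driver_linkage() {
    config_file="$1"
    symbol="$2"
    # Lines look like CONFIG_FOO=y or CONFIG_FOO=m; disabled options appear
    # only as a comment ("# CONFIG_FOO is not set") and do not match.
    case "$(grep -E "^${symbol}=" "$config_file" | cut -d= -f2)" in
        y) echo "built-in" ;;
        m) echo "module" ;;
        *) echo "not set" ;;
    esac
}
```

Going by the explanation above, driver_linkage /boot/config-5.15.49-ipfire CONFIG_XEN_BLKDEV_FRONTEND would be expected to report a module, which is why that driver has to come from the initramfs.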
Comment 14 Christof Misch 2022-06-29 19:37:08 UTC
Hi,
I have had the same problem (running Xen).
As I am running the testing environment, I always check whether the generated initrd contains the Xen drivers:
   lsinitrd /boot/initramfs-* | grep xen
This should show the xen-blkfront driver.
I noticed that there was no initrd, and then found this bug report, which had already been raised.

A short fix for this (as the commands are not in this bug report) was running:
   dracut --regenerate-all --force --early-microcode --strip --verbose --xz
   grub-mkconfig -o /boot/grub/grub.cfg
Rebooting fixed it and the system is up.
I am now checking for other issues.
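
The lsinitrd check can be wrapped so that it works for any storage driver, not only Xen. This is a sketch under assumptions: initramfs_has_driver is a hypothetical helper, and it assumes that modules appear in the lsinitrd listing as <name>.ko, possibly with a compression suffix.

```shell
#!/bin/sh
# Succeed if the initramfs image lists a kernel module with the given name.
# initramfs_has_driver is a hypothetical helper; lsinitrd ships with dracut.
initramfs_has_driver() {
    image="$1"
    driver="$2"
    # Match the module file name, e.g. .../xen-blkfront.ko or .ko.xz
    lsinitrd "$image" | grep -q "/${driver}\.ko"
}
```

For example, initramfs_has_driver /boot/initramfs-5.15.49-ipfire.img xen-blkfront would succeed only if the Xen block frontend module is inside that image.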
Comment 15 Peter Müller 2022-06-29 19:45:30 UTC
(In reply to Michael Tremer from comment #9)
> Copying Peter for reference.

Ah, zut alors, my fault again. Thank you for the prompt fix!
Comment 16 Adolf Belka 2022-06-30 08:40:22 UTC
I can confirm that the fix merged into Testing build master/2fcfe2e1 works fine, and my VirtualBox VM running IPFire reboots with no problems after the upgrade to CU169.
Comment 17 Adolf Belka 2022-06-30 11:15:05 UTC
Changed status to verified as the test was done with the version of CU169 Testing accessed via Pakfire.
Comment 18 Christof Misch 2022-06-30 11:55:43 UTC
I can also confirm that the fix merged into Testing build master/2fcfe2e1 works on a Xen client. IPFire reboots with no problems after the upgrade to CU169.

Did you really include the initrd in the update package instead of calling dracut?
Comment 19 Adolf Belka 2022-06-30 12:18:56 UTC
(In reply to Christof Misch from comment #18)
> 
> Did you really include the initrd into the update package instead calling
> dracut?

What was included in the update package was the rootfile linux-initrd.
This contains the file entry boot/initramfs-KVER-ipfire.img, which is generated by the lfs file linux-initrd that creates all the required initramfs images. An image file is created for each architecture.

When the filelist for the update was put together, the linux-initrd file was missed.


The dracut command was just a way of duplicating what would have been done during the build to create the image file.
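
A guard against this kind of omission could be a check that the update's file list contains an initramfs entry. The sketch below is an assumption about how such a check might look, not how the IPFire build system works: filelist_has_initramfs is a hypothetical helper that expects a plain-text list with one path per line.

```shell
#!/bin/sh
# Succeed if a plain-text file list contains a boot/initramfs-*.img entry.
# filelist_has_initramfs is a hypothetical helper; the one-path-per-line
# list format is an assumption for illustration.
filelist_has_initramfs() {
    filelist="$1"
    grep -Eq '^boot/initramfs-.*\.img$' "$filelist"
}
```

Run against the list of files going into an update, a failing check would flag that the kernel is being shipped without its initramfs image.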
Comment 20 Christof Misch 2022-06-30 12:46:02 UTC
I saw Michael's git diff and his comment:
   "This will probably add a lot of extra size to the updater"
So you will ship 3 different initrds in the update package instead of fixing the update.sh script as you have done in previous versions.
As a short-term fix I think it's OK, but it should not be the default.
Comment 21 Adolf Belka 2022-06-30 17:28:51 UTC
I removed "Virtual Machine" from the title because the forum thread has one report of the same effect on Dell hardware, which was probably also not using standard hardware.
Comment 22 Sven F 2022-07-01 11:26:54 UTC
2FA tested again with CU169 next/8000bc0a and works fine
Comment 23 Sven F 2022-07-01 11:28:27 UTC
(In reply to Sven F from comment #22)
> 2FA tested again with CU169 next/8000bc0a and works fine

Sorry, I meant booting the system :-) but 2FA also works fine.