Bug 13096

Summary: CU174 - Downloading a backup file from IPFire above a certain size causes an OOM killer event.
Product: IPFire
Reporter: Adolf Belka <adolf.belka>
Component: ---
Assignee: Stefan Schantl <stefan.schantl>
Status: CLOSED FIXED
QA Contact:
Severity: Major Usability
Priority: Will affect most users
CC: jon.murphy, michael.tremer, peter.mueller, stefan.schantl
Version: 2
Hardware: unspecified
OS: Unspecified
Attachments: kernel error messages when trying to download a large backup file

Description Adolf Belka 2023-05-08 17:37:05 UTC
Created attachment 1156 [details]
kernel error messages when trying to download a large backup file

This was flagged up by a couple of people on the forum.

No bug had been raised after two weeks, so I am raising it. I have confirmed the effect.

On CU173 all of my backup files could be downloaded without problems, irrespective of size.

On CU174, trying to download a backup file of 99MB or greater caused an OOM killer event that killed the backup.cgi process once 100% of memory and a large amount of swap were in use.

A 61MB file downloaded without problems.

I don't know where between 61MB and 99MB the actual boundary lies, as I had no backups in that size range.

When the problem occurs, the download dialog asking whether to open the file or save it to a specific location never appears.
Comment 1 Jon 2023-05-09 02:40:48 UTC
I see the same issue.  

I'd be happy to help test.
Comment 2 Michael Tremer 2023-05-09 07:30:50 UTC
So this can be easily fixed as the problem is here:

> https://git.ipfire.org/?p=ipfire-2.x.git;a=blob;f=html/cgi-bin/backup.cgi;hb=78218433ad12bd4e34e50fac8f72668eac988eb2#l357

In this function, we load the entire backup file into memory and we will then try to send it to the client. I suppose some changes in the allocator will make Perl panic or something. Generally loading 100 MiB of data should not be a problem at all...

However, what we can do is the following:

> open(FILE, "<", $file) or die "Could not open $file: $!";
> binmode(FILE);
> binmode(STDOUT);
> while (<FILE>) {
>   print;
> }
> close(FILE);

That will open the file, read a chunk, send it to the client, then read the next chunk, and so on...
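For illustration, a fuller sketch of the chunked-streaming approach described above (this is not the actual backup.cgi code; the function name and the 64 KiB chunk size are assumptions):

```perl
#!/usr/bin/perl
# Hedged sketch: stream a file to the client in fixed-size chunks so that
# memory use stays constant regardless of how large the backup file is.
use strict;
use warnings;

sub stream_file {
    my ($path) = @_;

    # Three-argument open with a lexical filehandle, checked for errors.
    open(my $fh, "<", $path) or die "Could not open $path: $!";
    binmode($fh);
    binmode(STDOUT);

    # read() returns the number of bytes read, 0 at EOF and undef on error,
    # so the loop terminates cleanly once the file is exhausted.
    # Chunk size of 64 KiB is an arbitrary choice for this sketch.
    my $buffer;
    while (read($fh, $buffer, 64 * 1024)) {
        print $buffer;
    }

    close($fh);
}
```

Reading with read() rather than the line-oriented <FILE> operator also avoids unpredictable chunk sizes on binary data that happens to contain few newline bytes.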

I forgot who volunteered to work on this in yesterday's call, so I wasn't sure whether I should just go ahead :)
Comment 3 Adolf Belka 2023-05-09 11:57:45 UTC
(In reply to Michael Tremer from comment #2)
> I forgot who volunteered to work on this in yesterdays, call. So I wasn't
> sure whether I should just go ahead :)

I don't think anyone specific was named. It was just said that someone should fix it for CU175.
Comment 4 Jon 2023-05-09 16:16:24 UTC
I suggest volunteering Michael since the:

> while (<FILE>) {
>   print;
> }
> close(FILE);

... loop makes no sense to me!
Comment 5 Michael Tremer 2023-05-10 09:22:27 UTC
(In reply to Jon from comment #4)
> ... loop makes no sense to me!

Oh yeah, that is Perl for you :) Most of it makes very little sense.
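For anyone equally puzzled by the idiom: `while (<FH>)` reads into the implicit variable `$_`, and a bare `print` with no arguments prints `$_`. A small self-contained illustration (the temporary file path is made up for the demo):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The terse loop
#   while (<FILE>) { print; }
# is shorthand for the explicit form
#   while (defined($_ = readline(FILE))) { print $_; }
# Each read lands in the implicit variable $_, which a bare print emits.

my $path = "/tmp/perl_idiom_demo_$$";
open(my $out, ">", $path) or die "open: $!";
print $out "line one\nline two\n";
close($out);

open(my $fh, "<", $path) or die "open: $!";
while (<$fh>) {    # each line is read into $_
    print;         # prints $_ unchanged
}
close($fh);
unlink($path);
```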

@Stefan: Would you like to take this on, please?
Comment 7 Peter Müller 2023-05-11 20:33:25 UTC

Thank you very much indeed, Stefan! :-)
Comment 8 Adolf Belka 2023-05-15 08:34:10 UTC
I have tested this out on my vm testbed using

Core-Update 175 Development Build: next/ccd793b3 

and can confirm that large files which failed with the OOM in CU174 are again downloading successfully.
Comment 9 Adolf Belka 2023-05-20 13:34:58 UTC
Core Update 175 Testing has been released.

Comment 10 Adolf Belka 2023-05-20 13:36:30 UTC
Tested downloading, on my vm testbed with CU175, a large backup file that previously triggered the OOM killer response with CU174.

The fix successfully downloaded a 210MB file.
Comment 11 Adolf Belka 2023-06-12 19:19:45 UTC
CU175 released