Bug 12548 - High CPU utilization in C153 Testing
Summary: High CPU utilization in C153 Testing
Status: ASSIGNED
Alias: None
Product: IPFire
Classification: Unclassified
Component: ---
Version: 2
Hardware: unspecified Unspecified
Importance: Will affect an average number of users Major Usability
Assignee: Stefan Schantl
QA Contact:
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-12-12 00:51 UTC by Fred Kienker
Modified: 2021-03-06 11:08 UTC
CC List: 3 users

See Also:


Attachments
Load graph (234.25 KB, image/png)
2020-12-12 00:51 UTC, Fred Kienker
Load graph for 5.0.5 (idle) (9.62 KB, image/png)
2021-02-16 10:34 UTC, Matthias Fischer
Load graph for patched 6.0.1 (idle) (10.82 KB, image/png)
2021-02-16 10:37 UTC, Matthias Fischer
6.0.1 patched with usleep(5000) (11.42 KB, image/png)
2021-02-17 13:24 UTC, Matthias Fischer

Description Fred Kienker 2020-12-12 00:51:32 UTC
Created attachment 815 [details]
Load graph

After updating to C153 Testing from C152 Stable, Suricata uses 2 to 3 times the CPU percentage it did previously. Changes show up on the CPU and the Load graphs and consistently put Suricata at the top of the htop list. Systems with plenty of CPU resources are okay but slower systems may overload.
Comment 1 Michael Tremer 2020-12-12 09:39:17 UTC
> https://forum.suricata.io/t/cpu-usage-of-version-6-0-0/706

This is known upstream, but no solution is available, yet.
Comment 2 Matthias Fischer 2020-12-12 10:14:33 UTC
FYI:
As a temporary 'workaround'(?), I pushed a 'downdate' to 'suricata 5.0.5' today.

This version is running here without any changes in cpu load. Runs like 5.0.4.

Don't know if we'll like that.

I also updated 'libhtp' (used by 'suricata') to 0.5.36.
Comment 3 Michael Tremer 2020-12-15 16:10:57 UTC
> https://redmine.openinfosecfoundation.org/issues/4096#note-26

I have posted on the upstream bugtracker.

I believe that we might have to pull the release and rebuild it all again with a downgraded suricata.
Comment 4 Michael Tremer 2021-02-15 13:25:25 UTC
A fix has been posted upstream:

> https://github.com/OISF/suricata/pull/5840/commits/17a38f1823adeb9eb059f666686e35509f3a13d2
Comment 5 Matthias Fischer 2021-02-15 13:51:19 UTC
Thanks!

Working on it. ('Devel' is running...)
Comment 6 Matthias Fischer 2021-02-16 10:34:40 UTC
Created attachment 858 [details]
Load graph for 5.0.5 (idle)
Comment 7 Matthias Fischer 2021-02-16 10:36:07 UTC
Tested with 5.0.5 - compared to a patched 6.0.1 - see attachments (idle load).

IMHO there is NO significant improvement. The CPU load is almost as high as before.

Running on Core 153 /x86_64.

Profile ID:
https://fireinfo.ipfire.org/profile/5f68a6360ffbecb6877dcac75f5b8c8030f43ce8
Comment 8 Matthias Fischer 2021-02-16 10:37:29 UTC
Created attachment 859 [details]
Load graph for patched 6.0.1 (idle)
Comment 9 Michael Tremer 2021-02-16 11:41:43 UTC
I posted your findings upstream on the same ticket and requested for it to be reopened since this fix does not actually fix the problem.

I have no idea how we can help the suricata team apart from testing and validating any proposed fixes.
Comment 10 Matthias Fischer 2021-02-16 12:50:36 UTC
Just looked around a bit.

Searching for 'usleep()' turned up a lot of pages declaring this function "obsolete, use nanosleep() instead". That is assuming usleep is the culprit in this case.

But I have no idea how to change that.
Comment 11 Michael Tremer 2021-02-16 19:11:22 UTC
It looks like our glibc is using nanosleep internally when calling usleep():

> https://git.ipfire.org/?p=thirdparty/glibc.git;a=blob;f=sysdeps/posix/usleep.c;hb=25251c0707fe34f30a27381a5fabc35435a96621
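
For illustration, a minimal sketch of how a usleep()-style call can be layered on top of nanosleep(). This is not glibc's actual source, and the names usec_to_timespec and sleep_usec are hypothetical. Unlike usleep(), this helper restarts the sleep if a signal interrupts it:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Hypothetical helper: split a microsecond count into the
 * seconds/nanoseconds pair that nanosleep() expects. */
struct timespec usec_to_timespec(unsigned long usec)
{
    struct timespec ts;

    ts.tv_sec = (time_t)(usec / 1000000UL);
    ts.tv_nsec = (long)(usec % 1000000UL) * 1000L;

    /* nanosleep() rejects tv_nsec values outside 0..999999999. */
    assert(ts.tv_nsec >= 0 && ts.tv_nsec < 1000000000L);
    return ts;
}

/* Hypothetical usleep() stand-in built on nanosleep().
 * Returns 0 on success and -1 on any error other than EINTR. */
int sleep_usec(unsigned long usec)
{
    struct timespec req = usec_to_timespec(usec);
    struct timespec rem;

    while (nanosleep(&req, &rem) == -1) {
        if (errno != EINTR)
            return -1;
        req = rem; /* interrupted: sleep for the remaining time */
    }
    return 0;
}
```

If usleep() really is just a thin wrapper like this, swapping it for nanosleep() in the caller would not be expected to change the CPU load on its own. The sleep duration is the more likely knob.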
Comment 12 Matthias Fischer 2021-02-17 13:24:59 UTC
Created attachment 860 [details]
6.0.1 patched with usleep(5000)

Based on:

https://redmine.openinfosecfoundation.org/issues/4096#note-23

(It looks like changing the usleep value to 200 already gives a big improvement, but setting it much higher to something like 5000 gives better results.), I'm now testing with "usleep (5000);".

First impressions: CPU load is significantly lower (0.6-~2%) compared to the previous build, rising to ~35% during a 100 Mbit/s download (1 GB).

But: "Need to look at what the impact is of changing this."

I cannot judge what the consequences might be, either...
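
A rough way to see why the sleep value matters: an otherwise idle loop that sleeps N microseconds per pass can wake up at most 1000000/N times per second. The sketch below is only an illustration under that assumption. It does not model Suricata's actual loop, and the function names max_wakeups_per_second and count_idle_passes are hypothetical:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Upper bound on wake-ups per second for an idle loop that sleeps
 * interval_usec microseconds per pass. Returns 0 for an interval of 0,
 * which would be a busy loop with no such bound. */
unsigned long max_wakeups_per_second(unsigned long interval_usec)
{
    return interval_usec ? 1000000UL / interval_usec : 0;
}

/* Run an otherwise empty loop for about duration_ms milliseconds,
 * sleeping interval_usec microseconds per pass, and return the number
 * of passes (one wake-up per pass). */
unsigned long count_idle_passes(unsigned long interval_usec,
                                unsigned long duration_ms)
{
    struct timespec start, now, nap;
    unsigned long passes = 0;

    /* A zero interval would turn this into a busy loop. */
    assert(interval_usec > 0);

    nap.tv_sec = (time_t)(interval_usec / 1000000UL);
    nap.tv_nsec = (long)(interval_usec % 1000000UL) * 1000L;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (;;) {
        long long elapsed_ns;

        clock_gettime(CLOCK_MONOTONIC, &now);
        elapsed_ns = (long long)(now.tv_sec - start.tv_sec) * 1000000000LL
                   + (long long)(now.tv_nsec - start.tv_nsec);
        if (elapsed_ns >= (long long)duration_ms * 1000000LL)
            break;

        nanosleep(&nap, NULL);
        passes++;
    }
    return passes;
}
```

With these bounds, usleep(200) allows up to 5000 wake-ups per second and usleep(5000) up to 200. The trade-off is latency: with the longer sleep, the loop may react up to 5 ms later to new work, which is presumably the impact that still needs to be looked at.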
Comment 13 Michael Tremer 2021-03-06 11:08:21 UTC
There is some movement to track this:

> https://redmine.openinfosecfoundation.org/issues/4379

This ticket/change is targeted for suricata 7, which seems to suggest that we are going to skip the 6 series.