Bug 12763 - "FIB table does not exist" message showing up at boot
Status: MODIFIED
Alias: None
Product: IPFire
Classification: Unclassified
Component: ---
Version: 2
Hardware: unspecified Unspecified
Importance: Will affect all users, Aesthetic Issue
Assignee: Adolf Belka
Reported: 2022-01-17 09:30 UTC by Michael Tremer
Modified: 2024-04-24 15:14 UTC
Description Michael Tremer 2022-01-17 09:30:04 UTC
For a little while now, a message that shouldn't be there has been showing up when the system boots: "FIB table does not exist"

I haven't investigated where it is coming from. It is probably not a real problem, but it is confusing and has alarmed a couple of users.

@Peter: Would you like to grab this one?
Comment 1 Adolf Belka 2022-04-25 14:11:53 UTC
I found the following issue in iproute2 github

https://github.com/shemminger/iproute2/issues/32

The issue was closed without being fixed.
Comment 2 Michael Tremer 2022-05-03 11:22:28 UTC
(In reply to Adolf Belka from comment #1)
> I found the following issue in iproute2 github
> 
> https://github.com/shemminger/iproute2/issues/32
> 
> The issue was closed without being fixed.

Hmm, that might be technically correct, but still rather unsatisfactory. I would say we simply throw away any error message. Not pretty, but I don't see what other options we have left.
Comment 3 Michael Tremer 2024-04-08 17:53:22 UTC
@adolf: Would you like to have a look at this? I would really like a more aesthetically pleasing boot process :)
Comment 4 Adolf Belka 2024-04-11 08:06:37 UTC
I have already looked at this several times during 2023 without any success.

This is basically because of a kernel change: messages that were previously suppressed when something was found not to exist are now always shown.

In iproute2, the table dump command had the same problem. It reported that something could not be dumped because it did not exist, even when its not existing was exactly what was required.

The developers created a patch to fix that message from the dump command.

This bug occurs because somewhere when we are starting up a flush of the iproute tables is carried out and the FIB table is flagged up as not existing even though it is not intended to exist.

I tried taking the patch for the dump command and applying it to the flush command "ignore_ENOENT_during_save_if_RT_TABLE_MAIN_is_being_flushed" and did a build with that patch applied but nothing changed.

Either I applied it to the wrong place, but I could not find anywhere else related to the flush of the tables, or the patch is incorrect, which could be the case as I just did a mimic apply to the flush command as was done in the dump command. I did not understand what the patch code was doing.


I also have tried editing the /etc/iproute2/rt_tables file to comment out some of the tables but the result of that was FIB table does not exist was still shown every time when booting. Probably the changes I made to the rt_tables file would have had other consequences anyway.

So during 2023 I probably tried 3 or 4 different approaches, one each time I had a new idea for stopping the message being shown, and I have been 100% unsuccessful. Every attempt resulted in the message still being shown.

I have tried searching but no one else seems to have this problem, except if they expected the FIB table to be present and it wasn't, so the fixes were always about how to ensure the FIB table was present.

Maybe the only solution is to actually create an FIB table, even though we don't use it. I don't know if that would have any unintended consequences, though, and it seems counterintuitive to create a table we don't need or use just to stop a message about the table not being present.

I am more than willing to have a go to fix this but I have run out of ideas on how to tackle this. If someone else has any suggestions for how to approach it I am willing to give it a go.
Comment 5 Adolf Belka 2024-04-11 08:30:17 UTC
Writing comment 4 made me have some further thoughts and I have found the following.

The error message occurs when a specified table is asked to be shown and that table is empty. The FIB message is not tied to one particular table; it occurs for any specified table that is empty.

The default table numbered 253 is a reserved table and is empty and therefore shows the message when trying to show that table.
You can't comment the default table out as it is a reserved value so I suspect it is always present. However in IPFire the default table is left empty.

We also add in a static table numbered 200 into the rt_tables file as part of the installation.

If no static routes are defined then this table is empty and so also gives the FIB table does not exist message.
If the static table is commented out and you ask to show the table then you get a different error message saying that the table id value for static is invalid. Trying to show table 200, even if the table is commented out in the rt_tables file still gives the FIB table does not exist message.

So creating a default table, that for instance matched the main table, might work for the default table but the static table should only be used if static routes have been defined.

Maybe we need to add the "200   static" line into rt_tables only when a static route is defined in the WUI, and remove the line when any static routes are disabled and hence removed from the table.

I will have a look at trying the above out, i.e. duplicate the main table into the default table and remove the static line completely, and see if that results in a boot without the message.
Comment 6 Adolf Belka 2024-04-11 13:41:38 UTC
Incredibly enough I have been able to stop that "FIB table does not exist" message from showing up.

I decided my thoughts in the last couple of comments were not the right approach.

I searched and looked at the various ip route commands in the IPFire code.

I noticed that some fed the output of the command into null while others did not.

I then went through all the occurrences that did not redirect the output into null and added the redirect.

This then stopped the message occurring. A little bit of checking was able to identify that the use of ip route without redirecting to null in the ipsec-interfaces code was the one causing the message to occur.

I also noticed that there are some ip rule commands which have a linkage to the ip route commands.

Making those redirect to null did not fix the "FIB table does not exist" message but it did turn out to fix the "RTNETLINK answers: no such file or directory". This turned out to be an ip rule command in ipsec-interfaces that was not redirected to null.

I will submit a patch where I will redirect to null all the ip route and ip rule commands where the output is not used as an input into a variable for use elsewhere in the code.

Then you can review my patch proposal to make sure that what I am proposing is a reasonable approach and solution.
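
The search described above, for ip route and ip rule calls that do not redirect their output, can be approximated with a small shell helper. This is only a sketch: the function name find_unredirected_ip_calls is hypothetical, the directory to scan is whatever tree of scripts is being checked, and the match is deliberately crude (it also flags calls whose output is captured into a variable, which would need to be excluded by hand).

```shell
# Hypothetical helper: list "ip route" / "ip rule" calls below a directory
# that do not mention /dev/null on the same line.
# Usage: find_unredirected_ip_calls <directory>
find_unredirected_ip_calls() {
    # -r: recurse into the directory, -n: print line numbers,
    # -E: extended regular expression for the route|rule alternation.
    # The second grep drops every line that already redirects to /dev/null.
    grep -rnE 'ip (route|rule) ' "$1" | grep -v '/dev/null'
}
```

Each line of output has the form file:line:content, so every candidate can be inspected individually before deciding whether a redirect is appropriate.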
Comment 7 Adolf Belka 2024-04-11 15:03:27 UTC
Patch set submitted

https://patchwork.ipfire.org/project/ipfire/list/?series=4258
Comment 8 Adolf Belka 2024-04-20 08:07:03 UTC
Patch set has been merged into next.

https://patchwork.ipfire.org/project/ipfire/list/?series=4258
Comment 9 Michael Tremer 2024-04-24 15:02:41 UTC
I think the patch set overall isn't a good idea and I don't think it should have been merged. We keep merging things too quickly recently without some proper review...

The reason why I think this isn't a good idea simply is that we throw away all sorts of error messages when we actually want to see what is going wrong. Sometimes at least. So these changes will make debugging harder as we will miss where things might be going wrong.

(In reply to Adolf Belka from comment #4)
> This bug occurs because somewhere when we are starting up a flush of the
> iproute tables is carried out and the FIB table is flagged up as not
> existing even though it is not intended to exist.

Correct.

> I also have tried editing the /etc/iproute2/rt_tables file to comment out
> some of the tables but the result of that was FIB table does not exist was
> still shown every time when booting. Probably the changes I made to the
> rt_tables file would have had other consequences anyway.

That file does not really create the route table. The kernel just knows IDs for the route tables and this file maps the IDs to a human-readable format.
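
To illustrate that mapping: each non-comment line of an rt_tables file is an ID followed by a name, so resolving a name is a plain text lookup. The sketch below assumes that two-column format; the function name table_id_for_name is hypothetical, and nothing in it touches the kernel's routing tables.

```shell
# Hypothetical helper: print the numeric ID that an rt_tables-style file
# assigns to a table name. It only reads the text file.
# Usage: table_id_for_name <name> [file]
table_id_for_name() {
    awk -v name="$1" '
        /^[[:space:]]*#/ { next }         # skip comment lines
        $2 == name { print $1; exit }     # column 1 is the ID, column 2 the name
    ' "${2:-/etc/iproute2/rt_tables}"
}
```

Commenting out the "200 static" line therefore only removes the name. The ID 200 can still be passed to ip directly, which matches the behaviour reported in comment 5.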

(In reply to Adolf Belka from comment #6)
> This then stopped the message occurring. A little bit of checking was able
> to identify that the use of ip route without redirecting to null in the
> ipsec-interfaces code was the one causing the message to occur.

Yes, this seems to be it, judging by your patch. All those "ip route add ..." commands will create a table with that ID. The only command that complains can be the flush operation, because how could it flush something that does not exist?

> I also noticed that there are some ip rule commands which have a linkage to
> the ip route commands.

Yes, they define which routing table the kernel should be using for what kind of traffic.

> Making those redirect to null did not fix the "FIB table does not exist"
> message but it did turn out to fix the "RTNETLINK answers: no such file or
> directory". This turned out to be an ip rule command in ipsec-interfaces
> that was not redirected to null.

Same thing, how could we send traffic to a routing table that does not exist?

> I will submit a patch where I will redirect to null all the ip route and ip
> rule commands where the output is not used as an input into a variable for
> use elsewhere in the code.

I would propose to revert most of the patch set. I am happy to throw away the output of operations that don't matter if they fail. For example, if we were to flush a table that does not exist, that isn't a problem. Deleting some reference that doesn't exist isn't a problem either as we already got what we want.

But we do want to see any errors if routes cannot be inserted. Otherwise we will waste a lot of time debugging, unable to find the actual problem even though ip is telling us about it.
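
The distinction proposed here could look roughly like the sketch below: failures of a flush are discarded because a missing table is already the desired end state, while failures of an add stay visible. The wrapper names flush_table_quiet and add_route_checked are hypothetical and are not taken from the IPFire scripts or from the patch set.

```shell
# Flushing a table that does not exist is harmless, so discard all
# output and ignore the exit status.
flush_table_quiet() {
    ip route flush table "$1" >/dev/null 2>&1 || true
}

# Adding a route must succeed: stderr is left untouched and the exit
# status of ip is passed on to the caller.
add_route_checked() {
    ip route add "$@"
}
```

With this split, an error printed by a flush is hidden, while a failing ip route add still prints its error and returns a non-zero status that the calling script can act on.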
Comment 10 Adolf Belka 2024-04-24 15:14:39 UTC
No problem to have most of it reverted.

I can understand what you are saying about not wanting to lose critical messages. I was not able to judge what should and what should not be changed. I had expected that there would be some feedback about changes being needed.

From your comment about reverting most of the changes, does this mean that the acceptable change(s) will stay in place, so that I don't need to submit a modified patch set?