For a short while during boot, a message is shown that shouldn't be there: "FIB table does not exist". I haven't investigated where it comes from. It is probably not a problem, but it is confusing and has alarmed a couple of users. @Peter: Would you like to grab this one?
I found the following issue in iproute2 github https://github.com/shemminger/iproute2/issues/32 The issue was closed without being fixed.
(In reply to Adolf Belka from comment #1)
> I found the following issue in iproute2 github
>
> https://github.com/shemminger/iproute2/issues/32
>
> The issue was closed without being fixed.

Hmm, that might be technically correct, but still rather unsatisfactory. I would say we simply throw away any error message. Not pretty, but I don't see what other options we have left.
@adolf: Would you like to have a look at this? I would really like a more aesthetically pleasing boot process :)
I have already looked at this several times during 2023 without any success.

This is basically because of a kernel change: messages that were previously ignored when something was found not to exist are now always shown. In iproute2 there was also a table dump command that reported that something could not be dumped because it did not exist, even when it not existing was exactly what was required. The developers created a patch to fix that message from the dump command.

This bug occurs because somewhere when we are starting up a flush of the iproute tables is carried out and the FIB table is flagged up as not existing even though it is not intended to exist.

I tried taking the patch for the dump command and applying it to the flush command ("ignore_ENOENT_during_save_if_RT_TABLE_MAIN_is_being_flushed"), then did a build with that patch applied, but nothing changed. Either I applied it in the wrong place (although I could not find anywhere else related to the flush of the tables), or the patch is incorrect. The latter could well be the case, as I simply mimicked for the flush command what was done for the dump command without understanding what the patch code was doing.

I also have tried editing the /etc/iproute2/rt_tables file to comment out some of the tables, but the result was that "FIB table does not exist" was still shown every time when booting. Probably the changes I made to the rt_tables file would have had other consequences anyway.

So during 2023 I have probably tried 3 or 4 different approaches, one each time I had a thought about how to stop the message being shown, and have been 100% unsuccessful. Every attempt has resulted in the message still being shown.

I have tried searching, but no one else seems to have this problem, except people who expected the FIB table to be present when it wasn't, so the fixes were always about how to ensure the FIB table was present.

Maybe the only solution is to actually create a FIB table even though we don't use it, but I don't know if that would have any unintended consequences, and it seems counterintuitive to create a table we don't need or use just to stop a message about the table not being present.

I am more than willing to have a go at fixing this, but I have run out of ideas on how to tackle it. If someone else has any suggestions for how to approach it, I am willing to give it a go.
Writing comment 4 made me have some further thoughts and I have found the following.

The error message occurs when a specified table is asked to be shown and is empty. The FIB message is not related to one specific table; it occurs whenever any specified table is defined and empty.

The default table, numbered 253, is a reserved table. It is empty and therefore shows the message when you try to show it. You can't comment the default table out as it is a reserved value, so I suspect it is always present. In IPFire, however, the default table is left empty.

We also add a static table numbered 200 to the rt_tables file as part of the installation. If no static routes are defined then this table is empty and so also gives the "FIB table does not exist" message. If the static table is commented out and you ask to show the table by name, you get a different error message saying that the table id value for static is invalid. Trying to show table 200, even if the table is commented out in the rt_tables file, still gives the "FIB table does not exist" message.

So creating a default table that, for instance, matched the main table might work for the default table, but the static table should only be used if static routes have been defined. Maybe we need to add the "200 static" line to rt_tables only when a static route is defined in the WUI, and remove the line when all static routes are disabled and hence removed from the table.

I will try the above out, i.e. duplicate the main table into the default table and remove the static line completely, and see if that results in a boot without the message.
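
The observation that the name "static" becomes invalid once its line is commented out, while table 200 can still be addressed by number, fits the idea that rt_tables is consulted only to translate names into IDs. A minimal sketch of such a lookup follows. `lookup_table_id` is a hypothetical helper written for illustration, not iproute2's own code, and the sample file contents are an assumption based on the entries mentioned in this thread.

```sh
#!/bin/sh
# Hypothetical helper: resolve a routing table name to its numeric ID
# using an rt_tables-style file ("<id> <name>" per line, "#" for comments).
lookup_table_id() {
    name="$1"
    file="${2:-/etc/iproute2/rt_tables}"
    awk -v name="$name" '
        NF == 0 || $1 ~ /^#/ { next }        # skip blank lines and comments
        $2 == name { print $1; found = 1; exit }
        END { exit found ? 0 : 1 }           # non-zero if the name is unknown
    ' "$file"
}

# Sample file (assumed contents) with the static entry commented out.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
255 local
254 main
253 default
0 unspec
#200 static
EOF

lookup_table_id default "$sample"            # prints 253
lookup_table_id static "$sample" || echo "static: no ID mapped to this name"
rm -f "$sample"
```

In this sketch a commented-out entry only removes the name; nothing about table 200 itself changes.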
Incredibly enough I have been able to stop that "FIB table does not exist" message from showing up.

I decided my thoughts in the last couple of comments were not the right approach. I searched and looked at the various ip route commands in the IPFire code. I noticed that some fed the output of the command into null while others did not. I then went through all the occurrences that did not redirect the output into null and added the redirect.

This then stopped the message occurring. A little bit of checking was able to identify that the use of ip route without redirecting to null in the ipsec-interfaces code was the one causing the message to occur.

I also noticed that there are some ip rule commands which have a linkage to the ip route commands. Making those redirect to null did not fix the "FIB table does not exist" message but it did turn out to fix the "RTNETLINK answers: no such file or directory". This turned out to be an ip rule command in ipsec-interfaces that was not redirected to null.

I will submit a patch where I will redirect to null all the ip route and ip rule commands where the output is not used as an input into a variable for use elsewhere in the code. Then you can review my patch proposal to make sure that what I am proposing is a reasonable approach and solution.
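
As a rough illustration of the distinction drawn above between commands whose output feeds a variable and commands whose output is unused, here is a minimal sketch. `noisy_cmd` is a hypothetical stand-in for an ip invocation so that the example runs without root. The redirect is assumed to cover stderr as well as stdout, since that is where error messages normally go; the exact redirect form used in the IPFire scripts is not shown in this thread.

```sh
#!/bin/sh
# Hypothetical stand-in for a command such as an ip route call: it prints a
# result on stdout, a diagnostic on stderr, and exits non-zero.
noisy_cmd() {
    echo "result"
    echo "FIB table does not exist" >&2
    return 2
}

# Case 1: the output is used as input to a variable, so stdout must be kept.
# Only stderr is silenced here.
status=0
value="$(noisy_cmd 2>/dev/null)" || status=$?
echo "captured: ${value} (exit status ${status})"

# Case 2: the output is not used anywhere, so both streams go to /dev/null.
status=0
noisy_cmd >/dev/null 2>&1 || status=$?
echo "discarded, exit status ${status}"
```

Note that the exit status is still available after the redirect, so a script can react to a failure even when the message itself is hidden.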
Patch set submitted https://patchwork.ipfire.org/project/ipfire/list/?series=4258
Patch set has been merged into next. https://patchwork.ipfire.org/project/ipfire/list/?series=4258
I think the patch set overall isn't a good idea and I don't think it should have been merged. We keep merging things too quickly recently without some proper review...

The reason why I think this isn't a good idea simply is that we throw away all sorts of error messages when we actually want to see what is going wrong. Sometimes at least. So these changes will make debugging harder as we will miss where things might be going wrong.

(In reply to Adolf Belka from comment #4)
> This bug occurs because somewhere when we are starting up a flush of the
> iproute tables is carried out and the FIB table is flagged up as not
> existing even though it is not intended to exist.

Correct.

> I also have tried editing the /etc/iproute2/rt_tables file to comment out
> some of the tables but the result of that was FIB table does not exist was
> still shown every time when booting. Probably the changes I made to the
> rt_tables file would have had other consequences anyway.

That file does not really create the route table. The kernel just knows IDs for the route tables and this file maps the IDs to a human-readable format.

(In reply to Adolf Belka from comment #6)
> This then stopped the message occurring. A little bit of checking was able
> to identify that the use of ip route without redirecting to null in the
> ipsec-interfaces code was the one causing the message to occur.

Yes, this seems to be it, judging by your patch. All those "ip route add ..." commands will create a table with that ID. The only command that complains can be the flush operation, because how could it flush something that does not exist?

> I also noticed that there are some ip rule commands which have a linkage to
> the ip route commands.

Yes, they define which routing table the kernel should be using for what kind of traffic.

> Making those redirect to null did not fix the "FIB table does not exist"
> message but it did turn out to fix the "RTNETLINK answers: no such file or
> directory". This turned out to be an ip rule command in ipsec-interfaces
> that was not redirected to null.

Same thing, how could we send traffic to a routing table that does not exist?

> I will submit a patch where I will redirect to null all the ip route and ip
> rule commands where the output is not used as an input into a variable for
> use elsewhere in the code.

I would propose to revert most of the patch set. I am happy to throw away the output of operations that don't matter if they fail. For example, if we were to flush a table that does not exist, that isn't a problem. Deleting some reference that doesn't exist isn't a problem either as we already got what we want.

But we do want to see any errors if routes cannot be inserted. Otherwise we will waste a lot of time debugging things not being able to find the actual problem that ip is even telling us about.
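
The policy proposed above (silence operations whose failure doesn't matter, keep errors from operations that must succeed) could be sketched with two small wrappers. Both `ignore_failure` and `must_succeed` are hypothetical names made up for this illustration and are not taken from the IPFire scripts or from the patch set. The addresses in the usage comment are placeholder examples.

```sh
#!/bin/sh
# Hypothetical helpers illustrating the policy; not actual IPFire code.

# Cleanup steps (flush, delete): a failure only means there was nothing to
# remove, so hide all output and always report success.
ignore_failure() {
    "$@" >/dev/null 2>&1 || true
}

# Steps that must succeed (inserting routes or rules): leave the command's
# own error output visible and pass its exit status through.
must_succeed() {
    "$@" || {
        status=$?
        echo "FAILED (${status}): $*" >&2
        return "${status}"
    }
}

# Example usage (requires root; table 200 is the static table discussed in
# this thread, the addresses are placeholders):
#   ignore_failure ip route flush table 200
#   must_succeed ip route add 192.0.2.0/24 via 10.0.0.1 table 200
```

With this split, a flush of a table that does not exist stays quiet, while a route that cannot be inserted still shows whatever ip reports.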
No problem to have most of it reverted. I can understand what you are saying about not wanting to lose critical messages. I was not able to judge what should and what should not be changed. I had expected that there would be some feedback about changes being needed. From your comment about reverting most of the changes, does this mean that the change(s) that are acceptable will stay in place so that I don't need to submit a modified patch set?
(In reply to Adolf Belka from comment #10)
> No problem to have most of it reverted.
>
> I can understand what you are saying about not wanting to lose critical
> messages. I was not able to judge what should and what should not be
> changed. I had expected that there would be some feedback about changes
> being needed.
>
> From your comment about reverting most of the changes does this mean that the
> change(s) that are the acceptable ones will stay in place so that I don't
> need to submit a modified patch set?

Correct. I have just submitted a patch set that has the changes that I would like:

> https://patchwork.ipfire.org/project/ipfire/list/?series=4279

It looks large, but I swear that the most important parts are not reverted :)
I have had a look at that patch set and it looks fine to me. I will test out my original patch set in CU186 because that should still work. Once you have added your reversions I will do a fresh update from CU185 to CU186 to confirm that everything still works as expected.
CU186 Testing has been issued. https://www.ipfire.org/blog/ipfire-2-29-core-update-186-is-available-for-testing
After rebooting to CU186, the messages "FIB table does not exist" and "RTNETLINK answers: no such file or directory" were both no longer shown. It only took 2.3 years to get solved but maybe better now than never.
Core Update 186 Testing has been issued. https://www.ipfire.org/blog/ipfire-2-29-core-update-186-is-available-for-testing I did an upgrade from a CU185 vm to CU186 Testing. This verifies that the issues raised in this bug have been fixed.
Core Update 186 has been released https://www.ipfire.org/blog/ipfire-2-29-core-update-186-released