Summary: | IPSec Rekey Collisions May Result in IPTable Rule Deletion | ||
---|---|---|---|
Product: | IPFire | Reporter: | Tom Rymes <tomvend> |
Component: | --- | Assignee: | Michael Tremer <michael.tremer> |
Status: | CLOSED FIXED | QA Contact: | Arne.F <arne.fitzenreiter> |
Severity: | Major Usability | ||
Priority: | - Unknown - | ||
Version: | 2 | ||
Hardware: | unspecified | ||
OS: | Linux | ||
Attachments: | strongswan 5.3.5 for IPFire (i586) |
Description
Tom Rymes
2015-12-15 15:12:56 UTC
Do you know if there is a bug report about this upstream? Not that I am aware of, and those list messages are from February and August, if memory serves, so I'm guessing that if this is a StrongSwan issue, they are not fully aware of it. Looks like I spoke too soon. There is this issue: https://wiki.strongswan.org/issues/853 and a duplicate: https://wiki.strongswan.org/issues/959 I don't see any acknowledgment that this is a StrongSwan issue by the devs, but it has been reported. Michael, The Strongswan team has updated that bug with a proposed patch. I am not really certain as to how to test it, though. Issue #853 has been updated by Tobias Brunner. Tracker changed from Issue to Bug Status changed from New to Feedback Please try the patch in the 853-updown-rekey-collision branch. It tries to suppress the updown event during rekeying in case an installed redundant CHILD_SA is deleted. Tom This looks like a plausible solution. Should I build you an IPFire build so that you can test the patch? Can you reproduce the issue easily? Would an ISO image be more helpful or just a tarball with the compiled version of strongswan? To be Honest, Michael, the original poster on the forum was able to reproduce the issue, while I am not really able to do so. I can try to install a tarball and check the log for messages periodically, but that does not seem to be much of a solution. I'll try to reach out to the original poster on the forum, but he seems to have made himself scarce. I'm not certain if this is relevant, but I am not seeing entries like this in my kernel logs: charon: 07[CHD] running updown script: Should I be looking elsewhere? Also, would it make sense to have IPSec place its logs in its own file? Especially with multiple tunnels (we have 20 or so), the kernel log is completely polluted with IPSec entries. (In reply to Tom Rymes from comment #7) > I'm not certain if this is relevant, but I am not seeing entries like this > in my kernel logs: > > charon: 07[CHD] running updown script: This may happen if logging is not configured to log these things. We do not log that much - even if it is already a lot. > Should I be looking elsewhere? > > Also, would it make sense to have IPSec place its logs in its own file? > Especially with multiple tunnels (we have 20 or so), the kernel log is > completely polluted with IPSec entries. We could also consider to turn the log level down. Could you open an extra report for that? Added a new request here: https://bugzilla.ipfire.org/show_bug.cgi?id=11001 I don't think reducing IPSEc logging levels would be a good thing, rather, placing them in their own file (see https://wiki.strongswan.org/projects/strongswan/wiki/LoggerConfiguration) would make them more readable, while simultaneously makig the kernel log significantly easier to parse for non-IPSec items. Michael: I have been able to set up a test installation and can reproduce this issue. If you can send me a tarball with instructions for installation suitable for a five-year-old, I can install it and test to see if it fixes the issue. Tom Created attachment 387 [details]
strongswan 5.3.5 for IPFire (i586)
This is an easy one. Download the attached file and copy it to the firewall system.
Then execute "tar xvfa stronsgwan-5.3.5-for-ipfire.tar.gz -C /" and restart strongswan by executing "ipsec restart".
http://git.ipfire.org/?p=ipfire-2.x.git;a=commitdiff;h=4d7f9a81ac575207edb6bb69f8bbea8762feab96 The changes will also be in the next nightly build. From what I can tell, this has solved the issue. I have had a tunnel up with these settings all day: keylife=3m rekeymargin=1m rekeyfuzz=0% It would normally exhibit the issue almost immediately, but it has been up all day now, with no seeming issues. a snippet from the command 'grep coll /var/log/messages': Dec 19 16:42:02 laconia charon: 05[IKE] detected CHILD_REKEY collision with CHILD_REKEY Dec 19 16:42:03 laconia charon: 11[IKE] CHILD_SA rekey collision lost, deleting rekeyed child Dec 19 16:42:04 laconia charon: 03[IKE] detected CHILD_REKEY collision with CHILD_DELETE Dec 19 16:44:15 laconia charon: 15[IKE] detected CHILD_REKEY collision with CHILD_REKEY Dec 19 16:44:15 laconia charon: 14[IKE] CHILD_SA rekey collision lost, deleting rekeyed child Dec 19 16:44:16 laconia charon: 02[IKE] detected CHILD_REKEY collision with CHILD_DELETE Dec 19 16:46:27 laconia charon: 02[IKE] detected CHILD_REKEY collision with CHILD_REKEY Dec 19 16:46:27 laconia charon: 06[IKE] CHILD_SA rekey collision lost, deleting rekeyed child Dec 19 16:46:28 laconia charon: 05[IKE] detected CHILD_REKEY collision with CHILD_DELETE Dec 19 16:48:39 laconia charon: 10[IKE] detected CHILD_REKEY collision with CHILD_REKEY Dec 19 16:48:39 laconia charon: 11[IKE] CHILD_SA rekey collision won, deleting old child Dec 19 16:50:51 laconia charon: 13[IKE] detected CHILD_REKEY collision with CHILD_DELETE Dec 19 16:53:02 laconia charon: 15[IKE] detected CHILD_REKEY collision with CHILD_REKEY Dec 19 16:53:02 laconia charon: 12[IKE] CHILD_SA rekey collision won, deleting old child Dec 19 16:55:14 laconia charon: 03[IKE] detected CHILD_REKEY collision with CHILD_DELETE Dec 19 16:57:25 laconia charon: 04[IKE] detected CHILD_REKEY collision with CHILD_REKEY Dec 19 16:57:25 laconia charon: 12[IKE] CHILD_SA rekey collision won, deleting old child Dec 19 16:59:37 laconia charon: 05[IKE] detected CHILD_REKEY collision with CHILD_DELETE Dec 19 17:01:48 laconia charon: 15[IKE] detected CHILD_REKEY collision with CHILD_REKEY Dec 19 17:01:48 laconia charon: 04[IKE] CHILD_SA rekey collision won, deleting old child Dec 19 17:04:00 laconia charon: 11[IKE] detected CHILD_REKEY collision with CHILD_DELETE Dec 19 17:06:11 laconia charon: 15[IKE] detected CHILD_REKEY collision with CHILD_REKEY Dec 19 17:06:11 laconia charon: 13[IKE] CHILD_SA rekey collision lost, deleting rekeyed child Dec 19 17:06:12 laconia charon: 02[IKE] detected CHILD_REKEY collision with CHILD_DELETE Similarly, the IPTables chains are still populated: Chain IPSECBLOCK (2 references) target prot opt source destination REJECT all -- anywhere 10.100.0.0/23 reject-with icmp-net-unreachable Chain IPSECINPUT (1 references) target prot opt source destination ACCEPT esp -- mydom.dom other.dom ACCEPT ah -- mydom.dom other.dom ACCEPT ip -- mydom.dom other.dom ACCEPT ip -- mydom.dom other.dom policy match dir in pol ipsec reqid 3 proto esp RETURN all -- 10.100.0.0/23 10.44.0.0/24 policy match dir in pol ipsec reqid 3 proto esp ACCEPT udp -- anywhere anywhere udp dpt:isakmp ACCEPT udp -- anywhere anywhere udp dpt:ipsec-nat-t ACCEPT udp -- anywhere anywhere udp dpt:isakmp ACCEPT udp -- anywhere anywhere udp dpt:ipsec-nat-t Chain IPSECOUTPUT (1 references) target prot opt source destination MARK all -- 10.44.0.0/24 10.100.0.0/23 policy match dir out pol ipsec reqid 3 proto esp MARK set 0x32 Chain IPSECFORWARD (1 references) target prot opt source destination RETURN all -- 10.100.0.0/23 10.44.0.0/24 policy match dir in pol ipsec reqid 3 proto esp MARK all -- 10.44.0.0/24 10.100.0.0/23 policy match dir out pol ipsec reqid 3 proto esp MARK set 0x32 Unfortunately, I was rushing this morning and forgot to update both sides of the tunnel, so the other side had the issue and traffic was not traversing the tunnel when I got back this afternoon, so I have updated the other side and I will confirm that traffic is moving still in a number of hours. Tom (In reply to Tom Rymes from comment #13) > keylife=3m > rekeymargin=1m > rekeyfuzz=0% Are you using these settings just for reproducing the bug? Yes, only to force re-key collisions. I have had the tunnel up and running for just under two hours and all is still fine. It does seem like this fix has worked. Another 12+ hours and hundreds of collisions later it's still working. I will report back to StrongSwan. Alright. Since this is scheduled for release with C96, I will set this to ON_QA. |