On Fri, Apr 14, 2017 at 9:02 AM, Brian Kantor <Brian@ucsd.edu> wrote:
During the night, amprgw fetched a truncated copy of the encap file from the portal: it had 259 entries instead of over 600, and one of the entries in the fetched copy had no gateway address. Viz:
route addprivate 44.131.168.128/29 encap
This caused the ipip daemon to segfault, so there was no routing between the Internet and the tunnel subnets for an hour. The rip sender was similarly affected.
Later fetches returned copies of the encap file that appear to be intact, and all routing is again operational.
I've modified the code in the daemons to be more robust in the face of a bad encap fetch, and taken steps to ensure that future fetches are checked better.
- Brian
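For anyone curious what that kind of checking could look like, here is a minimal sketch in Python. It is not the actual amprgw code, which hasn't been published. It assumes each entry has the form `route addprivate <subnet> encap <gateway>` (inferred from the truncated line quoted above), and the names `parse_encap_line`, `validate_encap`, and the `MIN_ENTRIES` threshold are hypothetical.

```python
import ipaddress

# Hypothetical threshold: the message only says a good file has "over 600"
# entries, so the exact cutoff here is an assumption.
MIN_ENTRIES = 500


def parse_encap_line(line):
    """Return (network, gateway) for a well-formed entry, else None.

    Assumed entry format: route addprivate <subnet> encap <gateway>
    """
    fields = line.split()
    if len(fields) != 5:
        return None  # catches the entry with the missing gateway address
    if fields[0] != "route" or fields[1] != "addprivate" or fields[3] != "encap":
        return None
    try:
        # strict=False tolerates host bits set in the subnet (an assumption
        # about how lenient the check should be).
        network = ipaddress.ip_network(fields[2], strict=False)
        gateway = ipaddress.ip_address(fields[4])
    except ValueError:
        return None
    return network, gateway


def validate_encap(text, min_entries=MIN_ENTRIES):
    """Check a fetched encap file; return (ok, routes, bad_lines)."""
    routes = []
    bad_lines = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parsed = parse_encap_line(line)
        if parsed is None:
            bad_lines.append(line)
        else:
            routes.append(parsed)
    # Reject the fetch if any entry is malformed or the file looks truncated.
    ok = not bad_lines and len(routes) >= min_entries
    return ok, routes, bad_lines
```

With a check like this, a fetch that fails validation would simply be discarded and the previous known-good encap file kept in service, rather than handing a malformed entry to the daemons.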
Have you considered open sourcing this infrastructure software so we can get more eyes looking for bugs like this?
Tom KD7LXL