On Sun, Jan 04, 2015 at 09:23:08PM +0100, Rob Janssen wrote:
> The ampr-ripd has a route lifetime of only 600 seconds. Routes are announced every 300 seconds, so when two subsequent announces are incomplete we lose the route. It happened again this morning at 08:05 local (07:05 UTC). My route again was lost, and recovered at 08:10.
I am at somewhat of a loss to explain why this might have happened; the RIP sender logged that it was fetching the proper number of subnet routes (428) from the routing database and generating the proper number of RIP packets. No transmission errors were logged at the time you mention.
It is possible that you did not receive all the packets. They are sent as datagrams so there is nothing to retry or notice if one of them goes missing in transit.
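To illustrate the timing, here is a minimal sketch in Python of how missed announcements turn into a route outage. It is not the ampr-ripd code: the function name `expired_intervals` is made up for this example, and it assumes a route expires exactly `lifetime` seconds after its last refresh unless a new announcement has arrived by then.

```python
def expired_intervals(refresh_times, lifetime, horizon):
    """Return (start, end) spans, in seconds, during which the route is absent.

    refresh_times: times at which an announcement containing the route arrived.
    lifetime:      seconds a route survives without being refreshed (assumed model).
    horizon:       end of the period being examined.
    """
    gaps = []
    times = sorted(refresh_times)
    # Pair each refresh with the next one (or with the horizon for the last).
    for prev, nxt in zip(times, times[1:] + [horizon]):
        expiry = prev + lifetime
        if expiry < nxt:
            # The route timed out before the next refresh arrived.
            gaps.append((expiry, nxt))
    return gaps


# Announcements every 300 s; the ones at t=300 and t=600 never arrive.
two_missed = [0, 900, 1200]
print(expired_intervals(two_missed, lifetime=600, horizon=1500))

# The same loss pattern with a longer (hypothetical) lifetime of 1800 s.
print(expired_intervals(two_missed, lifetime=1800, horizon=1500))
```

Under these assumptions, two consecutive missed announcements with a 600-second lifetime leave the route absent for one 300-second interval, while a single missed announcement does not. A longer lifetime rides through the same loss.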
Perhaps it would have been smarter to use a connection-oriented transport (TCP) to transmit the routing information. We could convert to doing that, with some significant effort.
I agree that making the timeout much longer than 10 minutes is wise. It might also be wise to guard against a large change in the number of routes received from one announcement to the next. Logging the number of packets and subnet routes received to syslog might provide some additional data if/when this happens again.

- Brian
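A rough sketch, in Python, of what the receive-side logging and delta check suggested above could look like. Everything here is an assumption for illustration: the function name `check_announcement`, the logger name, and the 10% threshold are invented, and the packet counts in the usage lines are placeholders. It uses Python's standard `logging` module; a `logging.handlers.SysLogHandler` could be attached to send the messages to syslog.

```python
import logging

# Hypothetical logger name.
log = logging.getLogger("rip-receiver")


def check_announcement(packets, routes, previous_routes, max_drop=0.10):
    """Log what was received and report whether the update looks complete.

    packets:         number of RIP packets received in this announcement.
    routes:          number of subnet routes they contained.
    previous_routes: route count from the last accepted announcement (0 if none).
    max_drop:        largest tolerated fractional drop (assumed value, not tuned).
    """
    log.info("received %d packets, %d subnet routes", packets, routes)
    if previous_routes and routes < previous_routes * (1 - max_drop):
        log.warning(
            "route count fell from %d to %d; treating update as incomplete",
            previous_routes, routes,
        )
        return False
    return True


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    # Placeholder numbers: a full update, then one with many routes missing.
    print(check_announcement(packets=18, routes=428, previous_routes=428))
    print(check_announcement(packets=10, routes=240, previous_routes=428))
```

A receiver that gets `False` back could keep its existing routes rather than let them age out on the strength of a partial update. Whether that is the right policy is a separate question.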