I've never found the cause of this; the rip sender would report protocol buffer overflows if they occurred, and it never has. The UCSD network can easily take the amount of traffic that the rip transmissions constitute, and is not rate limited, so I am led to the conclusion that the loss must be along the transit path somewhere. Of course, being UDP inside IP, we don't get notified of in-transit packet drops, so it remains a mystery why this is/was happening.
I had considered using TCP connections to transmit the data for reliability but didn't want the time and overhead of establishing a fresh connection to each gateway system and then closing it, once for each gateway every five minutes. Starting up and knocking down TCP takes time. The delay waiting for a down gateway connection to time out would be a killer. And make no mistake, lots of gateways are offline at any particular time.
Nor did long-duration connections seem practical; the amount of program logic and memory to maintain and re-establish them is non-trivial, and I'm not sure it's practical to maintain three to four hundred TCP connections for days or months at a time on a small machine like amprgw.
- Brian
On Wed, Mar 09, 2016 at 01:40:46PM +0100, Rob Janssen wrote:
Some time ago I was hunting a problem where I apparently had packet loss in RIP transmissions, likely because they are sent in a big burst and may overflow buffers or exceed rate limits somewhere. In that case I was randomly losing routes. I kind of fixed that by increasing the timeout on routes to two hours (default was much lower, 15 minutes I think).
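A rough sketch of why the longer route timeout helps. It assumes the five-minute interval mentioned above is the broadcast interval, that each broadcast is lost independently of the others, and that a route expires only when every broadcast inside the timeout window is lost. The loss rates are hypothetical; nobody in this thread has measured one.

```python
# Sketch only: independent-loss model, hypothetical loss rates.
BROADCAST_INTERVAL_MIN = 5  # from the thread: sent every five minutes


def broadcasts_per_window(timeout_min, interval_min=BROADCAST_INTERVAL_MIN):
    """Number of broadcasts that fall inside one route timeout window."""
    return timeout_min // interval_min


def expiry_probability(loss_rate, timeout_min, interval_min=BROADCAST_INTERVAL_MIN):
    """Chance that every broadcast in the window is lost, so the route expires."""
    return loss_rate ** broadcasts_per_window(timeout_min, interval_min)


if __name__ == "__main__":
    for loss_rate in (0.1, 0.3, 0.5):  # hypothetical values
        for timeout_min in (15, 120):  # 15 minutes vs. two hours
            n = broadcasts_per_window(timeout_min)
            p = expiry_probability(loss_rate, timeout_min)
            print(f"loss={loss_rate:.0%} timeout={timeout_min:>3} min "
                  f"({n:>2} broadcasts): expiry chance {p:.3e}")
```

Under these assumptions a 15-minute timeout tolerates only two consecutive losses before the third drops the route, while a two-hour timeout needs 24 losses in a row. Real loss in a burst is probably correlated, so treat the numbers as an illustration, not a prediction.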