Chris,
I recall something similar occurring during the lifetime of the late Brian Kantor. Is
AMPRGW still a 10 Gbps BSD kernel?
- Lynwood
If it helps - from the late Brian Kantor, SK -
Fri, Jun 2, 2017 at 5:12 PM
I found a possible source of memory corruption - a hash routine might have returned a
negative number, causing the flow stats gathering to index off the beginning of a large
array that appears in memory adjacent to the routing table. This might have resulted in an
entry being stomped on. I don't know.
But the hash routine won't return negative or too large numbers anymore. I did fix
that. We'll see if that prevents the problem from recurring. I'd hate to have to
go through 16 million route entries in a core dump. - Brian
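
[Illustration, not Brian's code: a minimal C sketch of the kind of bug described
above, assuming a flow table indexed by a hash of a 32-bit key. The names
FLOW_BUCKETS, flow_hash_signed, and flow_hash_bounded are hypothetical. In C the
% operator keeps the sign of the dividend, so a signed hash can yield a negative
index and write before the start of the array.]

```c
#include <stdint.h>
#include <stddef.h>

#define FLOW_BUCKETS 1024u  /* hypothetical flow-stats table size */

/* Risky pattern: signed arithmetic. A key with the top bit set is
 * negative as int32_t, and in C the result of % takes the sign of
 * the dividend, so this can return a negative index. */
static int flow_hash_signed(int32_t key)
{
    return key % (int)FLOW_BUCKETS;
}

/* Safer pattern: do all arithmetic unsigned and reduce modulo the
 * table size, so the result is always in [0, FLOW_BUCKETS). */
static size_t flow_hash_bounded(uint32_t key)
{
    uint32_t h = key * 2654435761u;  /* multiplicative mixing; wraps mod 2^32 */
    return (size_t)(h % FLOW_BUCKETS);
}
```

[With the signed version, stats[flow_hash_signed(key)] can land on whatever sits
in memory just before stats[], which would match the symptom of a neighboring
table entry being overwritten.]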
On Fri, Jun 02, 2017 at 11:12:16AM -0700, Brian Kantor wrote:
> On Fri, Jun 02, 2017 at 02:03:56PM -0400, lleachii--- via 44Net wrote:
> > From my perspective, I stop seeing all inbound Internet traffic from AMPRGW
> > to 44.60.44.0/24, except for the intermittent data to another subnet (now
> > 44.62.1.81). I transmit, and never receive replies.
> > Although, I'm still able to send and receive traffic to/from the other 44
> > GWs.
>
> Thanks, that helps. I checked, and the on-disk copy of the routing
> table is still correct even when this is happening, so I'm beginning
> to suspect memory corruption in the router software itself.
>
> As you know, that's difficult to find, but I'm looking through the
> code to make sure I haven't done any of the usual errors. The next
> time it happens, I'll take a core dump of the running process and
> see if that tells me anything. Be sure to let me know.
> - Brian