Well, I found and fixed the cause of the packet loss problem.

It turns out that the single-threaded nature of IP processing in the
FreeBSD kernel means that when ipfw was told to forward a packet to
the network telescope, processing blocked for a significant period of
time while the outgoing packet was rewritten and enqueued. This caused
the inbound work queue to lengthen to the point where incoming packets
were dropped, which played hob with the throughput for UDP and TCP
connections.
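
For anyone who wants to see the effect in miniature, here is a toy
sketch in Python of a single worker draining a bounded input queue.
It is not a model of the FreeBSD network stack; the function name,
the tick-based timing and all of the numbers are assumptions made up
purely for illustration.

```python
from collections import deque

def simulate(arrivals_per_tick, service_ticks, queue_limit, ticks):
    """Toy model: one worker, one bounded input queue.

    Returns (processed, dropped). All parameters are hypothetical.
    """
    queue = deque()
    busy = 0                      # ticks left on the packet in progress
    processed = dropped = 0
    for _ in range(ticks):
        # New packets arrive; anything beyond the queue limit is dropped.
        for _ in range(arrivals_per_tick):
            if len(queue) < queue_limit:
                queue.append(1)
            else:
                dropped += 1
        # The single worker picks up the next packet only when it is idle.
        if busy == 0 and queue:
            queue.popleft()
            busy = service_ticks
        if busy > 0:
            busy -= 1
            if busy == 0:
                processed += 1
    return processed, dropped

# Fast per-packet work keeps up with the arrival rate.
print(simulate(arrivals_per_tick=1, service_ticks=1, queue_limit=10, ticks=100))
# Slow per-packet work (e.g. a blocking forward step) fills the queue.
print(simulate(arrivals_per_tick=1, service_ticks=3, queue_limit=10, ticks=100))
```

With one tick of work per packet nothing is lost; once each packet
takes longer to handle than the gap between arrivals, the queue fills
and the excess is dropped, which is the behaviour described above.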

With the cooperation of the CAIDA people, we stopped forwarding packets
to the telescope; it will instead be fed from a mirror port on a network
switch. They will filter out our legitimate subnets, leaving the IBR
(Internet background radiation) that they want to study.
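
The filtering step could look something like the following Python
sketch. I don't know how CAIDA actually implements it; the subnet list
and the function name here are hypothetical placeholders, not the real
allocation list.

```python
import ipaddress

# Hypothetical list of subnets in legitimate use (placeholder values).
ACTIVE_SUBNETS = [
    ipaddress.ip_network(n) for n in ("44.2.0.0/16", "44.24.240.0/20")
]

def is_background(dst):
    """Return True if dst falls outside every in-use subnet (candidate IBR)."""
    addr = ipaddress.ip_address(dst)
    return not any(addr in net for net in ACTIVE_SUBNETS)

print(is_background("44.2.1.1"))    # inside a listed subnet
print(is_background("44.99.0.1"))   # outside all listed subnets
```

Traffic to addresses inside a listed subnet is discarded by the filter,
and everything else is kept for study.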

That this was the cause is shown by pings to various end destinations
on AMPRNet, which are now lossless. For example:

--- kk7kx.ampr.org ping statistics ---
100 packets transmitted, 100 received, 0% packet loss, time 99120ms
rtt min/avg/max/mdev = 26.457/30.128/189.503/17.459 ms

It is also shown by the shorter task queue on the input interface:

/0% /10 /20 /30 /40 /50 /60 /70 /80 /90 /100
root idle: cpu3 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
root idle: cpu2 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
root idle: cpu1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
root idle: cpu0 XXXXXXXXXXXXXXXXXXXXXXXXX
root em0 taskq XXXXXXXXXXXXXXXXXXXXX
root ipipd X

At the moment, inbound traffic is running around 12 MB/s, which amprgw
seems to be handling easily. It's possible that the remaining ipfw
rules could be optimized somewhat to shorten the input queue even
further, but I think I'll call it a day for now.

- Brian