On Tue, Jul 21, 2015 at 02:26:36PM -0400, Bryan Fields wrote:
> What is the configuration of the UCSD gateway?
I answered that in a previous email earlier today. Again: It's a dual-core 3.2 GHz Xeon processor with two 1 GbE ports. The port 'em0' is connected to a 1G switch which is in turn connected at 10GbE to the building infrastructure switch/router. Port 'em1' is output-only to the network 'telescope'. The system never swaps or pages.
It does all the packet filtering, selection, and diversion using kernel-mode 'ipfw'. The very few packets which are destined for legitimate AMPR hosts are forwarded and encapsulated by a user-mode program. That program consumes almost no resources because there are so few packets headed to or from legitimate AMPR hosts, and those are all it's given to handle.
Statistics and experiments show that the bottleneck is the IP input routines processing the ipfw rules. Since this is single-threaded inside the kernel, more cores beyond the effective 4 we have now will probably not help. As you can see from the snapshot below, the task queue for the input interface is full, and that is where the packets are being dropped.
                /0%  /10  /20  /30  /40  /50  /60  /70  /80  /90  /100
root em0 taskq  XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
root idle: cpu2 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
root idle: cpu3 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
root idle: cpu1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
root idle: cpu0 X
root em1 taskq  X
root ipipd      X
The relevant ipfw rules are these. (Table 1 contains the list of legitimate hosts derived by ANDing the gateways with the DNS. Socket 4444 is the ipip daemon's input; 192.168.44.252 is the network telescope.)
# known addresses go to the encapsulating router socket: ipipd
ipfw add divert 4444 ip from not 10.0.0.0/8,172.16.0.0/12,169.254.0.0/16,192.168.0.0/16 to 'table(1)' in not dst-port 135-139,445,1025-1028
# other 44 addresses go next door for analysis
ipfw add forward 192.168.44.252 all from any to 44.0.0.0/8
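For readers who want to trace which packets take which path, the two rules above can be modelled roughly in Python. This is only an illustrative sketch of the matching logic, not anything the gateway runs: the function name classify, the sample table entry 44.0.0.1, and the label 'unmatched' are assumptions made for the example. What happens to packets that match neither rule depends on the rest of the ruleset, which is not shown here.

```python
import ipaddress

# Source networks the divert rule refuses (taken from the rule text).
EXCLUDED_SOURCES = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "169.254.0.0/16", "192.168.0.0/16")
]
# Destination ports the divert rule refuses: 135-139, 445, 1025-1028.
EXCLUDED_DST_PORTS = set(range(135, 140)) | {445} | set(range(1025, 1029))
AMPR_NET = ipaddress.ip_network("44.0.0.0/8")

# Hypothetical stand-in for ipfw table 1; the real table is built
# from the gateway list and the DNS.
EXAMPLE_TABLE1 = [ipaddress.ip_network("44.0.0.1/32")]


def classify(src, dst, dst_port, inbound, table1):
    """Return 'divert', 'forward', or 'unmatched' for one packet.

    dst_port may be None for protocols without ports; such packets
    are not excluded by the port test in this model.
    """
    s = ipaddress.ip_address(src)
    d = ipaddress.ip_address(dst)
    src_ok = not any(s in net for net in EXCLUDED_SOURCES)
    dst_known = any(d in net for net in table1)
    port_ok = dst_port not in EXCLUDED_DST_PORTS
    # Rule 1: known hosts go to the divert socket (the ipip daemon).
    if src_ok and dst_known and inbound and port_ok:
        return "divert"
    # Rule 2: everything else addressed to 44/8 goes to the telescope.
    if d in AMPR_NET:
        return "forward"
    return "unmatched"


if __name__ == "__main__":
    print(classify("198.51.100.7", "44.0.0.1", 80, True, EXAMPLE_TABLE1))
    print(classify("198.51.100.7", "44.0.0.1", 445, True, EXAMPLE_TABLE1))
    print(classify("198.51.100.7", "44.1.2.3", 80, True, EXAMPLE_TABLE1))
```

In this model a packet to a known host on an excluded port, or from an excluded source, falls through the first rule and is picked up by the second, so it ends up at the telescope rather than the encapsulator.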
Turning off the filtering/diversion ('ipfw disable firewall') almost immediately ends the congestion: the em0 taskq sits below 50% and packets are no longer dropped. Turning it back on brings the problem back. Of course, when it's off, no ipip is processed.
- Brian