In our local network we have several different kinds of tunnels, with different header overhead. As the usual MTU on an internet connection is 1500 (the Ethernet MTU), the typical MTU for an IPIP tunnel is 1480, for GRE it is 1476, for GRE6 it is 1454, etc.
However, not everyone has a 1500 byte internet MTU. Some people have PPPoE connections to the internet with an MTU of typically 1492, sometimes 1480. So the effective MTU of the mentioned (and other) tunnel types becomes 8 or 20 bytes less. Some people get a fixed address subnet from their ISP, provided as some kind of tunnel with an MTU of 1456 (quite common here). This results in a wide variety of MTU values in our network.
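To make the arithmetic above concrete, here is a minimal sketch that derives the tunnel MTU from the underlying link MTU. The overhead table is an assumption for illustration: 20 bytes for the outer IPv4 header of IPIP, and 24 bytes for GRE over IPv4 with the basic 4-byte GRE header (no key, checksum or sequence number). The function and table names are hypothetical.

```python
# Encapsulation overhead in bytes (assumed values, see the text above).
TUNNEL_OVERHEAD = {
    "ipip": 20,  # outer IPv4 header only
    "gre": 24,   # outer IPv4 header + basic 4-byte GRE header
}


def tunnel_mtu(link_mtu: int, tunnel_type: str) -> int:
    """Largest inner packet that fits in one outer packet on the link."""
    return link_mtu - TUNNEL_OVERHEAD[tunnel_type]


if __name__ == "__main__":
    # Plain Ethernet, two PPPoE variants, and the ISP tunnel mentioned above.
    for link_mtu in (1500, 1492, 1480, 1456):
        row = ", ".join(
            f"{name}={tunnel_mtu(link_mtu, name)}" for name in TUNNEL_OVERHEAD
        )
        print(f"link MTU {link_mtu}: {row}")
```

With a 1500 byte link this gives the 1480 and 1476 values quoted above; on a 1492 byte PPPoE link the same tunnels end up 8 bytes smaller.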
Frequently issues arise for new connections where the chosen MTU for some tunnel turns out to be too large, and full-size packets are dropped. And in an environment where those tunneled packets encounter a point where the outer packet is too large for the interface MTU, the usual mechanism of returning "ICMP destination unreachable, fragmentation required" does not work very well, because the ICMP is returned to the router that encapsulated the packet, not the original source of the traffic. And I have never seen an encapsulating router that translated the ICMP to a new ICMP packet referring to the inner addresses and sent it back to the original source.
Also, there are sometimes issues when routes are changed by BGP. Of course many routers have TCP MSS clamping configured where the TCP MSS is reduced whenever the TCP SYN passes through a place with lower MTU, but this happens only on the initial connection setup. When the MTU later reduces due to a route change, this still results in failure of the connection.
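As a rough illustration of what MSS clamping does to a SYN, here is a sketch assuming IPv4 with 20-byte IP and 20-byte TCP headers and no options. The function name is hypothetical and is not taken from any router's implementation.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def clamp_mss(advertised_mss: int, link_mtu: int) -> int:
    """Return the MSS to write into a passing SYN for a given link MTU.

    The MSS is only ever lowered, never raised.
    """
    max_mss = link_mtu - IPV4_HEADER - TCP_HEADER
    return min(advertised_mss, max_mss)


if __name__ == "__main__":
    # A host on Ethernet advertises 1460; the SYN crosses a 1476 byte GRE tunnel.
    print(clamp_mss(1460, 1476))
    # A host that already advertises a smaller MSS is left alone.
    print(clamp_mss(1400, 1476))
```

Because the value is fixed in the SYN, a later route change to a path with a smaller MTU is not covered, which is the failure described above.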
I wonder if other gateway operators have done something to alleviate this problem.
Solutions that can be considered:
- Ignore DF. Much of the current TCP traffic has DF (don't fragment) set, but this often causes communications to break unnecessarily. Without DF, packets would be fragmented as originally designed in the IP protocol. Sending everything with DF and interpreting the ICMP responses is the mechanism behind "Path MTU discovery", which was designed to avoid fragmentation and the overhead it causes in routers. However, in the AMPRnet we seldom encounter so much traffic that CPU loading of the routers is an issue.
- Standardize on a "default MTU" whenever we cannot offer a 1500 byte MTU. This does not solve all problems, but at least it solves some of them.
Note that most routers fragment packets in a particularly inefficient way. When a packet a few bytes too large for the next hop has to be forwarded (and DF is not set), they do not split the packet into two approximately equal halves. Instead they send a first fragment as large as the outgoing MTU can accept, then a small fragment with the remainder of the original packet. This can result in multiple fragmentations along the way: first the packet has to be fragmented to fit into the 1480 byte MTU of an IPIP tunnel, then further on it has to be fragmented again to fit a GRE or L2TP/IPsec tunnel with a smaller MTU. No further fragmentation would have been required had it been split into equal halves the first time.
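The difference between the two strategies can be sketched with a small simulation. This is only a model under stated assumptions: IPv4 with a 20-byte header and no options, fragment payloads rounded to multiples of 8 bytes as the fragment offset field requires, and a hypothetical path with MTUs of 1480 and then 1400. The "balanced" splitter is a hypothetical alternative, not something I know any router to implement.

```python
IP_HEADER = 20  # bytes, assuming IPv4 without options


def _split(total_len: int, chunk: int) -> list[int]:
    """Cut the payload into pieces of `chunk` bytes; return total sizes."""
    payload = total_len - IP_HEADER
    sizes = []
    while payload > chunk:
        sizes.append(chunk + IP_HEADER)
        payload -= chunk
    sizes.append(payload + IP_HEADER)
    return sizes


def fragment_greedy(total_len: int, mtu: int) -> list[int]:
    """Usual behaviour: fill each fragment up to the outgoing MTU."""
    if total_len <= mtu:
        return [total_len]
    chunk = (mtu - IP_HEADER) // 8 * 8
    return _split(total_len, chunk)


def fragment_balanced(total_len: int, mtu: int) -> list[int]:
    """Hypothetical alternative: same fragment count, roughly equal sizes."""
    if total_len <= mtu:
        return [total_len]
    payload = total_len - IP_HEADER
    max_chunk = (mtu - IP_HEADER) // 8 * 8
    count = -(-payload // max_chunk)   # ceiling division
    chunk = -(-payload // count)       # even share per fragment
    chunk = -(-chunk // 8) * 8         # round up to a multiple of 8
    return _split(total_len, chunk)


def send_along(path_mtus: list[int], total_len: int, fragmenter) -> list[int]:
    """Pass one packet through successive hops, fragmenting where needed."""
    packets = [total_len]
    for mtu in path_mtus:
        packets = [frag for pkt in packets for frag in fragmenter(pkt, mtu)]
    return packets


if __name__ == "__main__":
    path = [1480, 1400]
    print("greedy:  ", send_along(path, 1500, fragment_greedy))
    print("balanced:", send_along(path, 1500, fragment_balanced))
```

In this model the greedy splitter fragments the 1500 byte packet at both hops and delivers three fragments, while the balanced splitter produces two fragments at the first hop that pass the second hop untouched.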
So, I wonder what others do (if anything) to avoid the problems caused by oversized packets and maybe to avoid fragmentation. For some time, I have experimented with "ignore DF" and of course it keeps traffic flowing, but it is unclear if it causes problems for some users. Next I would consider using a standard MTU value on all tunnels, so there are mostly two MTU values left in the network: 1500 and that smaller, to be determined, value.
Of course the MTU should not be so low that it causes terrible overhead. In the past we had a 256 byte MTU on AX.25 packet radio (or even 216 when it was over NET/ROM), but that caused a 15% header overhead and made us very unpopular amongst plain AX.25 users. Fortunately the WiFi links we use today allow 1500 byte packets :-)
The minimal required MTU for IPv6 is 1280. The maximal MTU we can accommodate with the worst case tunnel headers is about 1400. So the preferable default MTU would be somewhere between 1280 and 1400.
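A quick check of the overhead figure, assuming it refers to 40 bytes of TCP/IPv4 headers (20 + 20, no options) per full-size packet and ignores the AX.25 framing itself. The function name is hypothetical.

```python
TCP_IP_HEADERS = 40  # bytes, assuming IPv4 + TCP without options


def header_overhead(mtu: int) -> float:
    """Fraction of a full-size packet that is taken up by TCP/IP headers."""
    return TCP_IP_HEADERS / mtu


if __name__ == "__main__":
    for mtu in (216, 256, 1300, 1500):
        print(f"MTU {mtu}: {header_overhead(mtu) * 100:.1f}% header overhead")
```

Under that assumption a 256 byte MTU costs about 15.6%, which matches the 15% mentioned above, against under 3% at 1500 bytes.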
Are people even using 256-byte MTU links today? Would it be worth it to select an MTU value that can be more efficiently fragmented into 256-byte packets? Or is there another small MTU size that would be a candidate for such considerations?
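To get a feeling for the 256-byte question, this sketch counts how a packet of a candidate MTU size would be fragmented onto a 256-byte link. It assumes IPv4 with a 20-byte header, the usual "fill each fragment" behaviour, and payloads in multiples of 8 bytes; the function name is hypothetical.

```python
IP_HEADER = 20  # bytes, assuming IPv4 without options


def fragment_sizes(total_len: int, link_mtu: int = 256) -> list[int]:
    """Total sizes of the fragments a full-size packet is cut into."""
    if total_len <= link_mtu:
        return [total_len]
    chunk = (link_mtu - IP_HEADER) // 8 * 8  # 232 payload bytes at MTU 256
    payload = total_len - IP_HEADER
    sizes = []
    while payload > chunk:
        sizes.append(chunk + IP_HEADER)
        payload -= chunk
    sizes.append(payload + IP_HEADER)
    return sizes


if __name__ == "__main__":
    for candidate in (1280, 1300, 1400):
        frags = fragment_sizes(candidate)
        print(f"MTU {candidate}: {len(frags)} fragments, last is {frags[-1]} bytes")
```

Under these assumptions every candidate from 1280 to 1400 needs six fragments on a 256-byte link, so within that range the choice only changes how full the last fragment is.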
So again, I wonder what others have done w.r.t. this matter. Are admins of gateways that offer many kinds of different tunnels using a standard MTU in their systems, or just the max MTU that each tunnel technology allows? Do you copy DF from the inner to the outer packet in a tunnel? Do you ignore DF? What would be your position on establishing a standard MTU for tunnels, and what size would you propose?
Rob PE1CHL
I have dealt with tunnels in tunnels and broken PMTUD caused by an ICMP black hole at $dayjob, because of a customer's idea of "security": block all ICMP.
You have probably found this: https://blog.cloudflare.com/path-mtu-discovery-in-practice/amp/
Obviously we cannot fix the client, but I tried #3 first. It was probably a good solution for Linux, but ultimately I put all my public-facing servers at 1300 MTU.
For my own tunnels I incorporate MSS clamping, but as you found, this is something the tunnel owner must do and you cannot fix it without a nasty workaround.
I think ignoring DF will probably fix it all but hide problems. I have only done this briefly to troubleshoot an IPsec link. If 1300 MTU isn't low enough, I feel OK saying the other end needs to help fix the problem (stop blocking ICMP!).
Regards, Scott
On Tue, Mar 3, 2020, 1:18 PM Rob Janssen via 44Net 44net@mailman.ampr.org wrote: