Well, after a week of frantic searching for the cause of this problem, it has finally been
pinpointed.
It turned out to be an unfortunate consequence of someone setting up a bad gateway entry
with the external IP within our 44.137.0.0/16 network, combined with strict firewalling
that returns ICMP messages on bad traffic, a bug in the handling of those messages, and
the way ampr-ripd handles the above type of gateways.
The breakthrough came when Marius discovered that ICMP messages were going back to the
gateway in his test setup, and that when this happened amprgw abruptly ended the sequence
of RIP packets. Combined with the observation that putting delay between the packets
actually made things worse, and that removing the bad gateway fixed it, this finally led
to resolution of the problem.
Now there are some bugs to fix, but I also looked in the gateway list and found several
more of those bad gateways!
It looks like the gateway concept is difficult enough for people to understand as it
originally was, and adding the possibility to have a gateway address inside net-44 really
pushes complexity beyond what can be handled.
And because it is (under strict conditions) allowed, the check in the gateway that simply
rejects such entries has been removed. Look at what we have now:
44.2.11.1      2018-01-17 20:55:30  KK6NHN  <https://portal.ampr.org/gateways_list.php?a=view&id=1555>
44.2.11.2      2018-01-17 20:56:07  KK6NHN  <https://portal.ampr.org/gateways_list.php?a=view&id=1556>
44.24.172.41   2018-01-04 23:09:44  KG7PNQ  <https://portal.ampr.org/gateways_list.php?a=view&id=1546>
44.24.241.98   2017-04-26 06:31:05  KD7LXL  <https://portal.ampr.org/gateways_list.php?a=view&id=1378>
44.94.17.129   2018-09-04 05:12:02  N0NCE   <https://portal.ampr.org/gateways_list.php?a=view&id=1695>
44.102.222.16  2019-03-24 22:27:29  W8HRV   <https://portal.ampr.org/gateways_list.php?a=view&id=1760>
44.118.1.1     2018-03-01 18:16:04  N1QX    <https://portal.ampr.org/gateways_list.php?a=view&id=1403>
44.118.1.2     2018-03-01 18:16:12  N1QX    <https://portal.ampr.org/gateways_list.php?a=view&id=1404>
44.118.1.3     2018-03-01 18:16:19  N1QX    <https://portal.ampr.org/gateways_list.php?a=view&id=1405>
44.130.104.1   2017-10-02 18:34:13  DG8NGN  <https://portal.ampr.org/gateways_list.php?a=view&id=1481>
44.130.104.2   2018-04-15 16:36:26  DL2QT   <https://portal.ampr.org/gateways_list.php?a=view&id=1613>
44.130.105.1   2017-10-02 18:35:03  DG8NGN  <https://portal.ampr.org/gateways_list.php?a=view&id=1482>
44.130.106.1   2017-10-02 18:35:32  DG8NGN  <https://portal.ampr.org/gateways_list.php?a=view&id=1483>
44.130.107.1   2017-10-02 18:35:55  DG8NGN  <https://portal.ampr.org/gateways_list.php?a=view&id=1484>
44.131.14.253  2017-05-06 07:29:35  M6XCV   <https://portal.ampr.org/gateways_list.php?a=view&id=1382>
44.136.150.2   2019-03-11 01:55:46  VK4AA   <https://portal.ampr.org/gateways_list.php?a=view&id=1386>
44.137.8.1     2018-10-01 09:41:51  PA0HWB  <https://portal.ampr.org/gateways_list.php?a=view&id=1707>
44.137.44.118  2018-01-14 23:25:39  PD2LED  <https://portal.ampr.org/gateways_list.php?a=view&id=1551>
44.137.48.78   2019-03-01 09:26:45  PE1OWG  <https://portal.ampr.org/gateways_list.php?a=view&id=1806>
44.144.46.240  2017-11-07 13:21:48  ON7AVC  <https://portal.ampr.org/gateways_list.php?a=view&id=1505>
44.151.38.60   2018-05-13 11:24:03  F4FLQ   <https://portal.ampr.org/gateways_list.php?a=view&id=1631>
44.151.38.61   2018-05-13 11:25:42  F4FLQ   <https://portal.ampr.org/gateways_list.php?a=view&id=1632>
44.151.38.62   2018-05-13 11:26:32  F4FLQ   <https://portal.ampr.org/gateways_list.php?a=view&id=1633>
44.159.112.16  2018-08-17 19:28:16  9V1AN   <https://portal.ampr.org/gateways_list.php?a=view&id=1683>
Most of them have "no subnets", and I think they were only created by people who were
experimenting, tried some options in the gateway and finally gave up for the day.
Only these actually have subnets routed:
44.24.172.41
44.94.17.129
44.130.104.1
44.130.105.1
44.130.106.1
44.130.107.1
44.131.14.253
44.136.150.2
Several of those route a subnet that contains the actual gateway, and often these subnets
are so small that they cannot be BGP-routed on the internet, so they are likely invalid as
well.
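To make the two conditions above concrete, here is a minimal sketch of how such entries could be flagged. It is not taken from the portal or from ampr-ripd. The function name `check_route`, the /24 limit (prefixes longer than /24 are generally not accepted in the global IPv4 routing table) and the addresses in the usage lines are all assumptions made for illustration; they are not real gateway data.

```python
import ipaddress

# Assumption: prefixes longer than /24 are generally filtered on the internet.
MAX_BGP_PREFIXLEN = 24


def check_route(gateway_ip: str, subnet: str) -> list[str]:
    """Return a list of problems for one (gateway, routed subnet) pair.

    Hypothetical helper for illustration only.
    """
    gw = ipaddress.ip_address(gateway_ip)
    net = ipaddress.ip_network(subnet)
    problems = []
    if gw in net:
        # The tunnel endpoint sits inside the subnet that is routed via it.
        problems.append("subnet contains its own gateway")
        if net.prefixlen > MAX_BGP_PREFIXLEN:
            # The gateway could only be reached over the internet if this
            # subnet were announced via BGP, which it is too small for.
            problems.append("subnet too small to be BGP-routed")
    return problems


# Made-up values, not taken from the gateway list.
print(check_route("44.128.1.1", "44.128.1.0/28"))
print(check_route("192.0.2.1", "44.128.1.0/28"))
```

The second call returns an empty list because the gateway address lies outside the routed subnet, which is the normal, unproblematic case.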
This whole thing causes quite some risk of malfunction, loops, etc., as was demonstrated
this week.
I propose that the existing gateways in the above list with no subnets (those not in the
second list) be removed, and that the portal be changed to again disallow the creation of
new gateways within net-44.
Then, the valid users should be encouraged to find a gateway IP outside of net-44 and this
whole thing be phased out.
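The check being proposed is simple. A rough sketch of what it could look like is given here, assuming that "net-44" means the whole 44.0.0.0/8 block. The names `NET44`, `gateway_inside_net44` and `validate_new_gateway` are hypothetical; this is not the portal's actual code.

```python
import ipaddress

# Assumption: net-44 is treated here as the complete 44.0.0.0/8 block.
NET44 = ipaddress.ip_network("44.0.0.0/8")


def gateway_inside_net44(gateway_ip: str) -> bool:
    """True if the proposed external gateway address lies inside net-44."""
    return ipaddress.ip_address(gateway_ip) in NET44


def validate_new_gateway(gateway_ip: str) -> None:
    """Reject a new gateway entry whose external IP is inside net-44."""
    if gateway_inside_net44(gateway_ip):
        raise ValueError(
            f"gateway {gateway_ip} is inside {NET44}; "
            "use an external address outside net-44"
        )


# 44.137.8.1 appears in the list above; 192.0.2.1 is a documentation address.
print(gateway_inside_net44("44.137.8.1"))  # True
print(gateway_inside_net44("192.0.2.1"))   # False
```

Running the same membership test over the existing gateway list would reproduce the list of entries shown earlier.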
Too much time has been wasted on debugging this problem already, and what advantage does
it really bring?
Rob