Tunnel mesh is (mostly) down


Rob Janssen

3 Jan 2015

4:20 a.m.

The tunnel mesh went down today at about 08:20 UTC. Most of our routes have disappeared and are no longer being advertised on RIP. The portal.ampr.org site is not responding anymore.

It looks like the portal is no longer distributing correct information to the RIP server and so the RIP server sends incomplete broadcasts and the ampr-ripd deletes the routes.

Is there some fallback scenario, e.g. loading a last correct list of routes by the RIP server to make the network come back up in the state it was just before the mishap?

Rob


SP2L

3 Jan

4:28 a.m.

Hello Rob.

Just logged in to portal.ampr.org - seems to be O.K. The last encap.txt file generated by my ampr-ripd program is dated Jan 3rd 2015 10:20 LT, length 21710 - looks O.K. as well.

Best regards.

-- Tom - SP2L (ex sp2lob) ------------------------------------ It is nice to be important. But it is more important to be nice!

Arno Verhoeven

6:30 a.m.

On 03-01-15 10:20, Rob Janssen wrote:

The tunnel mesh went down today at about 08:20 UTC. Most of our routes have disappeared, are no longer being advertised on RIP.

Maybe temporary and already resolved, but if I look now I have what seems a full set of routes.


Is there some fallback scenario, e.g. loading a last correct list of routes by the RIP server to make the network come back up in the state it was just before the mishap?

Rotate /var/lib/ampr-ripd/encap.txt (use logrotate?) and manually select the most recent, but intact, encap file to reinstate routes?
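A minimal sketch of the "pick the most recent intact copy" step, assuming rotated copies sit next to the live file as `encap.txt`, `encap.txt.1`, and so on, and that route entries are lines starting with `route `. The function names and the `min_routes` threshold are hypothetical; "intact" here just means "has at least that many route lines".

```python
from pathlib import Path
from typing import Optional


def count_routes(text: str) -> int:
    """Count lines that look like route entries (assumed to start with 'route ')."""
    return sum(1 for line in text.splitlines() if line.strip().startswith("route "))


def latest_intact(backup_dir: Path, min_routes: int) -> Optional[Path]:
    """Return the newest encap.txt* file holding at least min_routes entries."""
    candidates = sorted(
        backup_dir.glob("encap.txt*"),
        key=lambda p: p.stat().st_mtime,
        reverse=True,  # newest first
    )
    for path in candidates:
        if count_routes(path.read_text(errors="replace")) >= min_routes:
            return path
    return None  # no rotated copy passed the check
```

The returned path would still have to be loaded by hand; nothing here touches the routing table.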

//Arno

Eric Fort

11:35 a.m.

How about eliminating this issue permanently and moving to voluntary peering between gateways? "Know thy neighbor and be responsible for your peers and routes" seems to work really well for everyone else, yet amprnet still relies upon route distribution from a single source.

Eric AF6EP

44Net mailing list 44Net@hamradio.ucsd.edu http://hamradio.ucsd.edu/mailman/listinfo/44net

Brian

11:45 a.m.

Eric;

On Sat, 2015-01-03 at 08:35 -0800, Eric Fort wrote:

How about eliminating this issue permanently and moving to voluntary peering between gateways? "Know thy neighbor and be responsible for your peers and routes" seems to work really well for everyone else, yet amprnet still relies upon route distribution from a single source.

Are you suggesting something such as a possible BGPv2 that all gateways or designated regional gateways could perhaps tunnel broadcast between themselves? This would be interesting and may help with other route issues between those using RIPv2 <-> BGP amprnet sites.

-- If Microsoft intended Windows to be for ham usage, they would have incorporated our protocols into their kernel. 73 de Brian Rogers - N1URO email: n1uro@n1uro.ampr.org Web: http://www.n1uro.net/ Ampr1: http://n1uro.ampr.org/ Ampr2: http://nos.n1uro.ampr.org Linux Amateur Radio Services axMail-Fax & URONode AmprNet coordinator for: Connecticut, Delaware, Maine, Maryland, Massachusetts, New Hampshire, Pennsylvania, Rhode Island, and Vermont.

Eric Fort

12:08 p.m.

On 1/3/15, Brian n1uro@n1uro.ampr.org wrote:

Are you suggesting something such as a possible BGPv2 that all gateways or designated regional gateways could perhaps tunnel broadcast between themselves? This would be interesting and may help with other route issues between those using RIPv2 <-> BGP amprnet sites.

For the most part, yes. Even if just the regional gateways peered, and the smaller nodes then connected to one or more of the regional gateways by arrangement with a given node op, we could essentially get rid of distributing ampr.txt, and dynamic routing issues would then get handled and resolved between peers. The routing protocol used wouldn't even have to be BGP in all cases, though that's likely most appropriate if one considers each node op to basically be an AS unto themselves. Each peer could decide what protocol was appropriate to distribute the routes it knows to its other adjacent peers. Peers would also be free to select the (VPN) tunnel protocol of their choice. By using standards-based methods which are widely deployed as common practice across all users of the IP protocols, we stand to gain personnel and resources that can assist in growing and developing our part of the amateur radio communications hobby.

Eric AF6EP


Marius Petrescu

12:16 p.m.

Connecting via tunnels to a few (2+) neighbours and running BGP with private AS numbers INSIDE ampr would be an answer. It need not be public BGP routing. This works neatly. Apart from the whole IPIP stuff, I have at the moment an L2TP tunnel to the German hamnet and a PPTP one to Luxembourg, both doing BGP dynamic routing (again, these are private ASes, for route forwarding inside the ampr space ONLY).

But for this to happen, everyone must allow forwarding through their gateways. And here I have my doubts.

Marius, YO2LOJ

Michael E Fox - N6MEF

1:08 p.m.

Well, to be clear, the "mesh" is only down if you depend on a single source of information and that source is having a problem.

This is why I don't use RIP. It's a single point of failure. The "mesh" is not.

I use the munge scripts I got from Bob Tenty (modified for my own preferences). I run the script every few hours. I just don't need updates any quicker than that.

I simply added a couple of sanity checks to the munge scripts. One such check is to pass the resulting table through wc to check that the number of routes coming in is not lower than expected. If that is the case, I keep the existing routes.
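The route-count check described above can be sketched roughly as follows. This is not the munge script itself, only an illustration of the same idea: the function names and the `expected_min` parameter are hypothetical, and each non-comment line of the download is assumed to be one route.

```python
def parse_routes(text: str) -> list[str]:
    """Return non-empty, non-comment lines; each is assumed to be one route."""
    routes = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            routes.append(line)
    return routes


def select_table(new_text: str, current: list[str], expected_min: int) -> list[str]:
    """Keep the current table when the new download looks truncated."""
    new_routes = parse_routes(new_text)
    if len(new_routes) < expected_min:
        return current  # too few routes: treat the download as suspect
    return new_routes
```

Picking `expected_min` a little below the usual table size leaves room for normal churn while still catching a mostly empty download.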

BTW, my intention is not to be arrogant, but to point out that every so often, someone suggests a solution of hubbing through some number of BGP sites all of which introduce a single point of failure of a different type for those dependent on that hub.

Michael N6MEF


Jerald A DeLong

1:28 p.m.

There is no reason both can't coexist.

Jerry, KD4YAL

Michael E Fox - N6MEF

1:50 p.m.

True. But the point is, people have been proposing a system of local BGP hubs as a solution to the problem of a single point of failure. But it's not a solution. It just moves the problem.

Also, people bemoan the "proprietary" nature of what we're doing. But if you think about most large commercial enterprises, they have their own private VPN routes; they're not dependent on anyone else for that either. The modern catch-phrase, particularly in the LAN, is SDN (software defined networking). Sure, it's wrapped in sophisticated GUIs and such, but in reality, it's not all that different. In fact, on some levels, "SDN" is not all that different than the old IBM SNA VTAM gens!

The key is this: Right now, when my packets leave my site, I don't have to worry about them routing through some intermediate site belonging to a ham who probably doesn't monitor 24x7 and who might be at work, sick, on vacation, watching the Superbowl, busy with "honey-do's", or just not feeling like working on the "hobby" at the moment.

If there's a way to use standard protocols to make what we have more dynamic without introducing single points of failure, then that would definitely be the way to go. But we shouldn't weaken what is currently very stable (if you don't rely on a single point of failure).

Aside: For the RIP function, perhaps there's a way to introduce a method for RIP to detect incomplete updates. For example, perhaps the program could be augmented to allow the user to define a list of important routes that can't be deleted and, if missing from the update, would mean there's something wrong with the update. Just a random thought.
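One way the "important routes" idea might look, as a sketch only. It is not a feature of ampr-ripd; the function names are hypothetical, and any prefixes passed in are whatever the operator chooses to protect.

```python
import ipaddress


def missing_protected(update_prefixes: list[str], protected_prefixes: list[str]) -> list[str]:
    """Return the protected prefixes that the update does not announce."""
    announced = {ipaddress.ip_network(p) for p in update_prefixes}
    return [p for p in protected_prefixes if ipaddress.ip_network(p) not in announced]


def accept_update(update_prefixes: list[str], protected_prefixes: list[str]) -> bool:
    """Accept the update only if every protected prefix is still present."""
    return not missing_protected(update_prefixes, protected_prefixes)
```

The comparison is an exact prefix match; a protected /16 that arrives split into smaller subnets would be reported as missing.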

Michael


Mitch Winkle

3:28 p.m.

On Sat, Jan 3, 2015 at 1:50 PM, Michael E Fox - N6MEF n6mef@mefox.org wrote:

... In fact, on some levels, "SDN" is not all that different than the old IBM SNA VTAM gens! ...

Oh my. I just got flashbacks and a cold shiver up my spine.

Marius Petrescu

9:07 p.m.

There is a simple method for this. Define your important routes as static routes and run ampr-ripd. They won't be deleted if their RIP counterparts disappear, since ampr-ripd is able to delete only the routes set by itself (the ones that show up as proto 44 under 'ip route list').
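To see which routes ampr-ripd owns and which would survive, the `ip route list` output can be split on the `proto 44` tag mentioned above. A small sketch; the sample lines, addresses and interface names are made up for illustration, and the helper names are hypothetical.

```python
def is_ripd_route(line: str, proto: str = "44") -> bool:
    """True if the line contains the token pair 'proto <proto>'."""
    tokens = line.split()
    return any(a == "proto" and b == proto for a, b in zip(tokens, tokens[1:]))


def split_routes(ip_route_output: str) -> tuple[list[str], list[str]]:
    """Split 'ip route list' text into (ampr-ripd routes, all other routes)."""
    ripd, other = [], []
    for line in ip_route_output.splitlines():
        if not line.strip():
            continue
        (ripd if is_ripd_route(line) else other).append(line)
    return ripd, other


# Illustrative output only, not captured from a real gateway.
SAMPLE = """\
44.0.0.1 via 192.0.2.10 dev tunl0 proto 44 onlink
44.128.0.0/16 via 192.0.2.20 dev tunl0 proto static onlink
default via 198.51.100.1 dev eth0
"""

ripd_routes, other_routes = split_routes(SAMPLE)
```

With the sample above, only the first line lands in `ripd_routes`; the static route and the default route are in `other_routes`, which are the ones ampr-ripd would leave alone.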

Marius, YO2LOJ


Michael E Fox - N6MEF

9:45 p.m.

Ah, <forehead slap>, of course. A true Homer Simpson "Doh!" moment.

But based on the fact that this has happened twice in the last few months, it seems it might also be useful to add some other fault detection/prevention options to ampr-ripd. For example, if the number of routes received in the last update is smaller by some value X than the number of existing routes, then there's likely something wrong. Don't take that as a specific suggestion. It would take some thought to figure out how/what to detect and also what to do about it once detected (ignore the new update, send an email, etc.). Again, just some ideas.
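As a rough illustration of the "smaller by some value X" idea, and with the same caveat that it is not a specific proposal: the names below are hypothetical, and what to do on detection (ignore, email, and so on) is reduced to a single placeholder action.

```python
from enum import Enum


class Action(Enum):
    APPLY = "apply"
    IGNORE_AND_ALERT = "ignore_and_alert"


def judge_update(existing_count: int, received_count: int, max_drop: int) -> Action:
    """Flag an update that would shrink the table by more than max_drop routes."""
    if existing_count - received_count > max_drop:
        return Action.IGNORE_AND_ALERT
    return Action.APPLY
```

Updates that grow the table, or shrink it by `max_drop` or less, are applied unchanged.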

Michael N6MEF


Antonio Querubin

1:50 p.m.

On Sat, 3 Jan 2015, Michael E Fox - N6MEF wrote:


This is why I don't use RIP. It's a single point of failure. The "mesh" is not.

Indeed, the mesh is a multiple point of failure. It creates static blackholes from EVERY participating gateway to any network that's distributing its routes some other more dynamic way.

Antonio Querubin e-mail: tony@lavanauts.org xmpp: antonioquerubin@gmail.com

John Wiseman

4 Jan

4:09 a.m.

Wouldn't the simplest solution be to modify the rip44 process so it doesn't delete routes that haven't been announced for a while, or at least for a much longer period?
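A sketch of what a longer hold time could look like, assuming each route carries a last-seen timestamp. This is not how rip44 or ampr-ripd is implemented; the class, its method names and `hold_seconds` are hypothetical, and the caller passes the clock value explicitly.

```python
class RouteTable:
    """Track when each prefix was last announced and expire only stale ones."""

    def __init__(self, hold_seconds: float) -> None:
        self.hold_seconds = hold_seconds
        self.last_seen: dict[str, float] = {}

    def update(self, prefixes: list[str], now: float) -> None:
        """Record an announcement; prefixes missing from it are left untouched."""
        for prefix in prefixes:
            self.last_seen[prefix] = now

    def expire(self, now: float) -> list[str]:
        """Remove and return prefixes not announced within hold_seconds."""
        stale = [p for p, seen in self.last_seen.items() if now - seen > self.hold_seconds]
        for prefix in stale:
            del self.last_seen[prefix]
        return stale
```

With a hold time of hours, a short run of incomplete broadcasts would leave the table as it was, at the cost of keeping routes for gateways that really have gone away for the same length of time.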

IPIP tunnels and RIP have the major advantage that they allow those who have a dynamic IP address to participate in net44. I feel it is important that we remember that we are radio hams first, and should use solutions that can be used by the majority of hams, not just those network professionals who want to play at being an ISP.

73, John G8BPQ


Marc, LX1DUC

9:46 a.m.

On 2015-01-04 10:09, John Wiseman wrote:

Wouldn't the simplest solution be to modify the rip44 process so it doesn't delete routes that haven't been announced for a while, or at least for a much longer period?

IPIP tunnels and RIP have the major advantage that they allow those who have a dynamic IP address to participate in net44. I feel it is important that we remember that we are radio hams first, and should use solutions that can be used by the majority of hams, not just those network professionals who want to play at being an ISP.

Dynamic endpoints are very well supported by other tunnel types (e.g. PPTP, L2TP, etc.). A full mesh would be difficult to set up, but BGP will work just fine via VPNs. Of course, one endpoint would still need to have a static IP or at least a static hostname.

In Luxembourg, the LARU currently operates a PPTP tunnel to Romania, a GRE tunnel to Hawaii, and inside our network we are using GRE, SSTP, IPIP and PPTP tunnels to connect different sites. We run BGP on top of these tunnels.

Currently a befriended non-profit organization provides Internet access and announces our networks via BGP into the commercial internet.

For 2015, with the arrival of new hardware, we have planned to extend our "direct" peerings with other BGP-capable AMPR sites and to offer "BGP-hosting" for AMPR networks that would like to experiment with BGP.

73 de Marc
