[-explicit Cc to Dave, +Cc to Chris]
On Sun, May 5, 2024 at 8:00 PM lleachii(a)aol.com <lleachii(a)aol.com> wrote:
> Yea, odd. I never received the queries that show as
> timed out on your output (i.e. never forwarded to me by AMPRGW):
Thanks for confirming, Lynwood. Your results match what I saw on my
AMPRNet gateway, as well.
That two different hosts are showing the same results strengthens the
argument that it's something upstream of both of us; in this case, the
obvious culprit is AMPRGW (though of course it could be something
between my VPS and AMPRGW, but I think that's unlikely for the reasons
I mentioned in my previous email).
Chris, any ideas here?
- Dan C.
root@OpenWrt:~# tcpdump -vvvn -i tunl0 host 44.60.44.3 and host 166.84.136.80
tcpdump: listening on tunl0, link-type RAW (Raw IP), snapshot length 262144 bytes
23:44:17.337340 IP (tos 0x28, ttl 49, id 27212, offset 0, flags [DF], proto TCP (6), length 64)
    166.84.136.80.25518 > 44.60.44.3.53: Flags [S], cksum 0xe682 (correct), seq 686979005, win 16384, options [mss 1460,nop,nop,sackOK,nop,wscale 6,nop,nop,TS val 1088111398 ecr 0], length 0
23:44:24.168207 IP (tos 0x28, ttl 49, id 43802, offset 0, flags [none], proto UDP (17), length 66)
    166.84.136.80.16498 > 44.60.44.3.53: [udp sum ok] 5807+ PTR? 8.8.8.8.in-addr.arpa. (38)
23:44:24.581889 IP (tos 0x0, ttl 63, id 32318, offset 0, flags [none], proto UDP (17), length 90)
    44.60.44.3.53 > 166.84.136.80.16498: [udp sum ok] 5807 q: PTR? 8.8.8.8.in-addr.arpa. 1/0/0 8.8.8.8.in-addr.arpa. [1d] PTR dns.google. (62)
23:44:46.903062 IP (tos 0x28, ttl 49, id 60097, offset 0, flags [none], proto UDP (17), length 70)
    166.84.136.80.30975 > 44.60.44.3.53: [udp sum ok] 2214+ PTR? 10.48.44.44.in-addr.arpa. (42)
23:44:46.910643 IP (tos 0x0, ttl 63, id 35006, offset 0, flags [none], proto UDP (17), length 104)
    44.60.44.3.53 > 166.84.136.80.30975: [udp sum ok] 2214* q: PTR? 10.48.44.44.in-addr.arpa. 1/0/0 10.48.44.44.in-addr.arpa. [5m] PTR tops20.kz2x.ampr.org. (76)
23:46:47.357818 IP (tos 0x28, ttl 49, id 20737, offset 0, flags [none], proto UDP (17), length 70)
    166.84.136.80.4397 > 44.60.44.3.53: [udp sum ok] 29354+ PTR? 10.48.44.44.in-addr.arpa. (42)
23:46:47.363206 IP (tos 0x0, ttl 63, id 49394, offset 0, flags [none], proto UDP (17), length 104)
    44.60.44.3.53 > 166.84.136.80.4397: [udp sum ok] 29354* q: PTR? 10.48.44.44.in-addr.arpa. 1/0/0 10.48.44.44.in-addr.arpa. [5m] PTR tops20.kz2x.ampr.org. (76)
23:47:05.189580 IP (tos 0x28, ttl 49, id 42467, offset 0, flags [none], proto UDP (17), length 68)
    166.84.136.80.17824 > 44.60.44.3.53: [udp sum ok] 64416+ PTR? 44.1.1.44.in-addr.arpa. (40)
23:47:05.197534 IP (tos 0x0, ttl 63, id 52687, offset 0, flags [none], proto UDP (17), length 93)
    44.60.44.3.53 > 166.84.136.80.17824: [udp sum ok] 64416* q: PTR? 44.1.1.44.in-addr.arpa. 1/0/0 44.1.1.44.in-addr.arpa. [1m] PTR ns.ardc.net. (65)
23:47:13.291538 IP (tos 0x28, ttl 49, id 38510, offset 0, flags [none], proto UDP (17), length 68)
    166.84.136.80.8044 > 44.60.44.3.53: [udp sum ok] 48056+ PTR? 1.0.44.44.in-addr.arpa. (40)
23:47:13.296700 IP (tos 0x0, ttl 63, id 53664, offset 0, flags [none], proto UDP (17), length 103)
    44.60.44.3.53 > 166.84.136.80.8044: [udp sum ok] 48056* q: PTR? 1.0.44.44.in-addr.arpa. 1/0/0 1.0.44.44.in-addr.arpa. [5m] PTR gw.hamgatema.ampr.org. (75)
23:49:42.234256 IP (tos 0x28, ttl 49, id 6829, offset 0, flags [none], proto UDP (17), length 70)
    166.84.136.80.13216 > 44.60.44.3.53: [udp sum ok] 28034+ PTR? 1.21.182.44.in-addr.arpa. (42)
23:49:42.240120 IP (tos 0x0, ttl 63, id 5372, offset 0, flags [none], proto UDP (17), length 98)
    44.60.44.3.53 > 166.84.136.80.13216: [udp sum ok] 28034* q: PTR? 1.21.182.44.in-addr.arpa. 1/0/0 1.21.182.44.in-addr.arpa. [5m] PTR yo2tm.ampr.org. (70)
On Sunday, May 5, 2024 at 07:50:48 PM EDT, Dan Cross <crossd(a)gmail.com> wrote:
On Sun, May 5, 2024 at 7:29 PM lleachii(a)aol.com <lleachii(a)aol.com> wrote:
> I'm now running tcpdump on your SRC IP.
> I normally don't allow DNS queries from the Public - so I've now allowed your SRC
> IP for testing (53/udp)
> I agree it's odd only certain packets "upset" AMPRGW. Test now, and
> I'll provide my results.
Thanks, Lynwood. The results largely mirror my own:
```
: gaja; host -4 44.44.48.1 44.60.44.3
;; connection timed out; no servers could be reached
: gaja; host -4 44.44.48.10 44.60.44.3
Using domain server:
Name: 44.60.44.3
Address: 44.60.44.3#53
Aliases:
10.48.44.44.in-addr.arpa domain name pointer tops20.kz2x.ampr.org.
: gaja; host -4 44.1.1.44 44.60.44.3
Using domain server:
Name: 44.60.44.3
Address: 44.60.44.3#53
Aliases:
44.1.1.44.in-addr.arpa domain name pointer ns.ardc.net.
: gaja; host -4 44.44.0.1 44.60.44.3
Using domain server:
Name: 44.60.44.3
Address: 44.60.44.3#53
Aliases:
1.0.44.44.in-addr.arpa domain name pointer gw.hamgatema.ampr.org.
: gaja; host -4 44.44.48.9 44.60.44.3
;; connection timed out; no servers could be reached
: gaja;
```
(I sent a few other queries, as well.)
It sure seems like the packet contents are getting hashed somewhere,
and a difference of a few bits one way or the other largely determines
what gets through.
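
To make the "few bits" comparison concrete, here is a rough Python sketch that rebuilds the DNS payload for each of the PTR queries above so the bytes can be lined up side by side. It is only an approximation of what `host` puts on the wire: it assumes a plain query with the RD bit set and no EDNS option (which is consistent with the `+` flag and the payload sizes in the tcpdump capture), and it pins the query ID to 0, whereas the real ID, the UDP source port, and the UDP checksum all change on every query. The function names are made up for illustration.

```python
import ipaddress
import struct

# Results as reported in the `host` transcript above.
TIMED_OUT = ["44.44.48.1", "44.44.48.9"]
ANSWERED = ["44.44.48.10", "44.1.1.44", "44.44.0.1"]


def ptr_name(addr: str) -> str:
    """Return the in-addr.arpa name for an IPv4 address."""
    return ipaddress.ip_address(addr).reverse_pointer


def encode_qname(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_ptr_query(addr: str, query_id: int = 0) -> bytes:
    """Build a DNS PTR query payload (assumption: RD set, no EDNS)."""
    # Header: ID, flags (0x0100 = recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question: QNAME, QTYPE=12 (PTR), QCLASS=1 (IN).
    question = encode_qname(ptr_name(addr)) + struct.pack("!HH", 12, 1)
    return header + question


def differing_offsets(a: bytes, b: bytes) -> list:
    """List byte offsets at which two payloads differ (up to the shorter one)."""
    return [i for i in range(min(len(a), len(b))) if a[i] != b[i]]


if __name__ == "__main__":
    for status, addrs in (("timed out", TIMED_OUT), ("answered", ANSWERED)):
        for addr in addrs:
            payload = build_ptr_query(addr)
            # Skip the 12-byte header; it is identical for every query here.
            print(f"{status:9}  {addr:12}  {len(payload):3d} bytes  {payload[12:].hex()}")
```

Since the ID and source port are not reproduced, this can only show how the question sections differ; it cannot say what AMPRGW actually keys on.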
- Dan C.
> On Sun, May 5, 2024 at 6:00 PM lleachii(a)aol.com <lleachii(a)aol.com> wrote:
> > Dan,
> >
> > Yes - the archives show we experienced packet loss in November 2020 and began
> > to discuss it on the reflector. The losses were mostly UDP.
>
> Interesting.
>
> What's curious about this (to me, anyway) is that it's not _just_
> packet loss, but (seemingly?) only loss of very specific packets. If
> the problem were just random UDP packet loss, I'd expect some of the
> queries that always succeed to fail occasionally, and some of those
> that always fail to succeed occasionally. Instead, I see repeatable
> patterns of success and failure. Indeed, I can alternate between
> queries that succeed and queries that fail, and the results are very
> consistent.
>
> Curiously, in between my original message and now, some queries
> started working (consistently!) while others still consistently fail.
> Also, just as a data point, the note about UDP reminded me that I
> ought to test over TCP, and that works reliably for every case that I
> tried.
>
> > If I understand the issue you describe, any of us can help you test by running
> > tcpdump on our tunl0 interface to determine if we receive 'PTR?' packets thru
> > AMPRGW from your public IP, correct?
> >
> > tcpdump -vvvn -i tunl0 udp and port 53 and host 44.60.44.3 and host xxx.xxx.xxx.xxx
> >
> > where xxx.xxx.xxx.xxx == a Public IP
>
> I think that's right; I've been running tests from 166.84.136.80. I
> just tried to send a couple of queries to `dns-mdc.ampr.org` but got
> no response.
>
> > I have seen various failures in the past.
> >
> > I have one theory. Some administrators blacklist the 44 space.
>
> If that were the case, I'd expect all success or all failure, or some
> random combination of success and failure.
>
> But instead, I see very specific success and failure cases, with no
> discernible pattern to what works and what doesn't. It doesn't appear
> to be totally random, nor is it an all-or-nothing sort of thing.
>
> I can think of a few possibilities/questions.
>
> 1. Are these packets even making it to the upstream side of AMPRGW? I
> have no way of knowing, really, and it's possible AMPRGW never sees
> them. But why just these packets containing these specific PTR
> queries? I suspect most of them arrive at UCSD upstream, but don't get
> passed.
> 2. Does AMPRGW block incoming DNS requests to non-authoritative
> servers? Or maybe there's an allowlist somewhere I'm not on? This
> seems unlikely; first, no one seems to know anything about that and
> I'd figure that they would if that sort of thing existed. Second, if
> that were the case, I'd expect all DNS queries to fail, or at least
> all of a specific type (e.g., all `PTR?` queries or something like
> that). But that doesn't fit the observed data.
> 3. Could it be that AMPRGW is doing some kind of deep packet
> inspection and filtering queries that match certain, specific,
> filters? I find this hard to believe; from what I know of AMPRGW
> (which admittedly isn't THAT much, mostly reading between the lines of
> old posts by Brian Kantor), I just don't think it's that kind of
> system.
>
> What seems more likely to me is that there's something specific, but
> incidental, about the packets containing these queries that's tickling
> AMPRGW in just the right way that it's dropping them. I looked at the
> various stat files on gw.ampr.org/private, but nothing sticks out to
> me as obvious.
>
> Thanks again for the response, Lynwood.
>
> - Dan C.
>
>
>
> > On Sunday, May 5, 2024 at 04:42:33 PM EDT, Dan Cross via 44net <44net(a)mailman.ampr.org> wrote:
> >
> >
> > On Sun, May 5, 2024 at 4:38 PM Dave Gingrich <dave(a)dcg.us> wrote:
> >
> > They work fine using any public nameserver.
> >
> >
> > Sure! But that wasn’t the question. ;-) The question is about queries directed
> > to a specific server not transiting AMPRGW.
> >
> > - Dan C.
> >
> >