# Summary
Given that DNS has been a popular topic of discussion lately, I
thought it might be useful to put some words about it on the
Wiki. In preparation for this, I was experimenting with the recursive
DNS server on my subnet, and I ran into some strange behavior that I
don't understand; perhaps someone else knows what's going on? To my
eye, it looks like PTR queries for some addresses are being dropped at
or before AMPRGW; this behavior is repeatable.
# Setup and Context
* I have a machine on my allocated subnet (44.44.48.29) that is
running a recursive, caching DNS server (that is, a resolver).
* The resolver is configured with stub zones that forward requests for
ampr.org, 44.in-addr.arpa and 128.44.in-addr.arpa to the authoritative
servers listed on the Wiki for those zones.
* The resolver is configured to respond to requests issued over both
IPv4 and IPv6.
* Machines on my subnet are configured to query this server.
* I have configured my firewall to allow access to the `domain` port
(e.g., UDP/53) on that machine from a few places on the Internet,
but not all.
# Observations
What I observe are failures that are inconsistent across addresses but
repeatable when issuing PTR queries for certain IP addresses via IPv4
from the external Internet.
* Forward queries from all sources seem to work fine. At least, I
haven't observed any that fail (other than for, say, non-existent
hostnames).
* Similarly, PTR queries issued from the hosts on my subnet all seem
to work as expected.
* PTR queries issued over IPv6 seem to work fine from any source.
* PTR queries issued over IPv4 from other networks locally connected
to my home network (I have several ethernets and TCP/IP networks at
home, interconnected with a set of routers and switches) all seem to
work fine.
* PTR queries for _some_ IP addresses issued over IPv4 from external
machines on the Internet work reliably. For example, I can reliably
query for 44.1.1.44, 44.0.0.1, 127.0.0.1, and 8.8.8.8 and I don't
believe I've ever seen any of these queries fail.
* PTR queries for _most_ 44Net IP addresses issued over IPv4 from
external machines on the Internet fail reliably. For example, I
cannot query for 44.44.48.1 (my gateway), or 44.182.21.1 (YO2LOJ's
machine). I don't believe I have ever seen any of these queries
succeed.
Note that the only thing that differs in the last two cases is the
address I'm trying to resolve.
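For reference, the names that these PTR queries actually ask about can be derived with Python's standard `ipaddress` module. This is only an illustrative sketch: the helper name `ptr_name` and the two lists are made up for the example, and the addresses are simply the ones quoted in the bullets above. It shows that the working and failing 44Net lookups differ only in the labels of the queried name.

```python
import ipaddress

# Addresses quoted in the observations above.
WORKING = ["44.1.1.44", "44.0.0.1", "127.0.0.1", "8.8.8.8"]
FAILING = ["44.44.48.1", "44.182.21.1"]


def ptr_name(addr: str) -> str:
    """Return the in-addr.arpa name that a PTR query for addr asks about."""
    return ipaddress.ip_address(addr).reverse_pointer


if __name__ == "__main__":
    for label, addrs in (("works", WORKING), ("fails", FAILING)):
        for addr in addrs:
            print(f"{label:5}  {addr:<12}  {ptr_name(addr)}")
```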
Now the interesting part. For the failing queries, I've observed that
the actual query traffic _never makes it to my gateway from UCSD_.
That is, the encapsulated datagram carrying the UDP packet containing
the query never arrives at the external interface of my gateway.
I issued these queries with the `host` tool.
To verify where the traffic stops, I ran `tcpdump` in several places:
1. At the source machine where I issued the DNS queries:
   `tcpdump -vvn host 44.44.48.29`
2. On my gateway, examining the encapsulated traffic on the external interface:
   `tcpdump -vvni cnmac1 ip proto 4 and 'ip[29:1] == 17 and (ip[42:2] == 53 or ip[40:2] == 53)'`
   (Note that the `ip[...]` expressions match fields in the encapsulated
   IP and UDP headers; the offsets assume that neither IP header carries
   options.)
3. On the DNS server machine itself:
   `tcpdump -vvn udp port domain or tcp port domain`
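The byte offsets in the gateway filter can be checked with a little arithmetic. The following is a rough sketch rather than anything used in the actual capture: it assumes both IPv4 headers are the minimal 20 bytes (no options), builds a synthetic IP-in-IP packet with addresses that appear in this post, and mirrors the filter logic in a hypothetical `matches_filter` helper.

```python
import ipaddress
import struct

IP_HDR_LEN = 20   # assumption: neither IPv4 header carries options
PROTO_OFF = 9     # offset of the protocol byte inside an IPv4 header

INNER_PROTO = IP_HDR_LEN + PROTO_OFF      # 29 -> ip[29:1]
INNER_SPORT = 2 * IP_HDR_LEN              # 40 -> ip[40:2]
INNER_DPORT = 2 * IP_HDR_LEN + 2          # 42 -> ip[42:2]


def ipv4_header(proto: int, src: str, dst: str, payload_len: int) -> bytes:
    """Build a bare 20-byte IPv4 header (checksum left at zero)."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, IP_HDR_LEN + payload_len, 0, 0, 64, proto, 0,
        ipaddress.IPv4Address(src).packed,
        ipaddress.IPv4Address(dst).packed,
    )


def matches_filter(pkt: bytes) -> bool:
    """Mirror the tcpdump expression: IP-in-IP, inner UDP, either port 53."""
    if len(pkt) < INNER_DPORT + 2:
        return False
    sport, dport = struct.unpack_from("!HH", pkt, INNER_SPORT)
    return pkt[PROTO_OFF] == 4 and pkt[INNER_PROTO] == 17 and 53 in (sport, dport)


def sample_packet(inner_proto: int = 17, dport: int = 53) -> bytes:
    """Synthetic encapsulated query (hypothetical payload, real addresses)."""
    udp = struct.pack("!HHHH", 35777, dport, 8, 0)
    inner = ipv4_header(inner_proto, "166.84.136.80", "44.44.48.29", len(udp))
    outer = ipv4_header(4, "169.228.34.84", "23.30.150.141", len(inner + udp))
    return outer + inner + udp


if __name__ == "__main__":
    print(INNER_PROTO, INNER_SPORT, INNER_DPORT)
    print(matches_filter(sample_packet()))
```

If either header did carry options, the fixed offsets would point at the wrong bytes and the filter would miss the packet.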
To verify that everything is working as expected, first I issue a
successful forward query:
```
: gaja; host -4 kz2x.ampr.org srv.kz2x.ampr.org
Using domain server:
Name: srv.kz2x.ampr.org
Address: 44.44.48.29#53
Aliases:
kz2x.ampr.org has address 44.44.48.2
kz2x.ampr.org has IPv6 address 2603:3005:b04:8144:48:edff:fe9c:c00
: gaja;
```
The query succeeded (the astute reader may notice that there is no MX
record, but ignore that for now). Let's look at the `tcpdump` output
at each stage to see what a working exchange looks like. The data is
already in my cache, so I omit the details of the recursive queries
down to the roots, etc.
From the local machine where I issue the query, one sees:
```
20:07:31.444583 166.84.136.80.35777 > 44.44.48.29.53: [udp sum ok] 32707+ A? kz2x.ampr.org.(31) (ttl 64, id 18835, len 59)
20:07:31.603276 44.44.48.29.53 > 166.84.136.80.35777: [udp sum ok] 32707* q: A? kz2x.ampr.org. 1/0/0 kz2x.ampr.org. A 44.44.48.2(47) (ttl 40, id 12176, len 75)
20:07:31.605932 166.84.136.80.22717 > 44.44.48.29.53: [udp sum ok] 7616+ AAAA? kz2x.ampr.org.(31) (ttl 64, id 58081, len 59)
20:07:31.761378 44.44.48.29.53 > 166.84.136.80.22717: [udp sum ok] 7616* q: AAAA? kz2x.ampr.org. 1/0/0 kz2x.ampr.org. AAAA 2603:3005:b04:8144:48:edff:fe9c:c00(59) (ttl 40, id 62606, len 87)
20:07:31.762560 166.84.136.80.28977 > 44.44.48.29.53: [udp sum ok] 58832+ MX? kz2x.ampr.org.(31) (ttl 64, id 41619, len 59)
20:07:31.911483 44.44.48.29.53 > 166.84.136.80.28977: 58832* q: MX? kz2x.ampr.org. 0/1/0 ns: kz2x.ampr.org. SOA[|domain] (ttl 40, id 37627, len 110)
```
From the external interface on my AMPRNet gateway machine, I see:

    20:07:31.523609 169.228.34.84 > 23.30.150.141: 166.84.136.80.35777 > 44.44.48.29.53: [udp sum ok] 32707+ A? kz2x.ampr.org.(31) [tos 0x28] (ttl 49, id 18835, len 59) (ttl 48, id 486, len 79)
    20:07:31.524290 23.30.150.141 > 169.228.34.84: 44.44.48.29.53 > 166.84.136.80.35777: [udp sum ok] 32707* q: A? kz2x.ampr.org. 1/0/0 kz2x.ampr.org. A 44.44.48.2(47) (ttl 63, id 12176, len 75) (ttl 64, id 12041, len 95)
    20:07:31.683608 169.228.34.84 > 23.30.150.141: 166.84.136.80.22717 > 44.44.48.29.53: [udp sum ok] 7616+ AAAA? kz2x.ampr.org.(31) [tos 0x28] (ttl 49, id 58081, len 59) (ttl 48, id 6515, len 79)
    20:07:31.684262 23.30.150.141 > 169.228.34.84: 44.44.48.29.53 > 166.84.136.80.22717: 7616* q: AAAA? kz2x.ampr.org. 1/0/0 kz2x.ampr.org. AAAA[|domain] (ttl 63, id 62606, len 87) (ttl 64, id 48275, len 107)
    20:07:31.836445 169.228.34.84 > 23.30.150.141: 166.84.136.80.28977 > 44.44.48.29.53: [udp sum ok] 58832+ MX? kz2x.ampr.org.(31) [tos 0x28] (ttl 49, id 41619, len 59) (ttl 48, id 6629, len 79)
    20:07:31.837067 23.30.150.141 > 169.228.34.84: 44.44.48.29.53 > 166.84.136.80.28977: 58832* q: MX? kz2x.ampr.org. 0/1/0 ns: kz2x.ampr.org. SOA[|domain] (ttl 63, id 37627, len 110) (ttl 64, id 12165, len 130)

And on the DNS server machine itself:
```
20:07:31.520610 166.84.136.80.35777 > 44.44.48.29.53: [udp sum ok] 32707+ A? kz2x.ampr.org.(31) [tos 0x28] (ttl 48, id 44358, len 59)
20:07:31.520796 44.44.48.29.53 > 166.84.136.80.35777: [udp sum ok] 32707* q: A? kz2x.ampr.org. 1/0/0 kz2x.ampr.org. A 44.44.48.2(47) (ttl 64, id 33360, len 75)
20:07:31.680599 166.84.136.80.22717 > 44.44.48.29.53: [udp sum ok] 7616+ AAAA? kz2x.ampr.org.(31) [tos 0x28] (ttl 48, id 23293, len 59)
20:07:31.680766 44.44.48.29.53 > 166.84.136.80.22717: [udp sum ok] 7616* q: AAAA? kz2x.ampr.org. 1/0/0 kz2x.ampr.org. AAAA 2603:3005:b04:8144:48:edff:fe9c:c00(59) (ttl 64, id 31261, len 87)
20:07:31.833404 166.84.136.80.28977 > 44.44.48.29.53: [udp sum ok] 58832+ MX? kz2x.ampr.org.(31) [tos 0x28] (ttl 48, id 44916, len 59)
20:07:31.833578 44.44.48.29.53 > 166.84.136.80.28977: 58832* q: MX? kz2x.ampr.org. 0/1/0 ns: kz2x.ampr.org. SOA[|domain] (ttl 64, id 28444, len 110)
```
These results are all expected. Similarly for a successful reverse query:
```
: gaja; host -4 44.1.1.44 srv.kz2x.ampr.org
Using domain server:
Name: srv.kz2x.ampr.org
Address: 44.44.48.29#53
Aliases:
44.1.1.44.in-addr.arpa domain name pointer ns.ardc.net.
: gaja;
```
As seen from the source host:
```
20:13:37.445021 166.84.136.80.4339 > 44.44.48.29.53: [udp sum ok] 61470+ PTR? 44.1.1.44.in-addr.arpa.(40) (ttl 64, id 34902, len 68)
20:13:37.609712 44.44.48.29.53 > 166.84.136.80.4339: [udp sum ok] 61470 q: PTR? 44.1.1.44.in-addr.arpa. 1/0/0 44.1.1.44.in-addr.arpa. PTR ns.ardc.net.(65) (ttl 40, id 15635, len 93)
```
From the gateway:
```
20:13:37.523609 169.228.34.84 > 23.30.150.141: 166.84.136.80.4339 > 44.44.48.29.53: [udp sum ok] 61470+ PTR? 44.1.1.44.in-addr.arpa.(40) [tos 0x28] (ttl 49, id 34902, len 68) (ttl 48, id 13407, len 88)
20:13:37.533691 23.30.150.141 > 169.228.34.84: 44.44.48.29.53 > 166.84.136.80.4339: 61470 q: PTR? 44.1.1.44.in-addr.arpa. 1/0/0 44.1.1.44.in-addr.arpa. PTR[|domain] (ttl 63, id 15635, len 93) (ttl 64, id 52552, len 113)
```
And at the DNS server:
```
20:13:37.520500 166.84.136.80.4339 > 44.44.48.29.53: [udp sum ok] 61470+ PTR? 44.1.1.44.in-addr.arpa.(40) [tos 0x28] (ttl 48, id 22620, len 68)
20:13:37.520691 44.44.48.29.53 > 166.84.136.80.4339: [udp sum ok] 61470 q: PTR? 44.1.1.44.in-addr.arpa. 1/0/0 44.1.1.44.in-addr.arpa. PTR ns.ardc.net.(65) (ttl 64, id 59842, len 93)
```
Okay, but what about a failing query? Let's try one:
```
: gaja; host -4 44.44.48.29 srv.kz2x.ampr.org
;; connection timed out; no servers could be reached
: gaja;
```
Uh oh. As seen from the source host:
```
20:15:26.775880 166.84.136.80.29535 > 44.44.48.29.53: [udp sum ok] 39502+ PTR? 29.48.44.44.in-addr.arpa.(42) (ttl 64, id 59475, len 70)
20:15:27.776870 166.84.136.80.29535 > 44.44.48.29.53: [udp sum ok] 39502+ PTR? 29.48.44.44.in-addr.arpa.(42) (ttl 64, id 40687, len 70)
```
But at the gateway, nothing at all is seen: no corresponding traffic
arrives from AMPRGW, and consequently nothing reaches the DNS server
either.
So it appears that these queries are being lost, either at AMPRGW or
before it. Has anyone seen this before? Is it expected?
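For anyone who wants to try reproducing this without `host`, here is a minimal sketch of an equivalent probe using only the Python standard library. The function names `build_ptr_query` and `probe` are hypothetical, the two-second timeout is an arbitrary choice, and the script only distinguishes "got a reply with the matching ID" from "timed out"; it does not parse the answer. The queries it builds are 40 and 42 bytes long for 44.1.1.44 and 44.44.48.29 respectively, matching the `(40)` and `(42)` message sizes in the traces above.

```python
import ipaddress
import random
import socket
import struct


def build_ptr_query(addr: str, query_id: int) -> bytes:
    """Encode a one-question DNS query: PTR, class IN, recursion desired."""
    name = ipaddress.ip_address(addr).reverse_pointer
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 12, 1)  # QTYPE=PTR, QCLASS=IN


def probe(server: str, addr: str, timeout: float = 2.0) -> str:
    """Send one PTR query over IPv4/UDP and report what came back."""
    query_id = random.randrange(0x10000)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_ptr_query(addr, query_id), (server, 53))
        try:
            reply, _ = sock.recvfrom(4096)
        except socket.timeout:
            return "timeout"
    if len(reply) >= 2 and struct.unpack_from("!H", reply)[0] == query_id:
        return "answered"
    return "unexpected reply"


if __name__ == "__main__":
    # Resolver address and test addresses are the ones used in this post.
    for addr in ("44.1.1.44", "44.44.48.29"):
        print(addr, probe("44.44.48.29", addr))
```

Running this from an external IPv4 host while watching the gateway capture should show whether the second query ever arrives encapsulated from AMPRGW.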
- Dan C.