# Summary
Given that DNS has been a popular topic of discussion lately, I thought it might be useful to try and put some words about it on the Wiki. In preparation for this, I was playing around with the recursive DNS server on my subnet, and I ran into some strange behavior that I don't understand; perhaps someone else knows what's going on? To my eye, it looks like some types of PTR queries are being dropped at or before AMPRGW; this behavior is repeatable.
# Setup and Context
* I have a machine on my allocated subnet (44.44.48.29) that is running a recursive, caching DNS server (that is, a resolver).
* The resolver is configured with stub zones that forward requests for ampr.org, 44.in-addr.arpa and 128.44.in-addr.arpa to the authoritative servers for those zones listed on the Wiki.
* The resolver is configured to respond to requests issued over both IPv4 and IPv6.
* Machines on my subnet are configured to query this server.
* I have configured my firewall to allow access to the `domain` port (e.g., UDP/53) on that machine from a few places on the Internet, but not all.
# Observations
What I observe is inconsistent, but repeatable, failures when issuing PTR queries for certain IP addresses via IPv4 from the external Internet.
* Forward queries from all sources seem to work fine. At least, I haven't observed any that fail (other than for, say, non-existent hostnames).
* Similarly, PTR queries issued from the hosts on my subnet all seem to work as expected.
* PTR queries issued over IPv6 seem to work fine from any source.
* PTR queries issued over IPv4 from other networks locally connected to my home network (I have several ethernets and TCP/IP networks at home, interconnected with a set of routers and switches) all seem to work fine.
* PTR queries for _some_ IP addresses issued over IPv4 from external machines on the Internet work reliably. For example, I can reliably query for 44.1.1.44, 44.0.0.1, 127.0.0.1, and 8.8.8.8 and I don't believe I've ever seen any of these queries fail.
* PTR queries for _most_ 44Net IP addresses issued over IPv4 from external machines on the Internet fail reliably. For example, I cannot query for 44.44.48.1 (my gateway), or 44.182.21.1 (YO2LOJ's machine). I don't believe I have ever seen any of these queries succeed.
Note that the only thing that differs in the last two cases is the address I'm trying to resolve.
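For reference, the address only shows up in the query as the reversed `in-addr.arpa` name. A minimal Python sketch (the `ptr_name` helper and the two lists are just illustrative names; the addresses are the ones from the list above) prints the name each lookup asks about:

```python
import ipaddress

# Addresses taken from the observations above.
WORKING = ["44.1.1.44", "44.0.0.1", "127.0.0.1", "8.8.8.8"]
FAILING = ["44.44.48.1", "44.182.21.1"]


def ptr_name(addr: str) -> str:
    """Return the in-addr.arpa name that a PTR lookup for addr asks about."""
    return ipaddress.ip_address(addr).reverse_pointer


if __name__ == "__main__":
    for label, addrs in (("works", WORKING), ("fails", FAILING)):
        for a in addrs:
            print(f"{label:5}  {a:<12} -> {ptr_name(a)}")
```

This only shows what differs between the two groups on the wire; it does not by itself explain why one group is dropped.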
Now the interesting part. For the failing queries, I've observed that the actual query traffic _never makes it to my gateway from UCSD_. That is, the encapsulated datagram carrying the UDP packet containing the query never makes it to the external gateway.
I issued these queries with the `host` tool.
To verify this, I ran `tcpdump` in several places:

1. At the source machine where I issued the DNS queries: `tcpdump -vvn host 44.44.48.29`
2. On my gateway, examining the encapsulated traffic on the external interface: `tcpdump -vvni cnmac1 ip proto 4 and 'ip[29:1] == 17 and (ip[42:2] == 53 or ip[40:2] == 53)'` (Note the `ip[...]` expressions are matching in the encapsulated IP and UDP headers.)
3. On the DNS server machine itself: `tcpdump -vvn udp port domain or tcp port domain`
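To make the `ip[...]` offsets in the gateway filter less mysterious, here is a small Python sketch that builds a synthetic IP-in-IP packet and checks the same byte positions. It assumes neither IP header carries options (so each is exactly 20 bytes); if options are present the fixed offsets no longer line up. The helper names are made up for illustration, the addresses are borrowed from the captures in this post, and checksums are left at zero since only the offsets matter here.

```python
import struct

IPPROTO_IPIP = 4   # outer header: IP-in-IP encapsulation
IPPROTO_UDP = 17   # inner header: UDP


def ipv4_header(proto: int, src: bytes, dst: bytes, payload_len: int) -> bytes:
    """Build a 20-byte IPv4 header without options (checksum left at zero)."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45,              # version 4, IHL 5 (20 bytes, no options)
        0,                 # TOS
        20 + payload_len,  # total length
        0, 0,              # id, flags/fragment offset
        64,                # TTL
        proto,             # protocol field, at byte 9 of this header
        0,                 # checksum (not needed for the offset check)
        src, dst,
    )


def encapsulated_udp(sport: int, dport: int) -> bytes:
    """Outer IP (proto 4) + inner IP (proto 17) + bare UDP header."""
    udp = struct.pack("!HHHH", sport, dport, 8, 0)
    inner = ipv4_header(IPPROTO_UDP, bytes([166, 84, 136, 80]),
                        bytes([44, 44, 48, 29]), len(udp)) + udp
    outer = ipv4_header(IPPROTO_IPIP, bytes([169, 228, 34, 84]),
                        bytes([23, 30, 150, 141]), len(inner))
    return outer + inner


def matches_filter(pkt: bytes) -> bool:
    """Mirror: ip proto 4 and ip[29:1] == 17 and (ip[42:2] == 53 or ip[40:2] == 53)."""
    def be16(off: int) -> int:
        return int.from_bytes(pkt[off:off + 2], "big")
    return (pkt[9] == IPPROTO_IPIP        # outer protocol field
            and pkt[29] == IPPROTO_UDP    # inner protocol field: 20 + 9
            and (be16(42) == 53           # inner UDP destination port: 40 + 2
                 or be16(40) == 53))      # inner UDP source port: 40 + 0


if __name__ == "__main__":
    print(matches_filter(encapsulated_udp(35777, 53)))  # query toward the resolver
    print(matches_filter(encapsulated_udp(53, 35777)))  # reply from the resolver
    print(matches_filter(encapsulated_udp(35777, 80)))  # unrelated UDP traffic
```

So byte 29 is the inner header's protocol field, and bytes 40 and 42 are the inner UDP source and destination ports, which is why the filter catches both queries and replies.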
To verify that everything is working as expected, first I issue a successful forward query:
```
: gaja; host -4 kz2x.ampr.org srv.kz2x.ampr.org
Using domain server:
Name: srv.kz2x.ampr.org
Address: 44.44.48.29#53
Aliases:

kz2x.ampr.org has address 44.44.48.2
kz2x.ampr.org has IPv6 address 2603:3005:b04:8144:48:edff:fe9c:c00
: gaja;
```
Note that the query succeeded (the astute reader may notice that that's missing an MX record, but ignore that for now), but let's look at the `tcpdump` output at each stage just to see what it looks like. Note that the data is in my cache, so I omit the details of recursive queries down to the roots, etc.
From the local machine where I issue the query, one sees:
```
20:07:31.444583 166.84.136.80.35777 > 44.44.48.29.53: [udp sum ok] 32707+ A? kz2x.ampr.org.(31) (ttl 64, id 18835, len 59)
20:07:31.603276 44.44.48.29.53 > 166.84.136.80.35777: [udp sum ok] 32707* q: A? kz2x.ampr.org. 1/0/0 kz2x.ampr.org. A 44.44.48.2(47) (ttl 40, id 12176, len 75)
20:07:31.605932 166.84.136.80.22717 > 44.44.48.29.53: [udp sum ok] 7616+ AAAA? kz2x.ampr.org.(31) (ttl 64, id 58081, len 59)
20:07:31.761378 44.44.48.29.53 > 166.84.136.80.22717: [udp sum ok] 7616* q: AAAA? kz2x.ampr.org. 1/0/0 kz2x.ampr.org. AAAA 2603:3005:b04:8144:48:edff:fe9c:c00(59) (ttl 40, id 62606, len 87)
20:07:31.762560 166.84.136.80.28977 > 44.44.48.29.53: [udp sum ok] 58832+ MX? kz2x.ampr.org.(31) (ttl 64, id 41619, len 59)
20:07:31.911483 44.44.48.29.53 > 166.84.136.80.28977: 58832* q: MX? kz2x.ampr.org. 0/1/0 ns: kz2x.ampr.org. SOA[|domain] (ttl 40, id 37627, len 110)
```
From the external interface on my AMPRNet gateway machine, I see:
```
20:07:31.523609 169.228.34.84 > 23.30.150.141: 166.84.136.80.35777 > 44.44.48.29.53: [udp sum ok] 32707+ A? kz2x.ampr.org.(31) [tos 0x28] (ttl 49, id 18835, len 59) (ttl 48, id 486, len 79)
20:07:31.524290 23.30.150.141 > 169.228.34.84: 44.44.48.29.53 > 166.84.136.80.35777: [udp sum ok] 32707* q: A? kz2x.ampr.org. 1/0/0 kz2x.ampr.org. A 44.44.48.2(47) (ttl 63, id 12176, len 75) (ttl 64, id 12041, len 95)
20:07:31.683608 169.228.34.84 > 23.30.150.141: 166.84.136.80.22717 > 44.44.48.29.53: [udp sum ok] 7616+ AAAA? kz2x.ampr.org.(31) [tos 0x28] (ttl 49, id 58081, len 59) (ttl 48, id 6515, len 79)
20:07:31.684262 23.30.150.141 > 169.228.34.84: 44.44.48.29.53 > 166.84.136.80.22717: 7616* q: AAAA? kz2x.ampr.org. 1/0/0 kz2x.ampr.org. AAAA[|domain] (ttl 63, id 62606, len 87) (ttl 64, id 48275, len 107)
20:07:31.836445 169.228.34.84 > 23.30.150.141: 166.84.136.80.28977 > 44.44.48.29.53: [udp sum ok] 58832+ MX? kz2x.ampr.org.(31) [tos 0x28] (ttl 49, id 41619, len 59) (ttl 48, id 6629, len 79)
20:07:31.837067 23.30.150.141 > 169.228.34.84: 44.44.48.29.53 > 166.84.136.80.28977: 58832* q: MX? kz2x.ampr.org. 0/1/0 ns: kz2x.ampr.org. SOA[|domain] (ttl 63, id 37627, len 110) (ttl 64, id 12165, len 130)
```
And on the DNS server machine itself:
```
20:07:31.520610 166.84.136.80.35777 > 44.44.48.29.53: [udp sum ok] 32707+ A? kz2x.ampr.org.(31) [tos 0x28] (ttl 48, id 44358, len 59)
20:07:31.520796 44.44.48.29.53 > 166.84.136.80.35777: [udp sum ok] 32707* q: A? kz2x.ampr.org. 1/0/0 kz2x.ampr.org. A 44.44.48.2(47) (ttl 64, id 33360, len 75)
20:07:31.680599 166.84.136.80.22717 > 44.44.48.29.53: [udp sum ok] 7616+ AAAA? kz2x.ampr.org.(31) [tos 0x28] (ttl 48, id 23293, len 59)
20:07:31.680766 44.44.48.29.53 > 166.84.136.80.22717: [udp sum ok] 7616* q: AAAA? kz2x.ampr.org. 1/0/0 kz2x.ampr.org. AAAA 2603:3005:b04:8144:48:edff:fe9c:c00(59) (ttl 64, id 31261, len 87)
20:07:31.833404 166.84.136.80.28977 > 44.44.48.29.53: [udp sum ok] 58832+ MX? kz2x.ampr.org.(31) [tos 0x28] (ttl 48, id 44916, len 59)
20:07:31.833578 44.44.48.29.53 > 166.84.136.80.28977: 58832* q: MX? kz2x.ampr.org. 0/1/0 ns: kz2x.ampr.org. SOA[|domain] (ttl 64, id 28444, len 110)
```
These results are all expected. Similarly for a successful reverse query:
```
: gaja; host -4 44.1.1.44 srv.kz2x.ampr.org
Using domain server:
Name: srv.kz2x.ampr.org
Address: 44.44.48.29#53
Aliases:

44.1.1.44.in-addr.arpa domain name pointer ns.ardc.net.
: gaja;
```
As seen from the source host:
```
20:13:37.445021 166.84.136.80.4339 > 44.44.48.29.53: [udp sum ok] 61470+ PTR? 44.1.1.44.in-addr.arpa.(40) (ttl 64, id 34902, len 68)
20:13:37.609712 44.44.48.29.53 > 166.84.136.80.4339: [udp sum ok] 61470 q: PTR? 44.1.1.44.in-addr.arpa. 1/0/0 44.1.1.44.in-addr.arpa. PTR ns.ardc.net.(65) (ttl 40, id 15635, len 93)
```
From the gateway:
```
20:13:37.523609 169.228.34.84 > 23.30.150.141: 166.84.136.80.4339 > 44.44.48.29.53: [udp sum ok] 61470+ PTR? 44.1.1.44.in-addr.arpa.(40) [tos 0x28] (ttl 49, id 34902, len 68) (ttl 48, id 13407, len 88)
20:13:37.533691 23.30.150.141 > 169.228.34.84: 44.44.48.29.53 > 166.84.136.80.4339: 61470 q: PTR? 44.1.1.44.in-addr.arpa. 1/0/0 44.1.1.44.in-addr.arpa. PTR[|domain] (ttl 63, id 15635, len 93) (ttl 64, id 52552, len 113)
```
And at the DNS server:
```
20:13:37.520500 166.84.136.80.4339 > 44.44.48.29.53: [udp sum ok] 61470+ PTR? 44.1.1.44.in-addr.arpa.(40) [tos 0x28] (ttl 48, id 22620, len 68)
20:13:37.520691 44.44.48.29.53 > 166.84.136.80.4339: [udp sum ok] 61470 q: PTR? 44.1.1.44.in-addr.arpa. 1/0/0 44.1.1.44.in-addr.arpa. PTR ns.ardc.net.(65) (ttl 64, id 59842, len 93)
```
Okay, but what about a failing query? Let's try one:
```
: gaja; host -4 44.44.48.29 srv.kz2x.ampr.org
;; connection timed out; no servers could be reached
: gaja;
```
Uh oh. As seen from the source host:
```
20:15:26.775880 166.84.136.80.29535 > 44.44.48.29.53: [udp sum ok] 39502+ PTR? 29.48.44.44.in-addr.arpa.(42) (ttl 64, id 59475, len 70)
20:15:27.776870 166.84.136.80.29535 > 44.44.48.29.53: [udp sum ok] 39502+ PTR? 29.48.44.44.in-addr.arpa.(42) (ttl 64, id 40687, len 70)
```
But at the gateway, nothing at all is seen: no relevant traffic arrives from AMPRGW, and so, of course, nothing reaches the DNS server either.
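For anyone who wants to reproduce this without `host`, the following is a minimal Python sketch that builds a bare PTR query by hand and sends it over IPv4 UDP. The function names, the fixed query ID, and the 2-second timeout are arbitrary choices for illustration, not anything taken from the setup above. The query sets only the recursion-desired flag (the `+` in the `tcpdump` output) and carries no EDNS record, which appears to match the 40- and 42-byte query sizes in the captures.

```python
import ipaddress
import socket
import struct

QTYPE_PTR = 12
QCLASS_IN = 1


def build_ptr_query(addr: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS PTR query (recursion desired, no EDNS)."""
    name = ipaddress.ip_address(addr).reverse_pointer
    # Header: id, flags (RD set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: length-prefixed labels terminated by the zero-length root label.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in name.split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", QTYPE_PTR, QCLASS_IN)


def probe(server: str, addr: str, timeout: float = 2.0) -> bool:
    """Send one PTR query over IPv4 UDP; return True if any reply arrives in time."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        s.sendto(build_ptr_query(addr), (server, 53))
        try:
            s.recvfrom(4096)
            return True
        except socket.timeout:
            return False


if __name__ == "__main__":
    server = "44.44.48.29"  # the resolver described in this post
    for addr in ("44.1.1.44", "44.0.0.1", "44.44.48.1", "44.182.21.1", "44.44.48.29"):
        print(addr, "reply" if probe(server, addr) else "timeout")
```

As a sanity check, `build_ptr_query("44.44.48.29")` is 42 bytes long, so with the 8-byte UDP header and a 20-byte IP header the datagram is 70 bytes, the same `len 70` seen at the source host. Note that the firewall on that resolver only admits a few external sources, so a timeout from an arbitrary host does not necessarily indicate the same problem.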
So it appears that these queries are being lost, either at AMPRGW or before. Has anyone seen this before? Is it expected?
- Dan C.