On 2014-08-06 21:31, Rob Janssen wrote:
[..]
> In general: please read the entire mail before starting to comment on
> individual paragraphs, so you don't need to ask questions that are
> answered a few paragraphs down the same mail.
(Ehmm, which exact questions did I supposedly ask that were answered "a
few paragraphs down"!? ;)
> The system is running Debian Wheezy, all up to date, with bind9
> version 9.8.4 plus Debian patches.
Do you mean "I am running the latest Debian stable/testing/unstable" or
do you mean "I took the patches from Debian and compiled them together"?
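If you are not sure, something like this tells you exactly what is
installed (assuming the stock Debian "bind9" package):

  dpkg -s bind9 | grep '^Version:'
  named -v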
> It is not available to the outside so no security worries.
What exactly do you mean by that? How did you make it "not available"?
Which "outside" has no access to it? The ampr.org NSs are out on the
wide Internet, hence they have to be able to send packets back to you.
Did you maybe simply mean that it is non-recursive for non-local clients?
If it is really "not available to the outside" then that might explain
your resolution issues.
Are you allowing both UDP and TCP port 53 replies, for instance?
Note that the replies have to come in; if there is a bad path or a
filter in the way, lookups will be slow or fail outright.
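A quick way to check from the resolver box itself (the nameserver name
is a placeholder, take any one from the delegation):

  dig ampr.org NS
  dig @<one-of-the-ampr.org-NS> ampr.org SOA        # plain UDP query
  dig +tcp @<one-of-the-ampr.org-NS> ampr.org SOA   # same query forced over TCP

If the +tcp variant times out while the UDP one answers fine, port
53/TCP replies are being dropped somewhere on the path.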
> What I see is that it does not cache ampr.org addresses very long, but
> that does not surprise me because the default TTL in the zone is only
> one hour. Of course everything would perform better if the TTL were
> the more usual 24 hours, but undoubtedly there was a good reason to
> set this TTL.
Actually, lots of zones have short TTLs on labels so that those hosts
can be changed quickly. Typically one does set somewhat longer TTLs on
the NS hosts though.
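Purely as a hypothetical zone fragment (not the real ampr.org data),
the usual pattern looks like this:

  $TTL 3600                                  ; default for ordinary host records
  @       86400   IN  NS   ns1.example.net.  ; delegation data kept at a day
  @       86400   IN  NS   ns2.example.net.
  gw        300   IN  A    192.0.2.1         ; a host one may want to move quickly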
> (lately it was useful for me as I changed the external address of the
> machine and the update was propagated quickly in DNS, but in general I
> would think the zone is very static)
It is very common; do check mass-hosted services like Google, Facebook,
Akamai etc. They all have nicely low TTLs as they want to see your
queries and be able to change the answers a lot.
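Easy to observe for yourself:

  dig +noall +answer www.google.com A
  dig +noall +answer www.facebook.com A

The second column is the remaining TTL; for that kind of service it is
typically only a few minutes.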
> What I am surprised about is that the measured relative performance of
> the 7 alternative DNS servers is apparently not kept by bind long
> enough to be useful. The TTL at that level is 24 hours but I think I
> have often seen that when doing the same lookups within 24 hours I see
> lookup delays again. The statistics command you gave does not provide
> that info, I wonder if there is some bind command to query its
> measured timers and preferred servers.
You can always use "rndc dumpdb" to get the current database, or just
query and check the TTL that is left: dig @<ns> <hostname>
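For example (the dump file location is the Debian default; check the
"dump-file" option in your named.conf if it lives elsewhere):

  rndc dumpdb -cache
  grep -i 'ampr\.org' /var/cache/bind/named_dump.db
  dig @127.0.0.1 <hostname>.ampr.org A   # run it twice, watch the TTL count down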
> We have several timeservers on net-44 addresses and I do "ntpq -p -c
> rv -c mrulist" a couple of times a day now that we are testing and
> deploying. It was slow every time, of course the cached lookups are
> gone because the previous try was more than an hour ago, but
> apparently the DNS preference info was gone too and queries were again
> sent to slow (for me) servers.
When running such a query that you expect to be slow, run a tcpdump in
the background or in another shell; then you can see what is being
queried and what takes so long. Wireshark should visualize this easily
in conversation view.
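For example (interface and file names are just placeholders for your
setup):

  tcpdump -n -i eth0 -w dns-debug.pcap port 53 &
  ntpq -p -c rv -c mrulist
  kill %1   # stop the capture

Then open dns-debug.pcap in Wireshark (Statistics -> Conversations);
the lookups that take long stick out immediately from the timestamps.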
> After my experimental change with the hardwired forwarders everything
> works much better. I'm not sure I want to keep it, but it certainly
> indicates that there *is* a way in which bind could handle it more
> efficiently.
You just hard-coded them, thus ignoring any kind of TTL. That cannot be
done at Internet scale; then you could just as well go back to
/etc/hosts. Note that you are also avoiding the actual lookup of the NS
records, which saves lots of baby steps.
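(For reference, I assume you put something along these lines in your
named.conf options block; the addresses are placeholders:)

  options {
      forwarders { 192.0.2.53; 198.51.100.53; };
      forward only;
  };

With "forward only", if that is what you used, bind never walks the
delegation itself anymore, so every lookup stands or falls with those
hard-coded boxes.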
> [..] Probably I should turn off the DNSSEC that has been enabled by
> default by bind and Debian, that appears to cause a lot of extra
> overhead too.
As asked above, did you filter out TCP for DNS? Even with EDNS0 you
quite often still need it, and DNSSEC needs EDNS0 due to its big
responses. tcpdump/wireshark as mentioned above is your best way to
debug...
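To see the size issue in action (the resolver address is a placeholder):

  dig +dnssec @<your-resolver> org DNSKEY
  dig +dnssec +bufsize=512 @<your-resolver> org DNSKEY

The second one should come back truncated, which makes dig retry over
TCP; if that retry hangs, you have found your filter.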
Greets,
Jeroen