On 2014-08-06 10:07, Rob Janssen wrote:
More tracing reveals why the times are sometimes as high as I reported...
It first got my attention because commands like "netstat -t" and "ntpq -p" took quite some time to execute when 44.x.x.x addresses were in the listing, stopping for each reverse lookup. I had also noticed that "ping callsign.ampr.org" always takes some time before it sends the first ping. So I decided to investigate.
I used this command to time it: time host callsign.ampr.org
It appears that when the callsign exists, this does two lookups, for the A and MX record.
There is a good reason why people who know DNS always say to use the 'dig' command for any kind of diagnostics....
When the callsign does not exist (I also tested that), it also does two lookups: the specified one, and the one with .ampr.org appended a second time. This is caused by "search ampr.org" in /etc/resolv.conf. It was put there by the Linux installer; I know many Linux distributors do that (you specify a hostname and domain name during install, and it puts the domain name in /etc/resolv.conf as a search line). That explains the long execution time of "host" that could not be explained by a single lookup.
That explains the DOUBLE execution time.
It does not explain why you have LONG execution (well, lookup) times.
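To make the doubling concrete, here is a minimal sketch of how a stub resolver builds its list of query names from the "search" line. It is a simplified model of the behaviour documented in resolv.conf(5), not glibc's actual code; the function name candidate_names is made up for illustration, and ndots=1 is the documented default.

```python
def candidate_names(name, search, ndots=1):
    """Return the query names a stub resolver would try, in order (simplified)."""
    # A trailing dot marks the name as fully qualified: the search list is skipped.
    if name.endswith("."):
        return [name.rstrip(".")]
    expanded = [f"{name}.{domain}" for domain in search]
    # With at least `ndots` dots the name is tried as-is first, otherwise last.
    if name.count(".") >= ndots:
        return [name] + expanded
    return expanded + [name]


# With "search ampr.org" in resolv.conf:
print(candidate_names("callsign.ampr.org", ["ampr.org"]))
# With a trailing dot, only the name itself is queried:
print(candidate_names("callsign.ampr.org.", ["ampr.org"]))
```

For a name that exists, the resolver stops at the first answer, so only the first entry is queried. For a nonexistent name every entry is tried, which is where the second lookup comes from. Typing the name with a trailing dot avoids it without touching resolv.conf.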
I think it is not very clever of the resolver library to append a search domain that already was in the query, but maybe there are some cases where this is useful?
File a (redundant) bug with glibc; maybe they will fix this someday ;)
Anyway, I removed that from resolv.conf as this machine is not for end-users anyway, and I don't mind typing full domain names. It doubles the speed of all lookups of nonexisting names, and reduces the load on the ampr.org nameservers as well.
It is maybe a better idea to use a more local caching nameserver.
This doubles the speed of the host command, but of course the reverse lookups are not affected. When using those two local DNS servers first, they are fast. Bind should learn and remember that, I think.
If you are using BIND as a recursive server (check where /etc/resolv.conf points to) then it will indeed cache the latency of an upstream nameserver for a while and use the best one.
BIND (unless configured in forwarding mode) always uses the roots to get to the final domain.
It is good that the ampr.org DNS servers are spread around the globe, and a pity that bind does not take advantage of that.
BIND only has a history of previous requests, it does not know about variances in network conditions or if hosts go down or not.
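As a rough illustration of "picking the best server from a history of previous requests", the sketch below keeps a smoothed round-trip time per nameserver and prefers the lowest one. This is not BIND's actual algorithm; the class name ServerStats, the smoothing factor 0.3 and the server names are assumptions made up for the example.

```python
class ServerStats:
    """Track a smoothed RTT per nameserver and pick the fastest one seen so far."""

    def __init__(self, alpha=0.3):
        self.alpha = alpha  # weight of the newest sample (assumed value)
        self.srtt = {}      # server name -> smoothed RTT in milliseconds

    def record(self, server, rtt_ms):
        old = self.srtt.get(server)
        if old is None:
            # First sample for this server: use it directly.
            self.srtt[server] = float(rtt_ms)
        else:
            # Exponentially weighted moving average of past and new samples.
            self.srtt[server] = (1 - self.alpha) * old + self.alpha * rtt_ms

    def best(self):
        # Only servers that were queried before can be compared.
        return min(self.srtt, key=self.srtt.get)


stats = ServerStats()
stats.record("ns-far.example", 200)   # hypothetical distant server
stats.record("ns-near.example", 20)   # hypothetical nearby server
print(stats.best())
```

The history only changes when a query is actually sent, so a server that became slow or unreachable keeps its old score until the next request to it times out.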
It does do that for forwarders, at least that is written in the docs. (this machine runs bind as a caching resolver and the nameserver in resolv.conf is 127.0.0.1)
How many users are using your cache? Unless other folks query the same names, the caching has little effect.
Also with short TTLs little caching can be done as the records will expire quickly.
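The effect of short TTLs can be sketched with a tiny TTL cache. This is only a toy model of what a caching resolver does, not BIND code; the class TTLCache, the injectable clock and the address 44.0.0.1 are placeholders for illustration.

```python
import time


class TTLCache:
    """Minimal cache whose entries expire after their TTL (in seconds)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (value, expiry timestamp)

    def put(self, name, value, ttl):
        self._entries[name] = (value, self._clock() + ttl)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None  # never cached: a real resolver would query upstream
        value, expires = entry
        if self._clock() >= expires:
            del self._entries[name]  # expired: must be fetched again
            return None
        return value


# Fake clock so the example does not need to sleep.
now = [0.0]
cache = TTLCache(clock=lambda: now[0])
cache.put("callsign.ampr.org", "44.0.0.1", ttl=60)  # placeholder address

now[0] = 30.0
print(cache.get("callsign.ampr.org"))  # still cached
now[0] = 61.0
print(cache.get("callsign.ampr.org"))  # expired, None
```

A second query for the same name only benefits from the cache if it arrives before the TTL runs out, which is why few users or short TTLs leave little to gain.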
You might want to enable stats and check them, e.g. as per: http://supportex.net/2011/11/bind-statistics-information/
Also, do make sure you are running current versions of the software; BIND especially has had quite a few security issues. Above all, of course, make sure it is not an open recursor for the world...
Greets, Jeroen