Slow DNS lookups can have many causes. Most are easy to fix, for example a wrong IP address for the DNS server. Today I had a harder one, though it is just as easy to fix once you know how…
Looking up an address with dig works every time, but when puppet fetches its plugins or other files from the puppet master, it takes five seconds per file, and puppet fails with an “execution expired” error. So I started to debug it.
It’s most likely not an error within puppet, because I know that our CentOS 5 clients are working just fine.
Next, tcpdump. It took five seconds until I saw a connection on port 8140. As expected, not an error within puppet. Next, DNS. Now it’s getting interesting.
The client sends out two DNS requests, one for the A record and one for the AAAA record. Since glibc 2.9 the default is to send them in parallel. But I only get back one answer, and the resolver keeps waiting for the second. Looks like a problem on our DNS proxy, which sits on our firewall, or on the DNS server behind it. Not sure yet.
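To see the symptom without puppet in the way, a small timing sketch like the one below might help. It goes through the system resolver via `getaddrinfo()` (which dig does not), once asking for both address families and once for IPv4 only. The function names and the port default are my own choices for illustration, and a local cache such as nscd could hide the delay.

```python
import socket
import time


def time_lookup(host, family=socket.AF_UNSPEC, port=8140):
    """Time one getaddrinfo() call; return (seconds, error_or_None)."""
    start = time.monotonic()
    try:
        # AF_UNSPEC asks the system resolver for both A and AAAA records,
        # AF_INET restricts the lookup to A records.
        socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
        error = None
    except socket.gaierror as exc:
        error = exc
    return time.monotonic() - start, error


def compare(host):
    """Return lookup timings for 'both families' and 'IPv4 only'."""
    return {
        "A + AAAA": time_lookup(host, socket.AF_UNSPEC),
        "A only": time_lookup(host, socket.AF_INET),
    }
```

Calling `compare()` with the puppet master's host name on an affected client should show the "A + AAAA" lookup taking several seconds longer than the "A only" one, if the missing second reply really is the cause.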
The man page of resolv.conf has two options which could fix the problem:
single-request (since glibc 2.10)
sets RES_SNGLKUP in _res.options. By default, glibc
performs IPv4 and IPv6 lookups in parallel since
version 2.9. <strong>Some appliance DNS servers cannot handle
these queries properly and make the requests time out.</strong>
This option disables the behavior and makes glibc
perform the IPv6 and IPv4 requests sequentially (at the
cost of some slowdown of the resolving process).
single-request-reopen (since glibc 2.9)
The resolver uses the same socket for the A and AAAA
requests. <strong>Some hardware mistakenly sends back only one
reply. When that happens the client system will sit
and wait for the second reply.</strong> Turning this option on
changes this behavior so that if two requests from the
same port are not handled correctly it will close the
socket and open a new one before sending the second
request.
The second option sounds exactly like my problem. I added “options single-request-reopen” to resolv.conf, and bang… it works!
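If you want to roll the change out to more than one machine, here is a minimal sketch of adding the option idempotently. It only works on the text of the file; `ensure_resolver_option` is a hypothetical helper name, and reading and writing the actual file is left out on purpose, since on many systems resolv.conf is regenerated by the DHCP client or a network manager and a manual edit may get overwritten.

```python
def ensure_resolver_option(conf_text, option="single-request-reopen"):
    """Return resolv.conf text that has `option` on an `options` line."""
    lines = conf_text.splitlines()
    # Indexes of all lines whose first word is the "options" keyword.
    option_rows = [i for i, line in enumerate(lines)
                   if line.split()[:1] == ["options"]]
    # Already present on any options line: leave the text untouched.
    if any(option in lines[i].split()[1:] for i in option_rows):
        return conf_text
    if option_rows:
        # Append to the first existing options line.
        first = option_rows[0]
        lines[first] = lines[first].rstrip() + " " + option
    else:
        # No options line yet: add one at the end.
        lines.append("options " + option)
    return "\n".join(lines) + "\n"
```

Given a file that only contains a `nameserver` line, the function appends a new `options single-request-reopen` line. Running it again on its own output changes nothing.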