KB Article #182909

Five-second delay observed during policy execution

Problem

A delay of exactly five seconds is seen while a policy is executing.

Resolution

This normally indicates a DNS problem, because the default DNS resolver timeout on Linux is five seconds. The same delay can come from reverse-DNS lookups, especially when a policy uses attributes like http.request.hostname or when the Transaction Access Log format contains %h (remote hostname). These values are normally generated lazily, so the lookup only happens when they are first used.
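One way to confirm the diagnosis is to time a forward and a reverse lookup from the affected host. The following is a minimal sketch, not a supported tool: the helper names timed_lookup and looks_like_dns_timeout are made up for illustration, and the 0.5 second tolerance is an arbitrary assumption.

```python
import socket
import time


def timed_lookup(lookup, arg):
    """Run lookup(arg) and return (elapsed_seconds, result_or_exception)."""
    start = time.monotonic()
    try:
        result = lookup(arg)
    except OSError as exc:
        # socket.gaierror and socket.herror are both subclasses of OSError.
        result = exc
    return time.monotonic() - start, result


def looks_like_dns_timeout(elapsed, timeout=5.0, tolerance=0.5):
    """Return True if elapsed is close to a whole multiple of the timeout.

    timeout defaults to the five-second resolver default discussed above;
    tolerance is an assumed margin for scheduling and network jitter.
    """
    if elapsed < timeout - tolerance:
        return False
    remainder = elapsed % timeout
    return min(remainder, timeout - remainder) <= tolerance


def report(hostname, address):
    """Time one forward and one reverse lookup and print the findings."""
    checks = (
        ("forward", socket.gethostbyname, hostname),
        ("reverse", socket.gethostbyaddr, address),
    )
    for label, lookup, arg in checks:
        elapsed, result = timed_lookup(lookup, arg)
        flag = "  <-- looks like a resolver timeout" if looks_like_dns_timeout(elapsed) else ""
        print(f"{label:8s} {arg:30s} {elapsed:6.2f}s {result!r}{flag}")
```

Calling report() with a hostname and a client IP taken from the affected traffic (for instance report("backend.example.com", "192.0.2.10"), both placeholder values) prints one line per lookup. A lookup that takes close to 5, 10, or 15 seconds points at resolver timeouts rather than at slow policy logic.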


While this problem can be bypassed by using IPs instead of hostnames, using http.request.ip as a whiteboard variable or %a (remote IP) in Transaction Access Log formats, the better fix is to ensure that the system's DNS is working correctly and has redundancy. Also note that you must ensure that both A and AAAA lookups get responses, because getaddrinfo(), when called with no address family specified, will wait for answers relating to all supported families before returning or timing out. This can lead to non-obvious problems with firewall rules when queries are only half working.
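To check whether only one of the two record types is slow, the lookup can be repeated once per address family. This is a rough sketch under the assumption that the resolver sends only the matching query type when a single family is requested (typical for glibc, but not guaranteed everywhere); time_per_family is a hypothetical helper name.

```python
import socket
import time

# Address families to test separately, keyed by the record type they map to.
FAMILIES = {"A (IPv4)": socket.AF_INET, "AAAA (IPv6)": socket.AF_INET6}


def time_per_family(hostname, resolve=socket.getaddrinfo):
    """Resolve hostname once per address family and time each attempt.

    Returns {label: (elapsed_seconds, status)}. The resolve parameter exists
    only so the function can be exercised without network access.
    """
    timings = {}
    for label, family in FAMILIES.items():
        start = time.monotonic()
        try:
            resolve(hostname, None, family=family)
            status = "ok"
        except socket.gaierror as exc:
            # A quick failure (no such record) is fine; a slow one is the problem.
            status = f"error: {exc}"
        timings[label] = (time.monotonic() - start, status)
    return timings
```

If one family returns in milliseconds while the other takes about five seconds, queries of that type are probably being dropped somewhere between the host and its DNS servers. A fast "error" result is harmless here: the server answered, it just has no record of that type.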


Beware of trying to fix this by merely lowering the system's DNS timeouts or using options rotate when inaccessible DNS servers might still be in the config. A one-second delay in your transactions isn't much better than five seconds in most cases, and options rotate simply does round robin over the list. It does not stop sending requests to an unresponsive DNS server: DNS traffic still goes to that server whenever its turn comes up. Better solutions require redundancy at the DNS level, such as multiple servers that respond via anycast routing, local DNS resolvers, or similar measures that ensure that queries do not time out in the first place.
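Before changing timeouts, it can help to verify that every configured nameserver actually answers. The sketch below parses resolv.conf-style text and sends one hand-built A query to each server over UDP. The function names, the two-second probe timeout, and the example.com query name are all illustrative assumptions, and the probe does not validate the response beyond receiving a packet.

```python
import socket
import struct
import time


def parse_nameservers(resolv_conf_text):
    """Return the nameserver addresses listed in resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped[0] in "#;":  # blank line or comment
            continue
        fields = stripped.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def build_a_query(name, query_id=0x1234):
    """Build a minimal DNS query packet asking for the A record of name."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def probe(server, name="example.com", timeout=2.0):
    """Send one UDP query to server; return elapsed seconds or None on failure."""
    family = socket.AF_INET6 if ":" in server else socket.AF_INET
    with socket.socket(family, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.sendto(build_a_query(name), (server, 53))
            sock.recvfrom(512)
        except OSError:  # includes timeouts and unreachable-network errors
            return None
        return time.monotonic() - start


def check_all(path="/etc/resolv.conf"):
    """Probe every nameserver in the given file and print the outcome."""
    with open(path) as handle:
        for server in parse_nameservers(handle.read()):
            elapsed = probe(server)
            outcome = "no answer" if elapsed is None else f"{elapsed * 1000:.0f} ms"
            print(f"{server}: {outcome}")
```

Running check_all() on the affected host lists each nameserver with its response time. Any server reported as "no answer" is a candidate for removal or repair, since neither shorter timeouts nor options rotate will keep queries away from it.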