Net::HTTP and timeout for hostname resolution?

Hi,

I’m looking for a reliable way to fetch data from an HTTP server
with a guarantee that it won’t exceed a specified time.

I tried the most common solution (timeout 5 secs):

#____________________________________________________________________
require 'net/http'

uri = URI('http://some.query.url/path?args')
# All timeouts are in seconds; none of them covers DNS resolution.
Net::HTTP.start(uri.host, uri.port, open_timeout: 5, read_timeout: 5,
                ssl_timeout: 5) do |http|
  request = Net::HTTP::Get.new uri
  response = http.request request
end
#____________________________________________________________________

Unfortunately, the given timeouts do not cover hostname resolution. If
the DNS server is slow or stops responding, the above code won’t finish
sooner than after a minute or so.

I’ve tried to wrap the code in a Timeout.timeout(5) { … } block, but it
does not help, as the underlying IO won’t allow the timeout exception
to be triggered.

Spawning a thread that runs for an unspecified time, or using the IP
address of the remote machine directly, is not an option for this task.

It’s OK if the code doesn’t retrieve the data within the given timeout; I
just need to guarantee the attempt won’t last longer.

Any idea how to make it more bulletproof?

It seems there is no direct way to ensure the timeout won’t be exceeded.

CRuby’s stock socket library calls the external C functions gethostbyname
or getaddrinfo for hostname resolution. That call blocks all Ruby code,
including other threads, and prevents any context switching, so you are
out of luck here.

There are possible solutions, but each of them comes at a cost.

  1. You may use getaddrinfo_a, the asynchronous version of C’s getaddrinfo,
    but that’s an unportable, glibc-specific solution and you’ll probably
    need to write your own wrapper, as I currently can’t find any existing
    gem that uses it.

  2. Use a pure-Ruby DNS resolver like Net::DNS from the net-dns gem, which
    does not block other Ruby threads. The downside is that it requires an
    extra gem to be installed, as it’s too complex to be just copy&pasted
    into your code. It would also be an order of magnitude slower than C’s
    getaddrinfo_a, although more thorough benchmarking needs to be done; the
    difference seems evident when resolving about two hundred domain names.
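
To illustrate option 2, here is a minimal sketch of resolving a name under a deadline with a pure-Ruby resolver. Note that it swaps in Resolv::DNS from Ruby's standard library instead of the net-dns gem, on the assumption that it behaves similarly for this purpose (it is also written in Ruby, so Timeout should be able to interrupt it). The helper name resolve_with_deadline and its resolver: parameter are made up for this example.

```ruby
require 'resolv'
require 'timeout'

# Hypothetical helper (not part of any library): resolve +host+ to an IP
# address string within +seconds+, or return nil if that is not possible.
# +resolver+ may be any object responding to #getaddress; when omitted,
# a Resolv::DNS instance from the standard library is used (assumption:
# it stands in for the net-dns gem mentioned above).
def resolve_with_deadline(host, seconds, resolver: nil)
  owned = resolver.nil?
  resolver ||= Resolv::DNS.new.tap { |dns| dns.timeouts = seconds }
  # Pure-Ruby IO can be interrupted, so Timeout acts as an overall cap.
  Timeout.timeout(seconds) { resolver.getaddress(host).to_s }
rescue Timeout::Error, Resolv::ResolvError
  nil
ensure
  # Close only the resolver created here, not one passed in by the caller.
  resolver.close if owned && resolver
end
```

Something like ip = resolve_with_deadline(uri.host, 5) could then run before the request, and the fetch would be skipped when it returns nil. Connecting to that address while still sending the original host name is a separate step; recent Ruby versions have Net::HTTP#ipaddr= for this, but check that your Ruby supports it.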