Resolver does not re-resolve upstream servers after initial cache

aris · November 7, 2012, 9:40pm

Using nginx 1.2.3-stable on Ubuntu 12.04, I have the following config:

http {
  resolver 172.16.0.23 valid=300s;
  resolver_timeout 10s;

  upstream myupstream {
    server example.com;
  }

  server {
    listen 80 default_server;

    location / {
      proxy_pass http://myupstream$request_uri;
      proxy_pass_request_headers on;
      proxy_set_header Host $host;
    }
  }
}

As I understand it, without the resolver config, nginx will resolve
example.com’s IP once on load and cache it until it stops or fully
reloads the config.

With the resolver config above, nginx should re-resolve the IP every
5mins.

However, this is not happening: I can watch tcpdump -n udp port 53 but I
see no re-resolution taking place.

I’d love to know how to fix this. Any advice appreciated thanks!

davenolan · November 8, 2012, 8:34am

On Wed, Nov 07, 2012 at 09:40:49PM +0100, Dave Nolan wrote:

As I understand it, without the resolver config, nginx will resolve
example.com’s IP once on load and cache it until it stops or fully
reloads the config.

With the resolver config above, nginx should re-resolve the IP every
5mins.

This is not how it works.

Run-time resolving only takes place if the URL specified in “proxy_pass”
contains variables AND the resulting server name doesn’t match any of
the configured server groups (defined with the “upstream” directive). This
is documented here: Module ngx_http_proxy_module

In your case, the server name is always “myupstream” and since it
matches “upstream myupstream”, no run-time resolving takes place.

However, this is not happening: I can watch tcpdump -n udp port 53 but I
see no re-resolution taking place.

I’d love to know how to fix this. Any advice appreciated thanks!

proxy_pass http://example.com$request_uri;

will resolve “example.com” dynamically (assuming of course there’s
no “upstream example.com” in configuration).
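
As a minimal sketch of the suggestion above, here is the original config with the host name placed directly in proxy_pass. The resolver address and example.com are taken from the original post; the sketch assumes that no “upstream example.com” block exists anywhere in the configuration.

```nginx
http {
  # Resolver used for run-time lookups; valid=300s overrides the
  # TTL returned by the DNS server.
  resolver 172.16.0.23 valid=300s;
  resolver_timeout 10s;

  server {
    listen 80 default_server;

    location / {
      # The URL contains a variable ($request_uri) and "example.com"
      # matches no upstream group, so the name is resolved at run time.
      proxy_pass http://example.com$request_uri;
      proxy_pass_request_headers on;
      proxy_set_header Host $host;
    }
  }
}
```

The cost of this form is that the upstream block is gone, so server-group features no longer apply to this location.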

davenolan · November 8, 2012, 11:38am

Ruslan E. wrote in post #1083512:

On Wed, Nov 07, 2012 at 09:40:49PM +0100, Dave Nolan wrote:
As I understand it, without the resolver config, nginx will resolve
example.com’s IP once on load and cache it until it stops or fully
reloads the config.

With the resolver config above, nginx should re-resolve the IP every
5mins.

This is not how it works.

Run-time resolving only takes place if the URL specified in “proxy_pass”
contains variables AND the resulting server name doesn’t match any of
the configured server groups (defined with the “upstream” directive). This
is documented here: Module ngx_http_proxy_module

In your case, the server name is always “myupstream” and since it
matches “upstream myupstream”, no run-time resolving takes place.

What’s the reason behind this? It feels like, even if proxy_pass
defers to the server group, the resolver config should be respected for
servers defined within the group.

However, this is not happening: I can watch tcpdump -n udp port 53 but I
see no re-resolution taking place.

I’d love to know how to fix this. Any advice appreciated thanks!

proxy_pass http://example.com$request_uri;

will resolve “example.com” dynamically (assuming of course there’s
no “upstream example.com” in configuration).

Thanks very much for your help.

If I switch to using example.com directly in the proxy_pass, I lose the
flexibility of server groups. Is there any way of dynamically
re-resolving servers in an upstream server group?

davenolan · November 8, 2012, 12:07pm

Dave Nolan Wrote:

Thanks very much for your help.

If I switch to using example.com directly in the proxy_pass, I lose the
flexibility of server groups. Is there any way of dynamically
re-resolving servers in upstream server group?

Hi,

I can add that I lost my production servers last night because of this
behavior.

I use dynamic DNS names for flexibility for almost all my servers. I put
one backend server into maintenance, so its name was removed from DNS
(after a TTL). Corosync manages my nginx servers and can restart them.

You can easily guess what happened: corosync detected a problem, failed
over to another server, and restarted nginx, but nginx couldn’t resolve a
backend host in the upstream, so it failed to start (with
“[emerg] host not found in upstream”).

All my nginx servers have been down because of this.

Just like you, I can’t remove my server groups, but I want the
flexibility of DNS resolving (not failing at start, and honoring the TTL).

–
Guilhem Lettron
Youscribe - www.youscribe.com


davenolan · November 8, 2012, 1:06pm

On 8 Nov 2012, at 15:07, guilhem wrote:

All my nginx servers have been down because of this.

Just like you, I can’t remove my server groups but I want the flexibility of
DNS resolving (Not failing at start and TTL).

If you want the flexibility of DNS resolving and want to safeguard
yourself against DNS failure, you should either add the hostnames to
/etc/hosts or run a local named/NSD/etc. with the appropriate slave zones.

davenolan · November 9, 2012, 10:14am

Sergey B. wrote in post #1083548:

On 8 Nov 2012, at 15:07, guilhem wrote:

All my nginx servers have been down because of this.

Just like you, I can’t remove my server groups but I want the flexibility of
DNS resolving (Not failing at start and TTL).

If you want the flexibility of DNS resolving and want to safeguard
yourself against DNS failure, you should either add the hostnames to
/etc/hosts or run a local named/NSD/etc. with the appropriate slave zones.

Sure, that kind of flexibility needs more tools than just nginx.

But actually it’s a question about consistency, right?

Even if proxy_pass defers to the server group, resolver config should be
respected for servers defined within the group, just like for everything
else. I’m just interested in why it’s not, and whether there are plans
to change it. We might be interested in sponsoring this work.

davenolan · November 9, 2012, 10:24am

Dave,

On Nov 9, 2012, at 1:14 PM, Dave Nolan wrote:

Even if proxy_pass defers to the server group, resolver config should be
respected for servers defined within the group, just like for everything
else. I’m just interested in why it’s not, and whether there are plans
to change it. We might be interested in sponsoring this work.

That’s the current (and maybe already “legacy”) design of nginx upstream
configuration. Unfortunately there’s no quick solution but we actually
appreciate your feedback a lot and will try to incorporate a better
upstream design in the future releases.

davenolan · November 9, 2012, 10:24am

On Nov 9, 2012, at 1:14 PM, Dave Nolan wrote:

We might be interested in sponsoring this work.

Can we move this conversation to nginx-inquiries at nginx dot com, please?

davenolan · January 8, 2015, 5:13am

any update?

davenolan · June 4, 2013, 5:26pm

Andrew A. wrote in post #1083715:

Dave,

On Nov 9, 2012, at 1:14 PM, Dave Nolan wrote:

Even if proxy_pass defers to the server group, resolver config should be
respected for servers defined within the group, just like for everything
else. I’m just interested in why it’s not, and whether there are plans
to change it. We might be interested in sponsoring this work.

That’s the current (and maybe already “legacy”) design of nginx upstream
configuration. Unfortunately there’s no quick solution but we actually
appreciate your feedback a lot and will try to incorporate a better
upstream design in the future releases.

I was wondering if there were any updates on this in more recent nginx
versions (1.5.1 at the moment)?

This bug causes a lot of problems in cloud environments like AWS where
IPs can change at any moment.

davenolan · January 12, 2015, 1:28pm

Hello!

On Thu, Jan 08, 2015 at 05:13:42AM +0100, Miroslav S. wrote:

any update?

This is now available as a commercial feature in nginx+, see the
“resolve” parameter here:

http://nginx.org/en/docs/http/ngx_http_upstream_module.html#server
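
A minimal sketch of what using the “resolve” parameter might look like, based on the linked documentation. The resolver address and host name are carried over from the original post; the zone name and size are illustrative assumptions, and whether the parameter is available depends on your nginx version and edition.

```nginx
http {
  # "resolve" needs a resolver defined in the http block
  # (or in the upstream block itself).
  resolver 172.16.0.23 valid=300s;
  resolver_timeout 10s;

  upstream myupstream {
    # "resolve" requires the group to reside in shared memory.
    # The zone name and size below are assumptions for illustration.
    zone myupstream_zone 64k;

    # Monitor DNS changes for example.com and update the group
    # without restarting nginx.
    server example.com resolve;
  }

  server {
    listen 80 default_server;

    location / {
      proxy_pass http://myupstream$request_uri;
      proxy_set_header Host $host;
    }
  }
}
```

This keeps the server group from the original config while letting the addresses behind example.com change over time.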

–
Maxim D.
http://nginx.org/