SSL_STAPLING when network is unreachable

Hello,

I’ve encountered a problem with nginx 1.5.10.
I’m running nginx on a highly available system (a two-node cluster).

When node1 fails, node2 automatically takes over. A few days ago the
internet connection was bad on both nodes: they could ping the gateway
only sporadically. Node2 became the active node and tried to start nginx,
but nginx did not come up at all.

I replayed the whole scenario (switchover) with a working internet
connection, and everything ran perfectly. With a broken internet
connection, however, nginx does not start up. It hangs.

I found out that the cause is ssl_stapling. Even when I set
resolver_timeout to 5 seconds, nginx won’t come up within 5 seconds on an
internet connection with high packet loss.

Unfortunately I cannot use “ssl_stapling_file”. I tried fetching the OCSP
response from GlobalSign, but I always get “error querying OCSP response”
from GlobalSign’s OCSP server (with GoDaddy it worked).
My command was:

    openssl ocsp -host ocsp2.globalsign.com -noverify -no_nonce -issuer issuer.crt -cert domain.crt -url http://ocsp2.globalsign.com/gsalphag2

So… it would be nice if nginx did not block on startup, or if there were a
setting that told nginx “you must start up within x seconds”.

For now I will remove ssl_stapling support altogether.

best regards,
Can Özdemir

Posted at Nginx Forum:

Hello!

On Wed, Feb 26, 2014 at 11:39:31AM -0500, mastercan wrote:

> I replayed the whole scenario (switchover) with a working internet
> connection, and everything ran perfectly. With a broken internet
> connection, however, nginx does not start up. It hangs.
>
> I found out that the cause is ssl_stapling. Even when I set
> resolver_timeout to 5 seconds, nginx won’t come up within 5 seconds on an
> internet connection with high packet loss.

On startup, nginx resolves the various names in its configuration
files using the system resolver. This includes the initial
resolution of OCSP responders if stapling is used. If your system
resolver has no internet access and blocks while trying to resolve
names, nginx will block as well.

The traditional approach to this problem is to use a local caching
DNS server (which is less likely to fail than external services), and
to use IP addresses or /etc/hosts for critical things.

It’s also a good idea to keep nginx running instead of trying to
start it under emergency conditions. While nginx usually starts
just fine, it is designed to keep things running by all means, not
to start by all means. Startup may fail, e.g., due to failed DNS
resolution or a listen socket grabbed by some other process. In
contrast, if nginx has already been started, it will keep running by
all means.


Maxim D.
http://nginx.org/

Hello Maxim,

> On startup, nginx resolves the various names in its configuration
> files using the system resolver. This includes the initial
> resolution of OCSP responders if stapling is used. If your system
> resolver has no internet access and blocks while trying to resolve
> names, nginx will block as well.

I see. But what is the “resolver_timeout” parameter for? I had 2
ssl_stapling directives in my config, and I set a resolver_timeout of 5
seconds. I thought the blocking should not exceed 10 seconds then,
assuming the resolving is done sequentially? It took more than 40 seconds
to start though.

> The traditional approach to this problem is to use a local caching
> DNS server (which is less likely to fail than external services), and
> to use IP addresses or /etc/hosts for critical things.

That sounds good, but I’ve seen that the OCSP server’s A records have a
TTL of 5 minutes. So they seem to change frequently, and caching them
would not help much in this case.

> It’s also a good idea to keep nginx running instead of trying to
> start it under emergency conditions. While nginx usually starts
> just fine, it is designed to keep things running by all means, not
> to start by all means. Startup may fail, e.g., due to failed DNS
> resolution or a listen socket grabbed by some other process. In
> contrast, if nginx has already been started, it will keep running by
> all means.

OK, that’s something I should consider: keeping nginx running on both
nodes. I hope it doesn’t cause trouble if a web directory is empty and
gets filled later on by mounting a DRBD device.

br,
Can

Maxim D. Wrote:

> It’s there to configure the timeout used by nginx’s own non-blocking
> resolver (the resolver directive, module ngx_http_core_module), that
> is, for name resolution done by a running nginx. To configure the
> system resolver you should use your system’s settings, usually
> /etc/resolv.conf.
>
> (Actually, the sole purpose of nginx’s own resolver is to be able to
> resolve names while nginx is running, without blocking. That isn’t
> possible with the system resolver, as it has only a blocking
> interface.)

Thanks a lot, Maxim! That clarifies things for me.

br
Can

Hello!

On Wed, Feb 26, 2014 at 02:32:48PM -0500, mastercan wrote:

> the blocking should not exceed 10 seconds then, assuming the resolving is
> done sequentially? It took more than 40 seconds to start though.

It’s there to configure the timeout used by nginx’s own non-blocking
resolver (the resolver directive, module ngx_http_core_module), that
is, for name resolution done by a running nginx. To configure the
system resolver you should use your system’s settings, usually
/etc/resolv.conf.

(Actually, the sole purpose of nginx’s own resolver is to be able to
resolve names while nginx is running, without blocking. That isn’t
possible with the system resolver, as it has only a blocking
interface.)
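
To make the split between the two resolvers concrete, here is a minimal
configuration sketch. It is an illustration under assumptions, not a
recommended setup: the server name, certificate paths and the resolver
address 127.0.0.1 (a local caching DNS server, as suggested earlier in
the thread) are placeholders that do not come from the original posts.

```nginx
# Hypothetical server block; names, paths and the resolver address are
# placeholders for illustration only.
server {
    listen      443 ssl;
    server_name example.com;

    ssl_certificate     /etc/nginx/ssl/domain.crt;
    ssl_certificate_key /etc/nginx/ssl/domain.key;

    ssl_stapling on;

    # These two directives configure nginx's own non-blocking resolver.
    # They apply only to lookups made by a running nginx, such as
    # resolving the OCSP responder name when a stapled response is
    # refreshed. They do not bound the lookups done at startup.
    resolver         127.0.0.1;   # assumed local caching DNS server
    resolver_timeout 5s;
}
```

The lookups done at startup go through the system resolver, so their
worst-case duration is governed by the system configuration instead. On
glibc-based systems this means /etc/resolv.conf, where a line such as
"options timeout:1 attempts:2" shortens the per-query timeout (the
defaults are 5 seconds and 2 attempts per nameserver).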


Maxim D.
http://nginx.org/