Upgrading Executable on the Fly - wrong docs?

piotr.dobrogost · February 9, 2013, 11:13pm

Hi!

After reading the section titled “Upgrading Executable on the Fly” in the
docs (at Controlling nginx), I have the impression that the information
given is wrong.
In the first bullet one reads
“Send the HUP signal to the old master process. The old process will start
new worker processes without re-reading the configuration. (…)”
then in the second and third bullets one reads
“When the new master process exits, the old master process will start new
worker processes.”

If the old master process already started new worker processes after it had
received the HUP signal then it means it didn’t have to wait until the new
master process exited, right? Doesn’t this contradict the subsequent
information that the old master process waits with starting new worker
processes until after the new master process exited?

Regards
Piotr Dobrogost

Posted at Nginx Forum:

piotr.dobrogost · February 11, 2013, 8:07am

On Sat, Feb 09, 2013 at 05:12:28PM -0500, piotr.dobrogost wrote:

worker processes."
The instructions in the bullets are not supposed to be executed in
a sequence. Instead, they document two possible actions to perform:

- Start old workers with old configuration, then gracefully stop new
  master/workers (bullet #1).
- Stop new master/workers immediately (*). Old master will restart
  workers automatically when new master exits (bullet #2).

(*) Send KILL to new workers if they don’t exit normally (bullet #3).
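
The two options can be sketched as ordered signal sequences. This is only an illustration in Python, not anything nginx ships: `rollback_steps`, `run_steps`, and the pid arguments are hypothetical names, and the signals are the ones the quoted documentation bullets mention (HUP to the old master followed by QUIT to the new master, or TERM to the new master alone).

```python
import os
import signal
from typing import Callable, List, Tuple

# (signal, target pid) pairs; the pids are supplied by the caller.
Step = Tuple[signal.Signals, int]


def rollback_steps(old_master: int, new_master: int, graceful: bool) -> List[Step]:
    """Return the signals to send for one of the two rollback options."""
    if graceful:
        # Option 1: the old master starts workers with the old configuration,
        # then the new master and its workers are asked to quit gracefully.
        return [(signal.SIGHUP, old_master), (signal.SIGQUIT, new_master)]
    # Option 2: stop the new master right away; the old master starts
    # workers on its own once the new master has exited.
    return [(signal.SIGTERM, new_master)]


def run_steps(steps: List[Step], send: Callable[[int, int], None] = os.kill) -> None:
    """Deliver each signal in order; `send` can be replaced for a dry run."""
    for sig, pid in steps:
        send(pid, sig)
```

Passing a recording function as `send` allows a dry run without touching real processes. The last-resort KILL for new workers that refuse to exit is left out here, because it needs the worker pids rather than the master pids.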

If the old master process already started new worker processes after it had
received the HUP signal then it means it didn’t have to wait until the new
master process exited, right? Doesn’t this contradict the subsequent
information that the old master process waits with starting new worker
processes until after the new master process exited?

I can see where your confusion comes from. How’s this instead?

http://pp.nginx.com/ru/libxslt/en/docs/control.html#upgrade

piotr.dobrogost · February 12, 2013, 9:02pm

Ruslan, thanks for quick reply.

I have some trouble comparing the new wording with the previous one as it
looks like your change went live at Controlling nginx so
I do not have the old one to compare any more.

Nevertheless I have some more comments on the new (current) one.

I think an error sneaked into the new version. The first bullet is now
“Send the HUP signal to the old master process. The old master process will
start new worker processes without re-reading the configuration. After that,
all new processes can be shut down gracefully, by sending the QUIT signal to
the old master process.”
I think it should have been “(…) by sending the QUIT signal to the new
master process.” instead.

What I don’t understand is why the old master process does not re-read the
configuration after receiving the HUP signal as at the top of the page it’s
written
HUP (…), starting new worker processes with a new configuration, (…)
If the reason is because it had received the USR2 signal at the beginning of
the whole procedure and this changed its state (it “remembers” receiving the
USR2 signal) it should be explained.

Also, maybe I’m missing something but I think that the two bullets are not
symmetrical without a reason. In the first bullet the QUIT signal is used
whereas in the second bullet the TERM signal is used. I believe either of
them could be used with the obvious difference of fast vs graceful shutdown.
If it’s true (either could be used) then using different signals between the
first and the second bullet is misleading.

Additionally I have a question regarding the following fragment:
"In order to upgrade the server executable, the new executable file should
be put in place of an old file first. After that USR2 signal should be sent
to the master process. The master process first renames its file (…)"
How can the master process rename its file if this file is already gone i.e.
it had been replaced by the new executable?

Regards,
Piotr


piotr.dobrogost · February 13, 2013, 11:33am

On 2/13/13 12:01 AM, piotr.dobrogost wrote:

Ruslan, thanks for quick reply.

I have some trouble comparing the new wording with the previous one as it
looks like your change went live at Controlling nginx so
I do not have the old one to compare any more

[…]

You do:
http://trac.nginx.org/nginx/log/nginx_org/xml/en/docs/control.xml

–
Maxim K.
+7 (910) 4293178

piotr.dobrogost · February 13, 2013, 8:39pm

Thanks for the link.

If the statement “If new processes (…)” is supposed to mean “If the new
master process and the new worker processes started by it” then I would use
the latter form as it doesn’t leave room for ambiguity.

Still all my questions from my previous post in this thread are valid.


piotr.dobrogost · February 19, 2013, 12:55pm

On Tue, Feb 12, 2013 at 03:01:39PM -0500, piotr.dobrogost wrote:

Ruslan, thanks for quick reply.

I have some trouble comparing the new wording with the previous one as it
looks like your change went live at Controlling nginx so
I do not have the old one to compare any more

Already answered.

Nevertheless I have some more comments on the new (current) one.

I think an error sneaked into the new version. The first bullet is now
“Send the HUP signal to the old master process. The old master process will
start new worker processes without re-reading the configuration. After that,
all new processes can be shut down gracefully, by sending the QUIT signal to
the old master process.”
I think it should have been “(…) by sending the QUIT signal to the new
master process.” instead.

Thanks for spotting this, the fixed version is already on site.

What I don’t understand is why the old master process does not re-read the
configuration after receiving the HUP signal as at the top of the page it’s
written
HUP (…), starting new worker processes with a new configuration, (…)
If the reason is because it had received the USR2 signal at the beginning of
the whole procedure and this changed its state (it “remembers” receiving the
USR2 signal) it should be explained.

HUP after USR2 is handled differently, exactly as documented.
When master process knows it’s “old” (i.e., upgrade procedure
is in progress), a request to start new worker processes is
interpreted as a rollback request – master starts new worker
and cache manager processes with an old configuration.
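
To make the state dependence concrete, here is a toy model in Python. It is not nginx's implementation: `MasterModel`, its fields, and the log strings are invented for illustration, and the only behavior it encodes is what is described above (HUP normally reloads the configuration, but after USR2 it restarts workers with the configuration already in memory).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MasterModel:
    """Toy model of a master process; not nginx's real code."""
    loaded_config: str = "v1"
    upgrade_in_progress: bool = False  # set once USR2 has been handled
    log: List[str] = field(default_factory=list)

    def on_usr2(self) -> None:
        # USR2 starts the new executable; this master now knows it is "old".
        self.upgrade_in_progress = True
        self.log.append("started new master")

    def on_hup(self, config_on_disk: str) -> None:
        if not self.upgrade_in_progress:
            # Ordinary reload: pick up the configuration from disk.
            self.loaded_config = config_on_disk
        # During an upgrade the request is treated as a rollback, so the
        # configuration already in memory is reused unchanged.
        self.log.append(f"started workers with {self.loaded_config} config")
```

In this model, calling `on_hup("v2")` before `on_usr2()` switches the master to `v2`, while a later `on_hup("v3")` after `on_usr2()` leaves it on `v2`.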

Also, maybe I’m missing something but I think that the two bullets are not
symmetrical without a reason. In the first bullet the QUIT signal is used
whereas in the second bullet the TERM signal is used. I believe either of
them could be used with the obvious difference of fast vs graceful shutdown.
If it’s true (either could be used) then using different signals between the
first and the second bullet is misleading.

These are two different procedures with different properties.

In the first case, you restart old workers with an old configuration,
but let requests that are currently in flight be fully processed
(if you can tolerate this). There’s no interruption in handling
requests.

In the second case, you want to stop new workers right away (e.g.,
something so odd happened that you can’t let even in-flight
requests finish), and it requires only a single action from
you to roll back (or none at all if e.g. a new binary process
segfaults). But there’s a small window where connection attempts
may be rejected.

Of course one may come up with other procedures, like starting old
workers and immediately stopping new processes, but how is this
practically different from the first case? Or one can gracefully
stop new workers (new requests will be rejected, but those in flight
will be serviced, potentially indefinitely), and only after that
old workers will be restarted and new requests will be handled
(sorry, but such a procedure doesn’t make any sense to me).

Additionally I have a question regarding the following fragment:
"In order to upgrade the server executable, the new executable file should
be put in place of an old file first. After that USR2 signal should be sent
to the master process. The master process first renames its file (…)"
How can the master process rename its file if this file is already gone i.e.
it had been replaced by the new executable?

Read further, it “renames its file with the process ID”, see
http://nginx.org/r/pid
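
In other words, the file being renamed is the pid file, not the executable. A small Python illustration of that step follows; the `.oldbin` suffix is the one the nginx documentation describes, while `mark_pid_file_old`, the scratch directory, and the pid value are made up for the example.

```python
import os
import tempfile


def mark_pid_file_old(pid_path: str) -> str:
    """Rename the pid file (not the executable) and return its new path."""
    old_path = pid_path + ".oldbin"
    os.rename(pid_path, old_path)
    return old_path


# Demonstration in a scratch directory, so no real nginx files are touched.
with tempfile.TemporaryDirectory() as tmp:
    pid_path = os.path.join(tmp, "nginx.pid")
    with open(pid_path, "w") as f:
        f.write("1234\n")  # made-up pid of the running master

    old_path = mark_pid_file_old(pid_path)
    # The original name is now free for the new master's pid file.
    renamed = os.path.exists(old_path) and not os.path.exists(pid_path)
    with open(old_path) as f:
        old_master_pid = int(f.read())
```

After the rename, the old master's pid can still be read from the `.oldbin` file when a signal has to be sent to it during a rollback.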