Possible mutex starvation issue affects all nginx Linux versions

Here is a patch for a possible mutex starvation issue which affects all
nginx Linux versions. It has already been solved for Windows since nginx
1.5.7.1 Caterpillar. It can be reproduced when nginx reloads the config
and the worker holding the mutex dies or hangs.

Fixed by Vittorio Francesco Digilio; commercially sponsored solution by
ITPP. Target: source mainline 1.5.8, 30-11-2013.
The starvation fix in src/event/ngx_event_accept.c, line 402, was
correctly found, but it was missed in src/event/ngx_event.c, lines
259-260, where 1 line is added:

255 if (ngx_posted_accept_events) {
256     ngx_event_process_posted(cycle, &ngx_posted_accept_events);
257 }
258
259 if (ngx_accept_mutex_held) {
  +     ngx_accept_mutex_held = 0;
260     ngx_shmtx_unlock(&ngx_accept_mutex);
261 }
262
263 if (delta) {
264     ngx_event_expire_timers();
265 }

Also applies to src/os/win32/ngx_process_cycle.c, lines 507-508, where
3 lines are added:

504 if (ngx_processes[n].handle != h) {
505     continue;
506 }
507
  + if (*ngx_accept_mutex.lock == ngx_processes[n].pid) {
  +     *ngx_accept_mutex.lock = 0;
  + }
508 if (GetExitCodeProcess(h, &code) == 0) {
509     ngx_log_error(NGX_LOG_ALERT, cycle->log, ngx_errno,
510                   "GetExitCodeProcess(%P) failed",
511                   ngx_processes[n].pid);
512 }

Posted at Nginx Forum:

Hello!

On Mon, Dec 02, 2013 at 07:15:52AM -0500, itpp2012 wrote:

but missed in:
263 if (delta) {
264     ngx_event_expire_timers();
265 }

This patch is wrong. The ngx_accept_mutex_held doesn't need to be
unset here. The ngx_accept_mutex_held is used as an indicator that on
the previous iteration the lock was held by the process, and unsetting
it will result in incorrect behaviour.

509     ngx_log_error(NGX_LOG_ALERT, cycle->log, ngx_errno,
510                   "GetExitCodeProcess(%P) failed",
511                   ngx_processes[n].pid);
512 }

This patch is also wrong. In the current state of the win32
version it is not assumed that the accept mutex is used at all. But
if it is, the correct approach to unlocking shared memory mutexes on
abnormal process termination is to port (or move to a
platform-independent place) the ngx_unlock_mutexes() function from
src/os/unix/ngx_process.c. It correctly uses
ngx_shmtx_force_unlock() to properly unlock shared memory mutexes
using atomic operations.
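
For reference, an abridged sketch of that function as it appears in the
1.5.x unix sources (trimmed for brevity, not a verbatim copy):

static void
ngx_unlock_mutexes(ngx_pid_t pid)
{
    ngx_uint_t        i;
    ngx_shm_zone_t   *shm_zone;
    ngx_list_part_t  *part;
    ngx_slab_pool_t  *sp;

    /* unlock the accept mutex if the dead process held it */

    if (ngx_accept_mutex_ptr) {
        (void) ngx_shmtx_force_unlock(&ngx_accept_mutex, pid);
    }

    /* unlock any shared memory zone mutex held by the dead process */

    part = (ngx_list_part_t *) &ngx_cycle->shared_memory.part;
    shm_zone = part->elts;

    for (i = 0; /* void */ ; i++) {

        if (i >= part->nelts) {
            if (part->next == NULL) {
                break;
            }
            part = part->next;
            shm_zone = part->elts;
            i = 0;
        }

        sp = (ngx_slab_pool_t *) shm_zone[i].shm.addr;

        if (ngx_shmtx_force_unlock(&sp->mutex, pid)) {
            ngx_log_error(NGX_LOG_ALERT, ngx_cycle->log, 0,
                          "shared memory zone \"%V\" was locked by %P",
                          &shm_zone[i].shm.name, pid);
        }
    }
}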

Note well that unlocking shared memory mutexes on abnormal
process termination is an emergency mechanism. If it actually
happens, it indicates that something is really wrong in other
places of the system.

Please also consider reading
Contributing Changes.


Maxim D.
http://nginx.org/en/donation.html

From the patch author:

Hi Maxim,
I apologize for my late reply: I only now had time to sort this out.

The short answer to your remarks is: the first patch is partially
correct, just incomplete, and could easily be completed with the
addition of a call to ngx_disable_accept_events(…), addressing the
issue of 2 workers accepting connections at the same time. The second
one is Windows-only and as correct and atomic as the unix-only
ngx_unlock_mutexes(), addressing the very same issue that one is for
(which, by the way, showed up during my tests) while being just one
line long.

The long answer follows and, well, is quite long.

So before everything else, and for the sake of clarity:

accept mutex patching has been performed on a codebase derived from
nginx 1.5.6, after properly re-enabling the accept_mutex by removing
the following disabling lines in ngx_event.c:

#if (NGX_WIN32)

    /*
     * disable accept mutex on win32 as it may cause deadlock if
     * grabbed by a process which can't accept connections
     */

    ngx_use_accept_mutex = 0;

#endif

Thus the submitted patch lines are not useless; they are part of a
larger patch included in Caterpillar 1.5.7.1. The latter (Caterpillar)
fully works, correctly distributing the workload to any number of
configured worker processes (not just one of the configured number of
them).

Now getting to specific proposed patches:

A) about the added line ngx_accept_mutex_held=0 in ngx_event.c

259 if (ngx_accept_mutex_held) {
  +     ngx_accept_mutex_held = 0;
260     ngx_shmtx_unlock(&ngx_accept_mutex);
261 }

that is not wrong, it's just incomplete.
In fact, first of all, it pairs with the similar one in

src/event/ngx_event_accept.c, line 402

which was patched in Caterpillar and which you too found later in your
1.5.7. The problem with this one (line 402) was that the process-local
ngx_accept_mutex_held variable and the accept_mutex global to all
nginx processes (being allocated in shared memory) got out of sync.
This resulted (as tests showed) in more than one worker process
locally 'thinking' it had ownership of the accept_mutex at the same
time. That interferes with the calls to their respective
ngx_disable_accept_events(…), in turn leading, because of nasty
timing, to no worker being able to enable accept events anymore,
amounting to all workers (so the whole server) being unable to handle
any further connection.

The line added between 259 and 260 tried to fix this too, but there a
call to ngx_disable_accept_events(…) is missing. To be completely
correct and not result in the partially incorrect behaviour you
correctly pointed out, such a call must be added too.
This can be done by moving the fully patched 'if' code

if (ngx_accept_mutex_held) {
    ngx_disable_accept_events(cycle);
    ngx_accept_mutex_held = 0;
    ngx_shmtx_unlock(&ngx_accept_mutex);
}

to ngx_event_accept.c (ngx_disable_accept_events(…) having internal
linkage) in a dedicated function, for example

void ngx_release_accept_mutex(ngx_cycle_t *cycle) { /* the 'if' above */ }

which then gets called at line 259 in ngx_event.c instead of the 'if'
itself. By the way, to make the patch complete, the 'if' in
ngx_trylock_accept_mutex where ngx_disable_accept_events() is called
should be removed, since it is substituted by that
ngx_release_accept_mutex() call.
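
Put together, the proposed helper would read roughly as follows (a
sketch only, assuming ngx_disable_accept_events() keeps its 1.5.x
signature):

/* sketch: release the accept mutex and disable accept events in one
 * place, so the process-local flag and the shared lock stay in sync */

void
ngx_release_accept_mutex(ngx_cycle_t *cycle)
{
    if (ngx_accept_mutex_held) {
        ngx_disable_accept_events(cycle);
        ngx_accept_mutex_held = 0;
        ngx_shmtx_unlock(&ngx_accept_mutex);
    }
}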

All of this is to avoid the partially incorrect behaviour that the
'if', as it is now,

259 if (ngx_accept_mutex_held) {
260     ngx_shmtx_unlock(&ngx_accept_mutex);
261 }

causes at the end of an iteration when the accept mutex was held: the
mutex is released in the 'if' above, but ngx_disable_accept_events(…)
won't be called until the next iteration's ngx_trylock_accept_mutex(…).
Thus in the meanwhile another worker could succeed in acquiring the
accept mutex, via ngx_trylock_accept_mutex(…), and start to accept
connections as well (ngx_enable_accept_events() is called in
ngx_trylock_accept_mutex when the mutex is acquired successfully).
This means that 2 workers can accept connections at the same time,
which goes against the very reason for an accept mutex and is
reasonably not the intended behaviour.
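
For context, this is roughly how ngx_trylock_accept_mutex() pairs the
lock with enabling/disabling accept events in the 1.5.x sources
(abridged, error paths and shortcuts trimmed):

ngx_int_t
ngx_trylock_accept_mutex(ngx_cycle_t *cycle)
{
    if (ngx_shmtx_trylock(&ngx_accept_mutex)) {

        /* got the lock: make sure accept events are enabled */

        if (ngx_enable_accept_events(cycle) == NGX_ERROR) {
            ngx_shmtx_unlock(&ngx_accept_mutex);
            return NGX_ERROR;
        }

        ngx_accept_mutex_held = 1;

        return NGX_OK;
    }

    /* lost the race: disable accept events if the lock was held
     * on the previous iteration */

    if (ngx_accept_mutex_held) {
        if (ngx_disable_accept_events(cycle) == NGX_ERROR) {
            return NGX_ERROR;
        }

        ngx_accept_mutex_held = 0;
    }

    return NGX_OK;
}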

B) About the second proposed patch:
the 3 lines

if (*ngx_accept_mutex.lock == ngx_processes[n].pid) {
    *ngx_accept_mutex.lock = 0;
}

make a patch to prevent the server's accept_mutex from deadlocking
when its owning worker process dies unexpectedly (a program crash
while processing a request/answer, or the task manager's abrupt
termination). This is a scenario that occurred several times while
testing and, when it showed up, left the server's workers unable to
accept any further connection.

Given that, unlike in the unix version with its ngx_unlock_mutexes(),
this scenario is not addressed in the windows version, the above lines
are meant to fix it. They are correct and thread-safe (the operation
is atomic) in the specific scenario in which they are meant to be
executed, because:

  1. they are invoked by the master only, and only when it detects
     that a worker process has terminated: a worker process handle
     gets signaled on its termination and a WaitForMultipleObjects
     (in os/win32/ngx_process_cycle.c) waiting on the handles wakes
     up;

  2. the if condition makes sure 2 things are true at the same time:
     a) the spinlock contains the pid of the dead worker, and
     b) that implies *ngx_accept_mutex.lock != 0 (since no pid can be
     zero);

  3. considering that the accept_mutex is only acquired (during code
     flow) via

ngx_uint_t
ngx_shmtx_trylock(ngx_shmtx_t *mtx)
{
    return (*mtx->lock == 0 && ngx_atomic_cmp_set(mtx->lock, 0, ngx_pid));
}

where

#define ngx_atomic_cmp_set(lock, old, set)                                   \
    (InterlockedCompareExchange((void **) lock, (void *) set, (void *) old)  \
     == (void *) old)

then, in the above scenario, *mtx->lock == pid (of the terminated
worker process) and pid != 0 imply that ngx_atomic_cmp_set(…) (so its
InterlockedCompareExchange(…)) can't ever be called for any process
other than the (already) dead one (i.e. code can't get to the atomic
InterlockedXXX operation at all, so no atomic operation is ever
executed…);

  4. furthermore, the accept_mutex is released (when appropriate) only
     via a call to ngx_shmtx_unlock() by the owning worker process,
     and in such a scenario that process is obviously dead before
     having had such a chance (or the 'if( )' fix's condition wouldn't
     trigger);

  5. no other worker can release a mutex it didn't previously acquire
     (and couldn't even if it mistakenly tried to; it is enough to
     look at the ngx_shmtx_unlock() implementation to see why).

This last point also shows that, still in that scenario,
ngx_shmtx_unlock() can't ever be called by the master to release the
accept mutex (the master not being its owner).
Hopefully at this point it should be clear enough that releasing the
accept_mutex directly with

*ngx_accept_mutex.lock = 0;

is as safe as calling ngx_shmtx_force_unlock(); the choice between the
two I deem merely cosmetic (they're functionally equivalent), so my
choice went for the more performant, straight variable assignment.
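
For comparison, ngx_shmtx_force_unlock() in src/core/ngx_shmtx.c boils
down to a single compare-and-swap (a sketch of the non-semaphore
build, debug logging omitted):

ngx_uint_t
ngx_shmtx_force_unlock(ngx_shmtx_t *mtx, ngx_pid_t pid)
{
    /* clear the lock only if it still holds the dead process's pid */

    if (ngx_atomic_cmp_set(mtx->lock, pid, 0)) {
        return 1;
    }

    return 0;
}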

Not wanting to make this post longer than it is, I conclude by just
mentioning that, as of now, the accept mutex is the only mutex living
in shared memory, so unlocking other possibly shared mutexes is
likewise not required.

I realize that for more easily readable code and for possible future
extensions (more mutexes in shared memory) it would be better to port
that function, but as of now it doesn't really make a difference from
the point of view of program-execution correctness.

The patch was still unpolished for full submission, sorry for that.

HTH and regards,

Vittorio F Digilio

Posted at Nginx Forum:

Hello!

On Wed, Dec 11, 2013 at 07:57:32AM -0500, itpp2012 wrote:

[…]

A) about the added line ngx_accept_mutex_held=0 in ngx_event.c

[…]

The line added between 259 and 260 tried to fix this too, but there a
call to ngx_disable_accept_events(…) is missing. To be completely
correct and not result in the partially incorrect behaviour you
correctly pointed out, such a call must be added too.

This is wrong as well. There is no reason to disable accept events
until we are sure that we weren't able to re-acquire the accept mutex
before going into the kernel to wait for events. It's just a waste of
resources.

You are misunderstanding the code, probably because the
ngx_accept_mutex_held variable has a somewhat misleading name. It is
not expected to mean "we have the accept mutex locked"; it means "we
had the accept mutex locked on the previous iteration, and we have to
disable accept events if we aren't able to re-acquire it".

There is no need to fix it, it's not broken.

[…]

B) About the second proposed patch:

[…]

Hopefully at this point it should be clear enough that releasing the
accept_mutex directly with

*ngx_accept_mutex.lock = 0;

is as safe as calling ngx_shmtx_force_unlock(); the choice between the
two I deem merely cosmetic (they're functionally equivalent), so my
choice went for the more performant, straight variable assignment.

Even considering the code currently there in the win32 version, there
is no guarantee that reading *ngx_accept_mutex.lock is atomic. While
it usually is, in theory it can be non-atomic, and the code will start
doing wrong things due to the if() check unexpectedly succeeding.

Not wanting to make this post longer than it is, I conclude by just
mentioning that, as of now, the accept mutex is the only mutex living
in shared memory, so unlocking other possibly shared mutexes is
likewise not required.

That’s not true. Something like

$ grep -r shmtx_lock src/

will give you an idea of where shared memory mutexes are used in
nginx.

While it’s tricky to get all of this working on modern Windows
versions with ASLR, the accept mutex is certainly not the only
shared memory mutex used in nginx.

As I already wrote, I don't object to adding an unlock here, but I
would like to see code which is correct and in line with the unix
version of the code.
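
To illustrate, such a change might look roughly like this in the
master's wait loop (a hypothetical sketch only, reusing
ngx_shmtx_force_unlock() instead of the raw assignment so that both
the check and the clear happen in one atomic operation):

/* hypothetical sketch for src/os/win32/ngx_process_cycle.c:
 * force-unlock the accept mutex if the dead worker held it,
 * mirroring ngx_unlock_mutexes() on unix */

if (ngx_shmtx_force_unlock(&ngx_accept_mutex, ngx_processes[n].pid)) {
    ngx_log_error(NGX_LOG_ALERT, cycle->log, 0,
                  "the accept mutex was held by dead process %P",
                  ngx_processes[n].pid);
}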


Maxim D.
http://nginx.org/