Fix sched_setaffinity(2) call failure

Hi,

‘cpu_set_t’ should be used instead of ‘long’ when calling
sched_setaffinity(2). Otherwise it may fail on some Linux systems.
Please see the attachment for more detail.

Regards,

Hi,

On Sat, Jan 14, 2012 at 2:16 AM, Maxim D. [email protected]
wrote:

According to sched_setaffinity(2) at least in kernels >= 2.6.9 it
should be ok to use any length as long as bits above kernel mask
used aren’t referenced.

I’ve tested it on Linux 2.6.9 (Red Hat Enterprise Linux AS release 4)
and
unfortunately sched_setaffinity(2) failed.

+++ b/src/os/unix/ngx_process_cycle.c Fri Jan 13 11:58:01 2012 -0500
“sched_setaffinity(0x%08Xl) failed”,
cpu_affinity);
}

Could you please test if it works for you?

Yes, it works on one of my Linux box which has two cores. But I have not
tested it on more other systems yet.

+++ nginx-1.1.12-bugfix/src/core/nginx.c 2012-01-12 11:06:00.000000000

We use cpu_set_t and CPU_* macros in the patch not only because it’s
more
generic and not depend on the implementation of the kernel (I guess the
code in Nginx now is inspired by some early kernel implementation,
i.e.
Linux 2.5.x), but also we will have no 32-core-limitation any more
(‘“worker_cpu_affinity” supports up to 32 CPU only’).

I agree that the set_cpu_affinity related code in Nginx core should be
platform-independent. But the code now is not (please look at those
NGX_HAVE_SCHED_SETAFFINITY macros). And I do recommend move the
implementation of ngx_set_cpu_affinity() to the os/ directory when
worker_cpu_affinity is supported on other platforms.

Thank you very much for the review :slight_smile:

Regards,

Hello!

On Thu, Jan 12, 2012 at 11:37:00PM +0800, Joshua Z. wrote:

Hi,

‘cpu_set_t’ should be used instead of ‘long’ when calling
sched_setaffinity(2). Otherwise it may fail on some Linux systems.
Please see the attachment for more detail.

According to sched_setaffinity(2) at least in kernels >= 2.6.9 it
should be ok to use any length as long as bits above kernel mask
used aren’t referenced.

I believe the real problem is that code uses 32 as cpusetsize,
while this is the size in bytes. Hence it references arbitrary
memory and this causes EINVAL to be returned by glibc if there are
bits referenced above kernel cpumask size, see
sysdeps/unix/sysv/linux/sched_setaffinity.c in glibc’s sources.

The following patch should be enough:

diff -r e1296af53cc0 src/os/unix/ngx_process_cycle.c
— a/src/os/unix/ngx_process_cycle.c Mon Dec 26 00:00:00 2011 +0400
+++ b/src/os/unix/ngx_process_cycle.c Fri Jan 13 11:58:01 2012 -0500
@@ -914,7 +914,10 @@
ngx_log_error(NGX_LOG_NOTICE, cycle->log, 0,
“sched_setaffinity(0x%08Xl)”, cpu_affinity);

  •    if (sched_setaffinity(0, 32, (cpu_set_t *) &cpu_affinity) == 
    

-1) {

  •    if (sched_setaffinity(0, sizeof(cpu_affinity),
    
  •                          (cpu_set_t *) &cpu_affinity)
    
  •        == -1)
    
  •    {
           ngx_log_error(NGX_LOG_ALERT, cycle->log, ngx_errno,
                         "sched_setaffinity(0x%08Xl) failed", 
    

cpu_affinity);
}

Could you please test if it works for you?

It passes my limited testing, but I wasn’t really able to
reproduce the problem without modifying the sched_setaffinity()
call in question to use bigger cpusetsize.

[…]

  • ngx_uint_t i, n;
    }
    […]

While I generally agree that using cpu_set_t and CPU_* macros
explicitly would be better, I don’t think we want to introduce
linux-centric code here in platform-independent code.

Instead we probably want to leave this code generic enough to be
useable with other scheduler interfaces, and convert the data
somewhere near platform-specific whatever_setaffinity() calls if
needed. (Likely unneeded for the Linux with the above patch.)

Maxim D.

Hello!

On Mon, Jan 16, 2012 at 01:52:09PM +0800, Joshua Z. wrote:

[…]

  •        == -1)
    

tested it on more other systems yet.
Ok, I’ll commit it then, as this looks like an obvious fix.

[…]

We use cpu_set_t and CPU_* macros in the patch not only because it’s more
generic and not depend on the implementation of the kernel (I guess the
code in Nginx now is inspired by some early kernel implementation, i.e.
Linux 2.5.x), but also we will have no 32-core-limitation any more
(’“worker_cpu_affinity” supports up to 32 CPU only’).

Are you facing the 32-core limit in practice?

I agree that the set_cpu_affinity related code in Nginx core should be
platform-independent. But the code now is not (please look at those
NGX_HAVE_SCHED_SETAFFINITY macros).

As of now, the only part which isn’t portable is
sched_setaffinity() call itself. Other parts (i.e. config
parsing) may be reused for other whatever_setaffinity() calls as
soon as we add support.

And I do recommend move the
implementation of ngx_set_cpu_affinity() to the os/ directory when
worker_cpu_affinity is supported on other platforms.

Yes, that’s basically what I suggest also. We want some generic
interface, and platform-specific wrappers which do needed
conversions/magic.

Maxim D.

Hi,

On Mon, Jan 16, 2012 at 6:51 PM, Maxim D. [email protected]
wrote:

    ngx_log_error(NGX_LOG_NOTICE, cycle->log, 0,

cpu_affinity);
[…]

OK. This fix looks good to me now. Just commit this piece of code is
enough.

Regards,