Running out of memory during BER simulations

addis_a · November 25, 2014, 6:17pm

Hi list,

I’m currently performing BER measurements for an IEEE 802.15.4 system. In order to do so, I have created a flowgraph (in GRC/Python) that is executed for different SNR values until a certain number of bits has been processed. Between every run, I call tb.stop() followed by tb.wait(). Unfortunately, after a few runs (around 20), I get the following error message:

gr::vmcircbuf_sysv_shm: shmget (2): No space left on device
gr::vmcircbuf_sysv_shm: shmget (2): No space left on device
gr::vmcircbuf_sysv_shm: shmget (2): No space left on device
gr::buffer::allocate_buffer: failed to allocate buffer of size 64 KB
gr::vmcircbuf_sysv_shm: shmget (2): No space left on device
gr::vmcircbuf_sysv_shm: shmget (2): No space left on device
gr::vmcircbuf_sysv_shm: shmget (2): No space left on device
gr::buffer::allocate_buffer: failed to allocate buffer of size 64 KB
terminate called after throwing an instance of ‘std::bad_alloc’
what(): std::bad_alloc
Aborted (core dumped)

sysctl says:
kernel.shmall = 2097152
kernel.shmmax = 2147483648

I don’t really understand why this happens, as my memory usage is very stable and below 20% of total RAM throughout the simulation. In my understanding, calling tb.stop() and tb.wait() should free anything the flowgraph allocated before. I also noticed that the block numbers (which are displayed because I call set_min_output_buffer()) increase monotonically during the simulation even though the top block is stopped multiple times. This might be completely unrelated, but it suggests that my understanding might not be correct.

Any ideas?

–Felix W.

Felix_W · November 25, 2014, 6:39pm

Hi Felix,

On 11/25/2014 06:15 PM, Felix W. wrote:

Between every run, I call tb.stop() followed by tb.wait().
Unfortunately, after a few runs (around 20), I get the following
error message:

gr::vmcircbuf_sysv_shm: shmget (2): No space left on device
I have a suspicion.
First of all, I know this sounds basic, but you’re not using a 32bit
GR on your 64 bit machine, are you? (that would explain running out of
RAM faster, just because process memory is so very limited for 32bit
processes)

then: vmcircbuf is one of the things I always was kind of hesitant to
touch (or even try to understand in depth), just because it deals with
a lot of POSIX/OS specifics that I’m not an expert in, but:

Maybe tb.stop()/wait() doesn’t actually successfully unmap the shared memory segments of the buffers; there’s a global maximum number of segments, and it is 4096 by default (/proc/sys/kernel/shmmni). However, this shouldn’t be the problem at hand: there’s an error number for this condition. And it would be: ENOSPC. Great. The same thing as for “No space left on device”; Thank you, Posix.1…
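
To check whether a run is really hitting the segment-count limit, one can compare the number of existing System V segments with kernel.shmmni. The following is a minimal sketch, assuming a Linux system where /proc/sysvipc/shm holds one header line followed by one line per segment. The function names are made up for illustration and are not part of GNU Radio.

```python
import errno
import os

SHMMNI_PATH = "/proc/sys/kernel/shmmni"  # system-wide limit on the number of segments
SHM_TABLE_PATH = "/proc/sysvipc/shm"     # assumed layout: header line, then one line per segment


def parse_shmmni(text):
    """Parse the contents of /proc/sys/kernel/shmmni into an int."""
    return int(text.strip())


def count_segments(table_text):
    """Count segment rows in /proc/sysvipc/shm-style text (header excluded)."""
    lines = [line for line in table_text.splitlines() if line.strip()]
    return max(len(lines) - 1, 0)


def read_text(path):
    with open(path) as handle:
        return handle.read()


def report():
    """Print how many SysV segments exist compared with the limit."""
    limit = parse_shmmni(read_text(SHMMNI_PATH))
    used = count_segments(read_text(SHM_TABLE_PATH))
    print(f"SysV shm segments: {used} of {limit}")
    # shmget reports this same errno text for more than one limit:
    print("ENOSPC means:", os.strerror(errno.ENOSPC))
```

Calling report() between simulation runs would show whether the segment count keeps growing towards the limit.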

Cheers,
Marcus

Felix_W · November 25, 2014, 6:46pm

I built GR from source on the machine and the machine is running Ubuntu 14.04 64-bit. So I guess my GR is 64-bit, too.

Actually, while watching the simulation running at the moment, I noticed that memory usage is indeed increasing slowly (but at a rate that wouldn’t fill up the system memory in days…).

2014-11-25 18:37 GMT+01:00 Marcus Müller [email protected]:

Felix_W · November 25, 2014, 6:50pm

Does increasing kernel.shmmni using sysctl to let’s say 16k improve
the situation?

On 11/25/2014 06:37 PM, Marcus M. wrote:

you’re not using a 32bit GR on your 64 bit machine, are you? (that
segments, and it 4096 by default (/proc/sys/kernel/shmmni).
However, this shouldn’t be the problem at hand: there’s an error
number for this condition. And it would be: ENOSPC. Great. The same
thing as for “No space left on device”; Thank you, Posix.1…

Cheers, Marcus


Felix_W · November 25, 2014, 7:30pm

That fixed it! Thanks!!

2014-11-25 18:48 GMT+01:00 Marcus Müller [email protected]:

Felix_W · November 26, 2014, 3:31pm

Hi Everybody,

I had a very interesting dive into vmcircbuf_sysv_shm today.
Questions that arose from that are:
a) why is a SYSV circbuffer implementation the default one on my Linux
3.17 box?
b) SYSV shared memory segments have a flag that should tell the OS to
release the segment as soon as its global reference count goes to 0. If
I read vmcircbuf_sysv_shm.cc correctly, it’s not getting set for all
segments. That is what could have caused Felix’ problems. Is that a bug?
c) I think I might need someone to explain to me why our circular
buffers depend on shared memory – can’t one just mmap() anonymously
without generating shared memory handles underneath?
d) what’s the order in which circbuffer implementations are chosen?
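
The flag mentioned in b) corresponds to shmctl() with IPC_RMID: a segment marked this way is destroyed once the last attachment is gone. Below is a minimal sketch of that pattern using ctypes, assuming Linux (the numeric constants are the Linux values, and CDLL(None) is assumed to expose the libc symbols). It is a generic illustration, not what vmcircbuf_sysv_shm.cc does; the helper names are hypothetical.

```python
import ctypes
import os

IPC_PRIVATE = 0     # Linux value: always create a fresh segment
IPC_CREAT = 0o1000  # Linux value
IPC_RMID = 0        # Linux value: mark the segment for destruction

libc = ctypes.CDLL(None, use_errno=True)
libc.shmget.argtypes = [ctypes.c_int, ctypes.c_size_t, ctypes.c_int]
libc.shmget.restype = ctypes.c_int
libc.shmat.argtypes = [ctypes.c_int, ctypes.c_void_p, ctypes.c_int]
libc.shmat.restype = ctypes.c_void_p
libc.shmctl.argtypes = [ctypes.c_int, ctypes.c_int, ctypes.c_void_p]
libc.shmctl.restype = ctypes.c_int
libc.shmdt.argtypes = [ctypes.c_void_p]
libc.shmdt.restype = ctypes.c_int

SHMAT_FAILED = ctypes.c_void_p(-1).value  # shmat returns (void *)-1 on error


def _fail(what):
    err = ctypes.get_errno()
    raise OSError(err, f"{what}: {os.strerror(err)}")


def create_self_cleaning_segment(size):
    """Create and attach a segment, then mark it for removal right away."""
    shmid = libc.shmget(IPC_PRIVATE, size, IPC_CREAT | 0o600)
    if shmid == -1:
        _fail("shmget")
    addr = libc.shmat(shmid, None, 0)
    if addr == SHMAT_FAILED:
        err = ctypes.get_errno()
        libc.shmctl(shmid, IPC_RMID, None)  # do not leak the segment
        raise OSError(err, f"shmat: {os.strerror(err)}")
    # The segment stays usable while attached, but no longer outlives us.
    if libc.shmctl(shmid, IPC_RMID, None) != 0:
        _fail("shmctl(IPC_RMID)")
    return shmid, addr


def release(addr):
    """Detach; the kernel destroys the segment after the last detach."""
    if libc.shmdt(addr) != 0:
        _fail("shmdt")
```

With this ordering, a process that crashes or forgets to clean up does not leave segments behind to count against kernel.shmmni.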

Greetings,
Marcus

Felix_W · November 26, 2014, 3:59pm

On 11/26/2014 03:30 PM, Marcus M. wrote:

c) I think I might need someone to explain to me why our circular
buffers depend on shared memory – can’t one just mmap() anonymously
without generating shared memory handles underneath?

Despite what the header says, the sysv implementation doesn’t call mmap(). Which I say with 80% confidence, because that’s as far as I’ll go with the circbuff stuff.

Felix_W · November 26, 2014, 3:37pm

On 11/26/2014 09:30 AM, Marcus M. wrote:

c) I think I might need someone to explain to me why our circular
buffers depend on shared memory – can’t one just mmap() anonymously
without generating shared memory handles underneath?
d) what’s the order in which circbuffer implementations are chosen?

And the answers are here:

https://github.com/gnuradio/gnuradio/blob/master/gnuradio-runtime/lib/vmcircbuf.cc#L104

        return result;
    }

    void vmcircbuf_sysconfig::set_default_factory(vmcircbuf_factory* f)
    {
        gr::vmcircbuf_prefs::set(FACTORY_PREF_KEY, f->name());
        s_default_factory = f;
    }

    // ------------------------------------------------------------------------
    //                  test code for vmcircbuf factories
    // ------------------------------------------------------------------------

    static void init_buffer(const vmcircbuf& c, int counter, int size)
    {
        unsigned int* p = (unsigned int*)c.pointer_to_first_copy();
        for (unsigned int i = 0; i < size / sizeof(int); i++)
            p[i] = counter + i;
    }

If there is interest, I’ll send in a patch to move:

result.push_back (gr::vmcircbuf_mmap_tmpfile_factory::singleton());

to the top of the list of factories.

Philip

Felix_W · November 26, 2014, 5:10pm

Hi Martin,

Despite what the header says, the sysv implementation doesn’t call mmap().
I realize the documentation comments of all the unix circbuffers are
just copypasta…
Which I say with a 80% confidence, because that’s as far as I’ll go with
the circbuff stuff.
I know that “this source code reeks of quicksand” feeling…

We also have mmap()ing implementations, I guess: vmcircbuf_mmap_shm_open and vmcircbuf_mmap_tmpfile. Only the second doesn’t use shared memory; it is the last one in the preference list, but it’s the one that I’d actually use. Also, in that one we get a file handle to a temporary file, unlink the file directly afterwards, and just use that handle to be able to ftruncate/mmap memory, which we could likely also do by just mmap(…, size, … ,MAP_ANONYMOUS, -1,

So I’m getting more and more confused[1].

Greetings,
Marcus

[1] that doesn’t happen very often. I have a fairly high level of base
confusion…
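
The temporary-file approach described above (create a file, unlink it, ftruncate it, then mmap it) can be sketched in Python as follows. This is an illustration only, not GNU Radio’s code, and double_map is a hypothetical name. It maps the same file twice to show that both mappings alias the same pages, which is what a doubly mapped circular buffer relies on and what a file handle makes possible. Python’s mmap module cannot place the second mapping directly behind the first, so only the aliasing is shown.

```python
import mmap
import os
import tempfile


def double_map(size):
    """Map one unlinked temporary file twice and return both mappings."""
    fd, path = tempfile.mkstemp()
    try:
        os.unlink(path)         # the name disappears; the open handle keeps the file alive
        os.ftruncate(fd, size)  # give the file its final length
        first = mmap.mmap(fd, size)   # shared, read/write mapping (Unix default)
        second = mmap.mmap(fd, size)  # second view of the very same pages
    finally:
        os.close(fd)            # the mappings hold their own duplicated descriptors
    return first, second


first, second = double_map(mmap.PAGESIZE)
first[0:5] = b"hello"           # write through one mapping ...
print(second[0:5])              # ... and read it back through the other
first.close()
second.close()
```

Once both mappings are closed, nothing refers to the file any more and the kernel reclaims it, so no named object is left behind.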