Memory leak on 64-bit architecture

Hello,

Using my “booh-classifier” program, a rather severe memory leak shows up
on 64-bit architectures (both Debian and Mandriva, with, for example,
ruby-1.8.7-9p174 and ruby-gtk2-0.19.2) but not on 32-bit architectures.

I’ve tried to write a small test program, but the memory leak doesn’t
show up in it.

At this point, I’m looking for advice on the best approach to tracking
down this issue (what tools would make it easy to track, spot, or
reproduce it with a small program)?

Thanks.


Guillaume C. - http://zarb.org/~gc/

2009/10/20 Guillaume C. [email protected]

Hello,

Using my “booh-classifier” program, a rather severe memory leak shows up
on 64-bit architectures (both Debian and Mandriva, with, for example,
ruby-1.8.7-9p174 and ruby-gtk2-0.19.2) but not on 32-bit architectures.

Do you have any pointer to where booh-classifier might be leaking?

And how do I check that it leaks memory?

I’ve tried to write a small test program, but the memory leak doesn’t
show up in it.

At this point, I’m looking for advice on the best approach to tracking
down this issue (what tools would make it easy to track, spot, or
reproduce it with a small program)?

Sorry, I never had to track down a memory leak.

regards

Simon A.

On Tue, Oct 20, 2009 at 4:46 PM, Simon A. [email protected] wrote:

2009/10/20 Guillaume C. [email protected]

Hello,

Using my “booh-classifier” program, a rather severe memory leak shows up
on 64-bit architectures (both Debian and Mandriva, with, for example,
ruby-1.8.7-9p174 and ruby-gtk2-0.19.2) but not on 32-bit architectures.

Do you have any pointer to where booh-classifier might be leaking?

I’m pretty much convinced it’s rg2, so it should be in rg2’s general
object binding system, but then…

And how do I check that it leaks memory?

Start it once first, by invoking “booh-classifier”. In
Edit/Preferences, set the cache memory use to 100 MB.

Stop it, then start it again by invoking “booh-classifier -v3
/path/to/photos”, where the path is a place where you store tens to
hundreds of photos (booh-classifier will recurse, so the path may be the
top directory of photos already classified into subdirectories).

Wait for the loading to finish, so that the console stops flooding.
Then, when navigating through photos (left/right keyboard arrows), more
photos will be loaded, and occasionally the GC will be triggered to help
Ruby’s GC notice that some rg2 objects are not referenced anymore,
before the size of the process grows too much (why the GC needs to be
fired explicitly is a separate, old issue, which remains unanswered).

On an x86 architecture (for example, i686 Intel(R) Pentium(R) 4 CPU
3.00GHz GNU/Linux), you’ll regularly notice in the console something
looking like:

VmSize: 216148
too much RSS, stopping preloading, triggering GC
GC in 0.184628 s
VmSize: 67052

I.e., triggering the GC reclaims memory and keeps the process from
hammering the system too much (the second VmSize figure drops below 100 MB).

On an x86_64 architecture (for example, x86_64 Genuine Intel(R) CPU
3.20GHz GNU/Linux):

VmSize: 335124
too much RSS, stopping preloading, triggering GC
GC in 0.083613 s
VmSize: 210428

then, a while later (not necessarily immediately, so make sure to
navigate through more than a couple of photos if the problem doesn’t
show up quickly):

VmSize: 254844
too much RSS, stopping preloading, triggering GC
GC in 0.068304 s
VmSize: 254844

then:

VmSize: 257780
too much RSS, stopping preloading, triggering GC
GC in 0.065692 s
VmSize: 257780

etc

Of course, you can also watch in “top” as the process grows again and
again and again.

I’ve tried to write a small test program, and in it the memory leak
doesn’t exhibit.

At that point, I’m looking for advices as to what would be the best
approach to track down this issue (what tools would make it easy to
track/spot or reproduce with a small program)?

Sorry, I never had to track down a memory leak.

Usually, valgrind is helpful (at least it helps me a lot), but it’s
always harder with non-native programs (big slowdown, false positives,
difficulty tracking the real origin of an allocation), so I was hoping
for some dedicated techniques. E.g., debugging the interpreter’s
allocation system should be a usual process, and I can’t imagine Ruby
developers don’t have tools to assist. Maybe Kouhei will know some. (I
confess I didn’t do extensive research.)
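One interpreter-level technique that doesn’t need valgrind is to diff live-object counts per class with ObjectSpace, which works in stock Ruby: classes whose counts keep growing even after a full GC are leak candidates. A generic sketch (not booh-classifier code; the simulated leak and helper name are mine):

```ruby
# Count live objects per class by walking the Ruby heap.
def object_counts
  counts = Hash.new(0)
  ObjectSpace.each_object { |obj| counts[obj.class] += 1 }
  counts
end

before = object_counts

# Simulate a leak: 1000 strings kept reachable through a global.
$retained = Array.new(1000) { "leaky" * 5 }

GC.start # a real leak survives a full collection

after  = object_counts
growth = after.select { |klass, n| n > before[klass] + 100 }
# `growth` lists the classes that grew despite the GC; in this simulated
# case String dominates it, pointing straight at the retained objects.
</antml>```

Applied to booh-classifier, a per-class diff taken every few hundred navigated photos would show whether it is Gdk::Pixbuf objects (or something else from rg2) that keep accumulating.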


Guillaume C. - Guillaume Cottenceau

I tried on Debian Sid 64-bit, Ruby 1.9 (I had to make a few changes).

what changes, btw?

vmsize at startup is around 350 MB, so way above the 100 MB limit.

yes, at startup the process size is much higher on 64-bit than on
32-bit. I don’t know why.

I see what you describe.

However, I replaced line 525 with:

    free_cache([])
    gc
    get_mem

And I was not able to reproduce the problem.

ah, good to know. thanks.
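For readers without the source at hand, a hypothetical sketch of those three calls (only the names free_cache, gc and get_mem come from the thread; the bodies below are my guesses, and booh-classifier’s real code may differ):

```ruby
# Hypothetical reconstruction of the helpers named in the thread.
$cache = {} # assumed: a pixbuf cache keyed by file path

# Drop every cached entry whose key is not in `keep`.
def free_cache(keep)
  $cache.delete_if { |path, _pixbuf| !keep.include?(path) }
end

# Force a full garbage collection right now.
def gc
  GC.start
end

# Report the process VmSize in kB, read from /proc (Linux only).
def get_mem
  line = File.readlines("/proc/self/status")
             .find { |l| l.start_with?("VmSize:") }
  line ? line.split[1].to_i : 0
end
```

If that reading is right, free_cache([]) empties the whole cache before the GC runs, which would explain why the leak disappears: nothing keeps the rg2 objects reachable anymore.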



Guillaume C. - Guillaume Cottenceau

2009/10/21 Guillaume C. [email protected]

Do you have any pointer where booh-classifier might be leaking ?

I’m pretty much convinced it’s rg2, so it should be in rg2’s general
object binding system, but then…

I meant where in booh-classifier you would think it was happening.

Wait for the loading to finish, to stop the console flooding. Then

too much RSS, stopping preloading, triggering GC
etc

Of course, you can also watch in “top” as the process grows again and
again and again.

I tried on Debian Sid 64-bit, Ruby 1.9 (I had to make a few changes).

vmsize at startup is around 350 MB, so way above the 100 MB limit.

I see what you describe.

However, I replaced line 525 with:

    free_cache([])
    gc
    get_mem

And I was not able to reproduce the problem.

See if it works for you too.

Why it would do that on 64-bit and not on 32-bit is a mystery.

Simon A.