Forum: Ruby Minimizing memory allocations

Announcement (2017-05-07): www.ruby-forum.com is now read-only since I unfortunately do not have the time to support and maintain the forum any more. Please see rubyonrails.org/community and ruby-lang.org/en/community for other Rails- and Ruby-related community platforms.
Ilmari Heikkinen (Guest)
on 2006-01-22 15:14
(Received via mailing list)
So there I was this morning, staring at an ObjectSpace counter telling
me that I'm allocating 1500 Arrays and 10000 Floats per frame. Which
pretty much ground my framerate to a halt by requiring a 0.2s GC run
every other frame. So I decided to get down and rid my code of as many
allocations as possible.
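
For anyone wanting to take similar readings, here is a minimal sketch of per-frame allocation counting with ObjectSpace (MRI-specific; the "frame" body here is just a stand-in for real per-frame work):

```ruby
# Count live instances of a class via ObjectSpace (available in MRI).
def live_count(klass)
  ObjectSpace.each_object(klass).count
end

# Take a reading around one "frame" of work, with GC disabled so
# freshly created garbage isn't swept away before we count it.
GC.disable
before = live_count(Array)
100.times { [1, 2, 3] }      # stand-in for one frame's work
after = live_count(Array)
GC.enable

puts "Arrays allocated this frame: ~#{after - before}"
```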

The first thing I discovered was a bit of code looking like this (not
to mention that each was actually getting called several times per
frame due to a bug):

@mtime = ([@mtime] + ([@stroke||nil,
@fill||nil].compact+children).map{|c| c.mtime}).max

Quite unreadable, and it was responsible for a large part of the Array
allocations too. A quick change whittled Array allocation count for
that method to 0, with the price of making it less idiomatic:

children.each{|c|
  cm = c.mtime
  @mtime = cm if cm > @mtime
}
if @stroke
  sm = @stroke.mtime
  @mtime = sm if sm > @mtime
end
if @fill
  fm = @fill.mtime
  @mtime = fm if fm > @mtime
end

Now the Array allocations dropped down to hundreds, a much more
reasonable number, but still way too many compared to what was
happening in the frame. The only thing that should've changed was one
number. So the extra 500 Arrays were a bit of a mystery.

Some investigation revealed places where I was using
Array#each_with_index. Very nice, very idiomatic, very allocating a
new Array on each iteration. So replace it with the following and watch the
alloc counts fall:

i = 0
arr.each{|e|
  do_stuff_with e
  i += 1
}

By doing that in a couple of strategic places and some other
optimizations, the Array allocation count fell to 150. Of which 90
were allocated in the object Z-sorting method, which'd require a C
implementation to get its allocation count to 0. The Array allocation
fight was heading towards diminishing returns, and my current scene
didn't need to use Z-sorting, so I turned my attention to the Floats.

By now, the Float count had also dropped a great deal, but it was
still a hefty 3000 Floats per frame. With each float weighing 16
bytes, that was nearly 3MB per second when running at 60fps. Searching
for the method that was allocating all those Floats, I ran into
something weird. #transform was allocating 6-32 Floats per call. And
it's one of the functions that get called for every scene object, in
every frame.  Also, it's written in C.

That left me stymied. Surely there must be some mistake, I thought,
the C function didn't seem to be allocating _any_ Ruby objects. But
little did I know.

The C function called the NUM2DBL-macro in several places to turn Ruby
numbers into doubles. Reading the source for NUM2DBL showed that it
calls the rb_num2dbl C function, which takes a Ruby number and returns
a C double. Reading the source to rb_num2dbl revealed this:

double
rb_num2dbl(val)
    VALUE val;
{
    switch (TYPE(val)) {
      case T_FLOAT:
        return RFLOAT(val)->value;

      case T_STRING:
        rb_raise(rb_eTypeError, "no implicit conversion to float from string");
        break;

      case T_NIL:
        rb_raise(rb_eTypeError, "no implicit conversion to float from nil");
        break;

      default:
        break;
    }

    return RFLOAT(rb_Float(val))->value;
}

rb_Float gets called on all Fixnums and Bignums, of which there happened
to be quite a few in my scene state arrays. Checking out rb_Float
gave the explanation for the Float allocations:

    switch (TYPE(val)) {
      case T_FIXNUM:
        return rb_float_new((double)FIX2LONG(val));
      /* ... */
      case T_BIGNUM:
        return rb_float_new(rb_big2dbl(val));

In order to turn a Fixnum into a double, it's allocating a new Float!
With that figured out, I took and rewrote rb_num2dbl as rb_num_to_dbl,
this time handling Fixnums and Bignums as special cases as well:

double rb_num_to_dbl( VALUE val )
{
  switch (TYPE(val)) {
    case T_FLOAT:
      return RFLOAT(val)->value;

    case T_FIXNUM:
      return (double)FIX2LONG(val);

    case T_BIGNUM:
      return rb_big2dbl(val);

    case T_STRING:
      rb_raise(rb_eTypeError, "no implicit conversion to float from string");
      break;

    case T_NIL:
      rb_raise(rb_eTypeError, "no implicit conversion to float from nil");
      break;

    default:
      break;
  }
  return RFLOAT(rb_Float(val))->value;
}

The result? Float allocations fell to 700 per frame from the original
3000. And now I'm getting a GC run "only" every 36 frames. Not perfect
by any means, but a decent start.


Have stories of your own? Tips for memory management? Ways to track
allocations? Post them, please.

Cheers,
Ilmari
Eero Saynatkari (rue)
on 2006-01-22 16:34
Ilmari Heikkinen wrote:
> So there I was this morning, staring at an ObjectSpace counter telling
> me that I'm allocating 1500 Arrays and 10000 Floats per frame. Which
> pretty much ground my framerate to a halt by requiring a 0.2s GC run
> every other frame. So I decided to get down and rid my code of as many
> allocations as possible.
>
> < snip due to ruby-forum restrictions />
>
> In order to turn a Fixnum into a double, it's allocating a new Float!
> With that figured out, I took and rewrote rb_num2dbl as rb_num_to_dbl,
> this time handling Fixnums and Bignums as special cases as well:
>
> < snip />
>
> The result? Float allocations fell to 700 per frame from the original
> 3000. And now I'm getting a GC run "only" every 36 frames. Not perfect
> by any means, but a decent start.

Nice! I wonder if this would be eligible for core patching?

> Have stories of your own? Tips for memory management? Ways to track
> allocations? Post them, please.

Nope, but I enjoyed reading this one, thanks!

> Cheers,
> Ilmari


E
Ilmari Heikkinen (Guest)
on 2006-01-22 16:54
(Received via mailing list)
On 1/22/06, Ilmari Heikkinen <ilmari.heikkinen@gmail.com> wrote:
> pretty much ground my framerate to a halt by requiring a 0.2s GC run

Argh, sorry, magnitude error. The correct GC run time is 0.02s. Not so
bad, but still a 60fps -> 20fps glitch.
Gavin Kistner (Guest)
on 2006-01-22 18:13
(Received via mailing list)
On Jan 22, 2006, at 8:54 AM, Ilmari Heikkinen wrote:

> On 1/22/06, Ilmari Heikkinen <ilmari.heikkinen@gmail.com> wrote:
>> pretty much ground my framerate to a halt by requiring a 0.2s GC run
>
> Argh, sorry, magnitude error. The correct GC run time is 0.02s. Not so
> bad, but still a 60fps -> 20fps glitch.

In a completely separate world (Lua code running under a scene graph
written in C++; no Ruby anywhere) I recently hit a place where I
thought I needed to allocate ~200 Vector objects per frame.

(I was using a recursive function to calculate 3D bezier curves[1],
which needed to allocate and preserve 4 new Vector objects each call.)

It was causing noticeable lurching when the GC kicked in
occasionally. I found two interesting things:

1) Running Lua's GC manually *every frame* resulted in better memory
performance and faster overall framerate than running it every 2 or
10 or 50 or 100 frames. My only (lame) guess was that waiting longer
yielded a larger memory pool to wade through when looking for items
to GC. (?)

2) Because I really didn't need to preserve the 200 Vectors from
frame to frame (the final results of the recursive calculation were
copied into the position vectors for points on the line), I was able
to remove the per-frame memory allocations altogether by abstracting
the Vector allocation into a pooled-vector manager. Doing this gave
me far better frame rates than I was getting with the GC-every-frame
approach.

This isn't applicable specifically to Ruby, but applicable generally:
when you can't remove memory allocations, see if you can at least
re-use them.

In an attempt to make this post Ruby-specific, I give you a pooled
object manager that I've just written, based on the Lua version I
wrote at work. You create a pool by specifying an object that is the
template/factory, and a method to call on that object (defaults to
'new'). Every time you ask for an object, it will hand you one from
the pool, or create a new instance. The #reset method makes all items
in the pool re-usable again (i.e. call at the start of a new frame).


class ObjectPool
   def initialize( template, method=:new, template_in_pool=false )
     @template = template
     @method = method
     @pool = [ ]
     if template_in_pool
       @pool << template
     end
     reset
   end

   # Make all items in the pool available again
   def reset
     @next_available = 0
   end

   # Remove references to all items not currently in use
   def drain
     @pool.slice!( @next_available..-1 )
   end

   # Return a new item from the pool, creating a new one if needed
   def next
     unless item = @pool[ @next_available ]
       @pool << ( item = @template.send( @method ) )
     end
     @next_available = @next_available + 1
     item
   end

   def inspect
     "<ObjectPool of #{@pool.size} #{@template}>"
   end
end

class Vector
   attr_accessor :x, :y, :z
   def initialize( x=0, y=0, z=0 ) @x, @y, @z = x,y,z end
   def clone() self.class.new( @x, @y, @z ) end
   def inspect() "<Vector:0x#{object_id.to_s(16)} #@x,#@y,#@z>" end
end

##################################################################
# Showing how to create a pool of class instances
##################################################################
pool = ObjectPool.new( Vector )
p pool
#=> <ObjectPool of 0 Vector>

3.times{ |i|
   v = pool.next
   v.x = v.y = v.z = i
   p v
}
#=> <Vector:0x195518 0,0,0>
#=> <Vector:0x195392 1,1,1>
#=> <Vector:0x19520c 2,2,2>

p pool
#=> <ObjectPool of 3 Vector>

pool.reset
3.times{ p pool.next }
#=> <Vector:0x195518 0,0,0>
#=> <Vector:0x195392 1,1,1>
#=> <Vector:0x19520c 2,2,2>

p pool
#=> <ObjectPool of 3 Vector>


##################################################################
# Showing how to create a pool based off of a template object
##################################################################
v = Vector.new( 1, 2, 3 )
pool2 = ObjectPool.new( v, :clone, true )
p pool2
#=> <ObjectPool of 1 #<Vector:0x32ad64>>

3.times{ p pool2.next }
#=> <Vector:0x1956b2 1,2,3>
#=> <Vector:0x194672 1,2,3>
#=> <Vector:0x1944ec 1,2,3>

pool2.reset
3.times{ p pool2.next }
#=> <Vector:0x1956b2 1,2,3>
#=> <Vector:0x194672 1,2,3>
#=> <Vector:0x1944ec 1,2,3>

p pool2
#=> <ObjectPool of 3 #<Vector:0x32ad64>>



[1] http://www.antigrain.com/research/adaptive_bezier/...
John Carter (Guest)
on 2006-01-23 04:16
(Received via mailing list)
On Sun, 22 Jan 2006, Ilmari Heikkinen wrote:

> The result? Float allocations fell to 700 per frame from the original
> 3000. And now I'm getting a GC run "only" every 36 frames. Not perfect
> by any means, but a decent start.

Good Stuff.

> Have stories of your own? Tips for memory management? Ways to track
> allocations? Post them, please.

Ok, here goes...

Useful trick one...

Just because Ruby has GC doesn't mean it knows what is garbage, merely
what isn't currently referenced.

ie. Remember to actively drop references you will never need via things
like...

   $big_hairy_global = nil
     or
   @fat_instance_variable = nil

Trick Two...

Memoization

class Foo
   def initialize( thing)
   end
end

foo = Foo.new( thing)

becomes...

class Foo
   @@memo = Hash.new{|hash,key| hash[key] = Foo.new( key)}

   def self.create_foo( thing)
     @@memo[thing]
   end

   def initialize( thing)
   end

end


foo = Foo.create_foo( thing)
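
A minimal runnable version of the same trick (the class and method names are illustrative, borrowing the pink-car example from later in this thread), showing that repeated requests hand back the identical cached object:

```ruby
class Car
  # Class-level cache: the default block builds one Car per colour,
  # lazily, the first time that colour is requested.
  @@memo = Hash.new { |hash, key| hash[key] = Car.new(key) }

  def self.create_car(colour)
    @@memo[colour]
  end

  def initialize(colour)
    @colour = colour
  end
end

a = Car.create_car("pink")
b = Car.create_car("pink")
a.equal?(b)   # same object both times, no second allocation
```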


Trick 3

Sometimes it doesn't pay to iterate over each line.

Sometimes just read the whole ruddy file in in one bang shoot and regex
over the whole thing.

ie. Instead of...
open( 'foo.txt') do |inf|
   inf.each_line do |line|
     line.chomp!
     if line =~ /thingy/
      # do stuff
     end
   end
end

Try...

IO.read( 'foo.txt').scan(/thingy/) do
  # do stuff
end
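
As a self-contained comparison of the two approaches (the file contents and pattern are invented for the example):

```ruby
require 'tempfile'

# Invented sample data for the comparison.
file = Tempfile.new('foo')
file.write("a thingy here\nnothing\nanother thingy\n")
file.flush

# Line-by-line: one String allocated per line, each tested separately.
line_hits = 0
File.foreach(file.path) do |line|
  line_hits += 1 if line =~ /thingy/
end

# Whole file at once: a single big String, scanned in one pass.
slurp_hits = File.read(file.path).scan(/thingy/).size

line_hits == slurp_hits   # => true
```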

John Carter                             Phone : (64)(3) 358 6639
Tait Electronics                        Fax   : (64)(3) 359 4632
PO Box 1645 Christchurch                Email : john.carter@tait.co.nz
New Zealand

Carter's Clarification of Murphy's Law.

"Things only ever go right so that they may go more spectacularly wrong
later."

From this principle, all of life and physics may be deduced.
Timothy Goddard (Guest)
on 2006-01-24 17:28
(Received via mailing list)
I don't know anything about Lua, but Ruby is unlikely to see any speed
benefits from garbage collecting every frame unless you're deep in
swap. Ruby uses a "mark and sweep" garbage collection scheme, which I
believe means that garbage collecting time is mostly proportional to
the number of current, referenced objects, not the amount of junk left
behind.
Paul Brannan (Guest)
on 2006-01-26 16:57
(Received via mailing list)
On Wed, Jan 25, 2006 at 12:45:41AM +0900, Timothy Goddard wrote:
> I don't know anything about Lua, but Ruby is unlikely to see any speed
> benefits from garbage collecting every frame unless you're deep in
> swap. Ruby uses a "mark and sweep" garbage collection scheme, which I
> believe means that garbage collecting time is mostly proportional to
> the number of current, referenced objects, not the amount of junk left
> behind.

The marking time is proportional to the number of referenced objects,
and the sweeping time is proportional to the number of objects that get
swept.  In general, the longer the time between sweeps, the more objects
need to be swept.

Invoking the GC every frame probably won't improve average frame rate,
but it may help decrease the variance of the time it takes to process
each frame (so you get more consistent performance).
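
A sketch of that pacing idea in Ruby (the per-frame work is a stand-in; real numbers will vary by workload):

```ruby
runs_before = GC.count   # number of GC runs so far in this process

5.times do |frame|
  1_000.times { Array.new(3) }  # one frame's worth of short-lived garbage
  GC.start                      # collect at a predictable point each frame
end

puts "GC runs triggered: #{GC.count - runs_before}"
```

Whether this wins depends on the workload; the point is that it moves the pause to a place you chose, instead of wherever the allocator happens to run out of slots.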

Paul
Dave Howell (Guest)
on 2006-01-26 18:46
(Received via mailing list)
On Jan 22, 2006, at 19:14, John Carter wrote:

>
>   end
>
> end
>
>
> foo = Foo.create_foo( thing)

Er, um, huh?

All foo-links are tapping into a global-to-class hash called @@memo.
OK...
Foo.new has been rendered useless, apparently torturing anybody who
forgets by returning nil as a non-error?
Foo.create_foo(thing) does, well, I have no idea. What's "thing"
supposed to be? And why is create_foo using square brackets?

Sigh. Sometimes Ruby is *too* idiomatic.

What *is* "memo-ization?"
unknown (Guest)
on 2006-01-26 19:19
(Received via mailing list)
Quoting Dave Howell <groups@grandfenwick.net>:

> Sigh. Sometimes Ruby is *too* idiomatic.
>
> What *is* "memo-ization?"

It isn't actually a ruby-specific term:

http://en.wikipedia.org/wiki/Memoization

-mental
Dave Howell (Guest)
on 2006-01-26 20:49
(Received via mailing list)
On Jan 26, 2006, at 10:16, mental@rydia.net wrote:

> Quoting Dave Howell <groups@grandfenwick.net>:
>
>> Sigh. Sometimes Ruby is *too* idiomatic.
>>
>> What *is* "memo-ization?"
>
> It isn't actually a ruby-specific term:

I didn't think it was. That's why the comment and the question are in
separate paragraphs.

I might be able to guess what it is if I could read the Ruby code. But
I can't figure out what the code  does.
John Carter (Guest)
on 2006-01-26 22:02
(Received via mailing list)
On Fri, 27 Jan 2006, Dave Howell wrote:

>> end
>>   end
>
> All foo-links are tapping into a global-to-class hash called @@memo. OK...
> Foo.new has been rendered useless, apparently torturing anybody who forgets
> by returning nil as a non-error?
> Foo.create_foo(thing) does, well, I have no idea. What's "thing" supposed to
> be? And why is create_foo using square brackets?
>
> Sigh. Sometimes Ruby is *too* idiomatic.
>
> What *is* "memo-ization?"


I sense deep confuzzlement so let's take that a lot slower.

OK, so this trick cannot be used everywhere. For example, if I ask you to
give me a new pink car, I expect you to go off, build a new car, paint it
pink and give it to me. If I ask you again, I expect you to do it again
and give me another (different) brand new pink car.

Suppose in the particular application I had, I didn't really care that it
was a different one, I just wanted a pink car NOW, fast. If it's the same
one as you gave me last time, in this particular application, I don't
care, so long as it isn't blue. I could have kept hold of the old pink
car, but that's too much work; sometimes I use green cars, sometimes red
cars, sometimes pink. This is a garbage collected system, so it's easier
for me to grab a new pink car when I need one than to keep track of my
old ones.

This form of memo-ization is just to skip the "new" when I already have
made one pink car, and just give exactly the same one as I made last
time.


Ok, now the little idiomatic bit.

Suppose @@memo is a brand spanking new empty Hash object.

What does @@memo["pink"] return?

By default it returns nil.

But you can make it return whatever you damn well feel like.

@@memo = Hash.new{|hash,key| hash[key] = Car.new(key)}

@@memo["pink"]

Will say: oo, I have no "pink" element, I'd better go off and make one
then; fortunately I was created with a block, so I know how to do that.
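
Concretely (a toy sketch; the cached value here is a String rather than a Car object, just to keep it self-contained):

```ruby
# Default block runs only on a miss, builds the value and stores it.
memo = Hash.new { |hash, key| hash[key] = "a #{key} car" }

first  = memo["pink"]   # miss: block runs, builds and stores the value
second = memo["pink"]   # hit: the stored value comes straight back

first.equal?(second)    # => true, only one object was ever built
memo.size               # => 1
```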

ps.

If everybody in your country had a pink car, what would you have?

Answer:

A Pink Carnation.

Ok, so using pink cars was a "blooming" silly example... :-))


John Carter                             Phone : (64)(3) 358 6639
Tait Electronics                        Fax   : (64)(3) 359 4632
PO Box 1645 Christchurch                Email : john.carter@tait.co.nz
New Zealand

Carter's Clarification of Murphy's Law.

"Things only ever go right so that they may go more spectacularly wrong
later."

From this principle, all of life and physics may be deduced.
Dave Howell (Guest)
on 2006-01-26 22:20
(Received via mailing list)
On Jan 26, 2006, at 12:59, John Carter wrote:

> @@memo = Hash.new{|hash,key| hash[key] = Car.new(key)}
>
> @@memo["pink"]
>
> Will say, oo, I have no "pink" element,  I better go off and make one
> then, fortunately I have a block on my "new" so I know how to do that.
>
> ps.
>
> If everybody in your country had a pink car, what would you have?

Ha ha. Except that if everybody in my country had a
Car.create_car("pink") car, then the entire country would have exactly
one car. I can't figure out what it's good for; any example I can think
of in my head is just a complicated way of doing something simple.
Either "well, I'll just keep the object around in the first place," or
"no, I would have to have my very OWN pink car, not be FlexCar-ing it
with other parts of the program"
John Carter (Guest)
on 2006-01-27 00:09
(Received via mailing list)
On Fri, 27 Jan 2006, Dave Howell wrote:

>> If everybody in your country had a pink car, what would you have?
>
> Ha ha. Except that if everybody in my country had a Car.create_car("pink")
> car, then the entire country would have exactly one car.

Exactly. Which is why it's a lousy example, except in the sense it
clearly alerts you to the deficiency of the scheme.

Where I used the trick most recently was in parsing .c and .h files for
various useful info I needed to process the header files #include'd
as well.

ie. #include "pink_car.h"

Initially my code was just...

   subresult = ParseFile.new( "pink_car.h")
but then I noticed I was reparsing common header files far more often
than I needed to. Hence the memoization trick. A very simple two-line
change and off I go again with a good burst of speed.

Of course after parsing all the files I need to remember to @@memo = nil
to actually release the memory...

Why use a class variable for this? It's Good Encapsulation. Who should
know whether you have _ever_ produced a pink car before? All the pink
car
users? Or the car factory? Answer, the Car Factory, ie. the Car class.


John Carter                             Phone : (64)(3) 358 6639
Tait Electronics                        Fax   : (64)(3) 359 4632
PO Box 1645 Christchurch                Email : john.carter@tait.co.nz
New Zealand

Carter's Clarification of Murphy's Law.

"Things only ever go right so that they may go more spectacularly wrong
later."

From this principle, all of life and physics may be deduced.
Logan Capaldo (Guest)
on 2006-01-27 02:28
(Received via mailing list)
On Jan 26, 2006, at 4:18 PM, Dave Howell wrote:

>>
>> ps.
>
>

Often memoization is used to speed up functions that are
referentially transparent

( Referentially transparent means that given a function f, a == b
implies that f(a) == f(b). In other words, if you call the function
with the same argument more than once, you'll always get back the
same answer. def f(x); x + 1; end is referentially transparent. Well,
unless someone has been playing with Integer#+. IO#gets, for instance,
is not.)

when they take a long time to process. One example is computing the
fibonacci sequence (an example just as silly as the car). If you use
memoization, all of a sudden the naive recursive implementation isn't
nearly as bad.

since f(0) = f(1) = 1
     and f(x) = f(x - 1) + f(x - 2)
you'll notice that f(x) = f(x - 2) + f(x - 3) + f(x - 2). Since we
are memoizing, it only computes f(x - 2) once; the second time it
needs f(x - 2) it's an O(1) hash lookup, and so on. Normally, of
course, recursive fibonacci is O(2**n).
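
A hand-rolled version of that memoized fibonacci, using the Hash default-block trick from earlier in the thread (seeded with 0, 1 to match the fib code posted later in this topic):

```ruby
# The default block computes each value on first request and caches it,
# so every fib(n) is computed exactly once.
FIB = Hash.new { |memo, n| memo[n] = n < 2 ? n : memo[n - 1] + memo[n - 2] }

FIB[35]   # => 9227465, via ~35 additions instead of O(2**n) calls
```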
Henry Maddocks (Guest)
on 2006-01-27 10:54
(Received via mailing list)
Ruby Hashes, Blocks & Memoization

http://opengl.geek.nz/ruby.html
Alex Combas (Guest)
on 2006-01-28 01:22
(Received via mailing list)
Hello,
I would really like to learn more about memoization
but unfortunately the gem doesn't seem to install properly on
my system (Ubuntu Linux) and I was told that in order
to get downloaded gems to work I would probably have
to uninstall the apt versions of ruby and rebuild everything
from source.

I don't really want to do that, so I was wondering if
the memoization library is small enough that I could
just put a copy into my path and include it that way,
rather than as an installed gem.

Does that sound like a sane alternative, or is my only
option to rip the ruby debs out of my system and
start from scratch?
James Britt (Guest)
on 2006-01-28 01:37
(Received via mailing list)
Alex Combas wrote:
> Hello,
> I would really like to learn more about memoization
> but unfortunately the gem doesn't seem to install properly on
> my system (Ubuntu Linux) and I was told that in order
> to get downloaded gems to work I would probably have
> to uninstall the apt versions of ruby and rebuild everything
> from source.

Do you know this to be an Ubuntu issue, or just a problem you recently
noticed?  There have been, I believe, some problems with the gem servers
the last day or so, and it may be worth taking another crack at gem
installation.


James
James Gray (bbazzarrakk)
on 2006-01-28 02:13
(Received via mailing list)
On Jan 27, 2006, at 6:19 PM, Alex Combas wrote:

> I don't really want to do that, so I was wondering if
> the memoization library is small enough that I could
> just put a copy into my path and include it that way,
> rather than as an installed gem.

You bet.  It's a single file.  Just download the source from
RubyForge and move /lib/memoize.rb into your path.  It will work just
fine.

James Edward Gray II
Alex Combas (Guest)
on 2006-01-28 06:05
(Received via mailing list)
On 1/27/06, James Edward Gray II <james@grayproductions.net> wrote:
> You bet.  It's a single file.  Just download the source from
> RubyForge and move /lib/memoize.rb into your path.  It will work just
> fine.
>

Thanks James!
Wow, don't want to fanboy here, but wow.

$cat fib.rb
def fib(n)
  return n if n < 2
  fib(n-1) + fib(n-2)
end
puts fib(35)

$ time ruby fib.rb
9227465

real    0m41.121s
user    0m37.128s
sys     0m3.854s

$cat mem_fib.rb
require "memoize"
include Memoize
def fib(n)
  return n if n < 2
  fib(n-1) + fib(n-2)
end
memoize(:fib)
puts fib(35)

$ time ruby mem_fib.rb
9227465

real    0m0.011s
user    0m0.008s
sys     0m0.003s

Yikes! This is definitely something I'm going to need to learn to use,
and use as often as possible. So anywho, I took a quick peek at
memoize.rb and was again just blown away, but this time because the
entire file was only 39 lines of code including white space! How can
this be?

Unfortunately, there are no comments to be had, so I was wondering
if someone would be so kind as to reply with inserted comments in the
code (yes, I know ruby is supposed to be readable, but I'm new, ok? :-)
Just as a precaution so that I don't make any weird assumptions,
something I'm prone to do. This would be greatly appreciated.

Bored? Nothing to do on a Friday night?
Well here is some source. Comment away all you bored rubyists!

module Memoize
   MEMOIZE_VERSION = "1.2.0"

   # Memoize the method +name+.  If +file+ is provided, then the method
   # results are stored on disk rather than in memory.  This consumes
   # virtually no memory and is persistent.
   def memoize(name, file=nil)
      meth = method(name)

      if file
         cache = Hash.new.update(Marshal.load(File.read(file))) rescue {}
      else
         cache = {}
      end

      if file
         (class << self; self; end).class_eval do
            define_method(name) do |*args|
               unless cache.has_key?(args)
                  cache[args] = meth.call(*args)
                  File.open(file, "wb+"){|f| Marshal.dump(cache, f) }
               end
               cache[args]
            end
         end
      else
         (class << self; self; end).class_eval do
            define_method(name) do |*args|
               if cache.has_key?(args)
                  cache[args]
               else
                  cache[args] ||= meth.call(*args)
               end
            end
         end
      end
      cache
   end
end
James Britt (Guest)
on 2006-01-28 07:41
(Received via mailing list)
Alex Combas wrote:
>...


> Yikes! This is definitely something I'm going to need to learn to use, and use
> as often as possible. So anywho, I took a quick peek at memoize.rb and was
> again just blown away, but this time because the entire file was only 39 lines
> of code including white space! How can this be?
>
> Unfortunately, there are no comments to be had, so I was wondering
> if someone would be so kind as to reply with inserted comments in the code
> (yes, I know ruby is supposed to be readable, but I'm new, ok? :-)
> Just as a precaution so that I don't make any weird assumptions,
> something I'm prone to do. This would be greatly appreciated.

Basically, the arguments to the target method are used as a hash key
into a cache. The cache hash is stored either in memory, or (optionally)
on disk.

If you give memoize a file name it will serialize the results cache to
and from disk using Ruby's Marshal code.

Otherwise, the cache just sits in memory.

The code defines a singleton method of the same name as the target
method. Calls to the target method are passed to this singleton method;
it has first crack at looking to see if there is a cached value.

If it has the value, then it is returned; all done.

Otherwise, the value is obtained by calling the original method defined
in the class source code, the results are cached (and the file updated,
if defined) and the value returned.

The neat part is the use of a singleton method to intercept calls to the
target method:

class Foo

   include Memoize

   def initialize
     memoize :baz
   end

   def baz val
     "Baz sez #{val}"
   end

end

When you call

  foo = Foo.new
  foo.baz( 'Hey.' )

the first baz invoked is the singleton method defined on foo.  If it
does not have a cached value, it calls the baz method defined in the Foo
class.

  puts foo.singleton_methods # baz


(But note that an in-memory hash will exist for each instance.  A file
allows for all instances to share values, and to have those values
persist across runs of the application, but may introduce race
conditions.)

--
James Britt

"Blanket statements are over-rated"
James Gray (bbazzarrakk)
on 2006-01-28 17:21
(Received via mailing list)
On Jan 28, 2006, at 12:39 AM, James Britt wrote:

> Basically, the arguments to the target method are used as a hash
> key into a cache. The cache hash is stored either in memory, or
> (optionally) on disk.

It's actually always in memory with the current implementation, and
can optionally be duplicated on disk.

> (But note that an in-memory hash will exist for each instance.  A
> file allows for all instances to share values, and to have those
> values persist across runs of the application, but may introduce race
> conditions.)

The current file support doesn't really get you all the way to
sharing a cache among objects, since it's only read from at the time
the object is converted to use memoization.

The current file implementation is only for persistence, in my
opinion, and really for just one process at that.

See the "Rethinking Memoization" for more details on all of this...

James Edward Gray II
James Gray (bbazzarrakk)
on 2006-01-28 17:30
(Received via mailing list)
On Jan 27, 2006, at 11:04 PM, Alex Combas wrote:

> module Memoize
>    MEMOIZE_VERSION = "1.2.0"

Set a constant to the module version, so user code can check it as
needed.

>
>    # Memoize the method +name+.  If +file+ is provided, then the method
>    # results are stored on disk rather than in memory.  This consumes
>    # virtually no memory and is persistent.
>    def memoize(name, file=nil)
>       meth = method(name)

Grab a reference to the original method.

>       if file
>          cache = Hash.new.update(Marshal.load(File.read(file))) rescue {}
>       else
>          cache = {}
>       end

Load an existing cache file, if requested and one exists.  Otherwise,
set the cache to an empty Hash.

>       if file
>          (class << self; self; end).class_eval do
>             define_method(name) do |*args|
>                unless cache.has_key?(args)
>                   cache[args] = meth.call(*args)
>                   File.open(file, "wb+"){|f| Marshal.dump(cache, f) }
>                end
>                cache[args]
>             end
>          end

This is the singleton class trick James Britt described.  In this
version, the cache is checked for an entry with all the arguments it
was just called with.  If it's not found, the original method is
triggered to add that entry to the cache.  The file is also modified
to contain the new entry.  Either way, the cache now has the needed
entry, which is returned.

>       end
Exact same thing, minus the file dump.  In fact, that whole if/else/
end chunk of code could be simplified to:

          (class << self; self; end).class_eval do
             define_method(name) do |*args|
                unless cache.has_key?(args)
                   cache[args] = meth.call(*args)
                   File.open(file, "wb+"){|f| Marshal.dump(cache, f) } if file
                end
                cache[args]
             end
          end

>       cache

Return the cache we will use for user code to examine/modify, if
desired.

>    end

Hope that helps.

James Edward Gray II
James Britt (Guest)
on 2006-01-28 17:36
(Received via mailing list)
James Edward Gray II wrote:
> On Jan 28, 2006, at 12:39 AM, James Britt wrote:
>
>> Basically, the arguments to the target method are used as a hash  key
>> into a cache. The cache hash is stored either in memory, or
>> (optionally) on disk.
>
>
> It's actually always in memory with the current implementation, and  can
> optionally be duplicated on disk.

Ah, I missed that. Thanks.

Really nice lib, and the use of singleton methods suggests other ideas.

Thanks,

James Britt