Forum: Ruby Fun with finalizers

Posted by Garthy D (Guest)
on 2013-02-16 07:39
(Received via mailing list)
Hi all,

Just ran into something very interesting with finalizers. I've found a
workaround (it'll be obvious what it is from the code below), but I just
thought I'd share it for discussion's sake.

Consider the code below:

$fcount = 0

class A
   def initialize
   end
end

class B

   def initialize
   end

   def bar a
     ObjectSpace.define_finalizer(a, lambda {|oid| $fcount += 1})
     a = nil # xxx
     nil
   end

   def foo

     a = A.new
     bar a
     nil
   end
end

b = B.new
for i in 1 .. 10000
   GC.start
   b.foo
   GC.start
end
$stderr.print "Program ends. #{$fcount} finalizers called.\n"

All but one of the finalizers run at the point of the trace.

Now, comment out the line marked with xxx. This shouldn't make any
difference- but it does. The program will report that 0 finalizers ran
at the point of the trace. You can confirm that the rest did run, but
they *only* ran at program exit, after the trace. Basically, the
resources are never released when finalizers are used. This is a big
problem in a long-running program.

If 10000 iterations isn't enough, you can always increase the counter.

Note that I am using Ruby 1.9.2p136, Linux. Other versions may behave
differently.

Why the code above? I was noticing in my code that finalizers were
*never* being run under any circumstances. The above is a stripped-down
set of code that acts similarly to mine.

From a bit of research online, I've seen comments that say sometimes
values are left in registers, which affects GC. That seems fair enough
in general- but not here. There are 10000 objects here that aren't being
finalized- they're not all in registers. If it's the stack, then the
first run should also have failed. It's not the return value, this is
nil. It's not the parameter coming in, the first test would have failed.

Is it the current scope? I have a feeling, based on the one line
change I made, that the current scope is somehow being captured by the
finaliser, so that if "a" remains set, the finaliser holds on to it, and
the object is never released. That's just my theory- I could be wrong.
This situation is particularly bad if you want to set up a finaliser and
then immediately return the value (say, as a result of caching a value)-
the finaliser will never be called, because you can't clear the value
before returning it.

If you've followed me so far, you can probably guess the workaround-
call a separate method to set the finaliser, and clear the parameter
afterward in that call, then return to the caller with a nil return
value. It's annoying, but not too painful.

What I am incredibly curious about is why this happens in the first
place- and why there doesn't seem to be much talk of this specific problem
when using finalizers online. Finalizers failing to work when used in
the current scope without explicitly clearing the object afterward seems
like the sort of problem other people should be running into more often.

It's bizarre. I'm wondering what everyone else thinks of it. Have I
missed something?

Garth
Posted by Matthew Kerwin (mattyk)
on 2013-02-16 11:00
(Received via mailing list)
Quick guess, it's the lambda. Replace it with #proc and try again?

Sent from my phone, so excuse the typos.
Posted by Garthy D (Guest)
on 2013-02-16 12:09
(Received via mailing list)
Hi Matthew,

Excellent thinking. I also thought it might be something along those
lines too. I tried various combinations as well: proc, Proc.new, I think
a return from method(), calls to a separate object; but there was no
impact on the result. If "a" isn't cleared, the object is held. I'm
guessing that there might be some way to say to not touch a thing in the
current scope, but I'm not sure *how* to specify it.

I also adapted the main program based on my experience with the code
below, and suddenly the finalizers were called. So it's the same type of
problem.

So there's a problem, and it's avoidable. I know the "what", but don't
know the "why". There is some subtlety I'm missing. Most interesting. :)

Cheers,
Garth
Posted by Robert Klemme (robert_k78)
on 2013-02-16 15:31
(Received via mailing list)
On Sat, Feb 16, 2013 at 12:08 PM, Garthy D
<garthy_lmkltybr@entropicsoftware.com> wrote:

> Excellent thinking. I also thought it might be something along those lines
> too.

To make it crystal clear: the reason is that there is a closure
involved. The closure will hold on to the object referenced by a on
method entry - unless, as you discovered, that reference is cleared.

Just in case and if you don't know, here's what a closure does:

irb(main):015:0> def f; x=0; lambda { x+=1 } end
=> nil
irb(main):016:0> g = f
=> #<Proc:0x802b7778@(irb):15 (lambda)>
irb(main):017:0> g.call
=> 1
irb(main):018:0> g.call
=> 2
irb(main):019:0> g.call
=> 3

The closure captures the current scope, i.e. all local variables.
This includes method arguments and "self" - and hence all member
variables of self as well:

irb(main):023:0> def f; @x = 0; lambda { @x += 1 } end
=> nil
irb(main):024:0> g = f
=> #<Proc:0x8029f934@(irb):23 (lambda)>
irb(main):025:0> g.call
=> 1
irb(main):026:0> g.call
=> 2
irb(main):027:0> g.call
=> 3
irb(main):028:0> g.call
=> 4
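
The capture applies even to local variables the block never mentions. A minimal sketch of that, assuming Ruby 2.1 or later for Binding#local_variable_get (the thread itself is on 1.9.2, where closure.binding.eval("payload") would be the rough equivalent); build_closure and payload are hypothetical names used only for illustration:

```ruby
# Hypothetical helper: the block below never refers to `payload`.
def build_closure
  payload = "still reachable"
  lambda { :unrelated }
end

closure = build_closure

# The proc's binding still exposes the enclosing method's local variable,
# so the object it refers to stays reachable for as long as the proc lives.
puts closure.binding.local_variable_defined?(:payload)  # prints true
puts closure.binding.local_variable_get(:payload)       # prints still reachable
```

This is the same mechanism that keeps "a" alive in the finalizer case: the lambda passed to define_finalizer never uses "a", but its binding does.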


> I tried various combinations as well: proc, Proc.new, I think a return
> from method(), calls to a separate object; but there was no impact on the
> result. If "a" isn't cleared, the object is held.

I would be surprised if there was.  lambda and Proc both create a
closure.  Differences between lambda and Proc are in a different area:
http://stackoverflow.com/questions/1740046/whats-t...
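
The link above is truncated, so the following is not a summary of that page, only a rough sketch of two commonly cited differences (how "return" behaves and how strictly arguments are checked). All method and variable names here are made up for the example:

```ruby
# `return` inside a lambda only leaves the lambda itself.
def return_from_lambda
  l = lambda { return :from_lambda }
  l.call
  :after_lambda
end

# `return` inside a plain proc leaves the method that created it.
def return_from_proc
  pr = proc { return :from_proc }
  pr.call
  :after_proc
end

puts return_from_lambda.inspect  # prints :after_lambda
puts return_from_proc.inspect    # prints :from_proc

# Plain procs are lenient about argument counts; lambdas are strict.
lenient = proc   { |x| x.inspect }
strict  = lambda { |x| x.inspect }

puts lenient.call                # missing argument becomes nil
begin
  strict.call
rescue ArgumentError => e
  puts "lambda raised #{e.class}"
end
```

Neither difference affects what the closure holds on to, which is why swapping lambda for proc did not change the finalizer behaviour.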

> I'm guessing that there
> might be some way to say to not touch a thing in the current scope, but I'm
> not sure *how* to specify it.

There is no way to exclude that variable from the closure - other than
not passing it. But that would be pointless here. :-)

> I also adapted the main program based on my experience with the code below,
> and suddenly the finalizers were called. So it's the same type of problem.
>
> So there's a problem, and it's avoidable. I know the "what", but don't know
> the "why". There is some subtlety I'm missing. Most interesting. :)

Now you should know.

Kind regards

robert
Posted by Garthy D (Guest)
on 2013-02-17 04:11
(Received via mailing list)
Hi Robert,

Thank you very much, yet again. An excellent and incredibly informative
response, as always.

The hole in my understanding (and what I had begun to suspect was the
case, and I think you have identified) was pretty much here:

 > The closure captures the current scope, i.e. all local variables.

Before I had encountered the problem, my understanding was that the
closure would capture any referenced variables similarly to a
function/method call. I wasn't 100% sure of the mechanics, but I
believed that it "just happened". What I didn't understand was that it
did this by holding on to the entire scope- and just that one scope. The
distinction had not become apparent to me as my understanding did not
clash with what was actually happening when finalizers were not
involved. However, the differences that arise once finalizers enter the
picture are actually very significant.

I think I did not pick this up earlier because most of the
discussion online regarding Ruby finalizers either glosses over this
point- or flat out misses it. There are plenty of mentions of
not implicitly including the object being finalized in the finalizer
(by, say, expecting to be able to call a method on the finalized
object), but I'm not sure I've seen a mention of capturing the current
scope and needing to be careful with the visible variables in it.
However, a common solution seems to be to use a method to return the
finaliser proc itself, and I'd missed the distinction that by doing it
this way, the proc is created in a different scope than the one that
actually sets the finalizer. Thus the finalizer never even sees
the value being finalized, avoiding the problem nicely.
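
A minimal sketch of that pattern, under the assumption that the finalizer only needs a small piece of data (here a label) rather than the object itself. TempResource, finalizer_for and released are hypothetical names, not anything from the original program, and Binding#receiver assumes Ruby 2.2 or later:

```ruby
class TempResource
  @released = []

  class << self
    attr_reader :released
  end

  # Build the proc in a class-level method. Its scope contains only
  # `label`, and its `self` is the class, never the instance being finalized.
  def self.finalizer_for(label)
    proc { |_object_id| released << label }
  end

  def initialize(label)
    @label = label
    ObjectSpace.define_finalizer(self, self.class.finalizer_for(label))
  end
end

finalizer = TempResource.finalizer_for("demo")
puts finalizer.binding.local_variable_defined?(:label)  # the only captured local
puts finalizer.binding.receiver.equal?(TempResource)    # self is the class

# The new object can be returned straight away; nothing in the finalizer's
# scope refers to it, so there is no variable that needs clearing.
resource = TempResource.new("cache-entry")
```

When the collector actually runs the finalizer still depends on the Ruby version and on what else references the object, so this only removes the closure as a source of the leak.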

The first example you have given makes it completely clear what is
happening. Based on my previous understanding, I would be unsure of what
the output from "g.call" would be. My first two guesses would probably
have been an exception, or possibly one or zero, but at that point I'd
be questioning if my understanding was actually correct. Knowing what I
know now, the answer is obvious, even trivial. Note that if the example
didn't use an integer, but an object, it would have fit in with my
previous understanding. Only by being an immediate value did the flaw in
my previous understanding become apparent.
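
To illustrate the point about variables rather than values being captured, here is a small sketch (make_counter is a made-up name) in which two lambdas created in the same scope see each other's changes to an integer local:

```ruby
# Both lambdas close over the same `count` variable, not a copy of its value.
def make_counter
  count = 0
  increment = lambda { count += 1 }
  read      = lambda { count }
  [increment, read]
end

increment, read = make_counter
increment.call
increment.call
puts read.call  # prints 2
```

If each lambda had captured a copy of the value, read.call would still return 0.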

Thank you for taking the time to put together yet another superb post for
the list. I am frequently in awe at the level of detailed knowledge you
have in some of the more complex mechanics in Ruby. I hope that people
encountering similar issues can also stumble across it, so that the post
ends up helping considerably more people than just myself.

Cheers,
Garth
