Fun with finalizers

Garthy_D · February 16, 2013, 7:39am

Hi all,

Just ran into something very interesting with finalizers. I’ve found a
workaround (it’ll be obvious what it is from the code below), but I just
thought I’d share it for discussion’s sake.

Consider the code below:

$fcount = 0

class A
  def initialize
  end
end

class B
  def initialize
  end

  # Attach a finalizer to a, then drop this method's own reference to it.
  def bar(a)
    ObjectSpace.define_finalizer(a, lambda { |oid| $fcount += 1 })
    a = nil # xxx
    nil
  end

  def foo
    a = A.new
    bar a
    nil
  end
end

b = B.new
for i in 1 ... 10000
  GC.start
  b.foo
  GC.start
end
$stderr.print "Program ends. #{$fcount} finalizers called.\n"

All but one of the finalizers run at the point of the trace.

Now, comment out the line marked with xxx. This shouldn’t make any
difference, but it does. The program will report that 0 finalizers ran
at the point of the trace. You can confirm that the rest did run, but
only at program exit, after the trace. In effect, the resources are
never released while the program is running. This is a big problem in
a long-running program.

If 10000 iterations isn’t enough, you can always increase the counter.

Note that I am using Ruby 1.9.2p136, Linux. Other versions may behave
differently.

Why the code above? I was noticing in my code that finalizers were
never being run under any circumstances. The above is a stripped-down
set of code that acts similarly to mine.

From a bit of research online, I’ve seen comments that say sometimes
values are left in registers, which affects GC. That seems fair enough
in general, but not here. There are 10000 objects here that aren’t being
finalized, and they’re not all in registers. If it’s the stack, then the
first run should also have failed. It’s not the return value, which is
nil. It’s not the parameter coming in, or the first test would have failed.

Is it the current scope? Based on the one-line change I made, I have a
feeling that the current scope is somehow being captured by the
finalizer, so that if “a” remains set, the finalizer holds on to it, and
the object is never released. That’s just my theory, and I could be wrong.
This situation is particularly bad if you want to set up a finalizer and
then immediately return the value (say, as a result of caching a value):
the finalizer will never be called, because you can’t clear the value
before returning it.

If you’ve followed me so far, you can probably guess the workaround-
call a separate method to set the finaliser, and clear the parameter
afterward in that call, then return to the caller with a nil return
value. It’s annoying, but not too painful.

What I am incredibly curious about is why this happens in the first
place- and why there doesn’t seem to be much talk of this specific problem
when using finalizers online. Finalizers failing to work when used in
the current scope without explicitly clearing the object afterward seems
like the sort of problem other people should be running into more often.

It’s bizarre. I’m wondering what everyone else thinks of it. Have I
missed something?

Garth

Garthy_D · February 16, 2013, 11:00am

Quick guess, it’s the lambda. Replace it with #proc and try again?

Garthy_D · February 16, 2013, 12:09pm

Hi Matthew,

Excellent thinking. I also thought it might be something along those
lines too. I tried various combinations as well: proc, Proc.new, I think
a return from method(), calls to a separate object; but there was no
impact on the result. If “a” isn’t cleared, the object is held. I’m
guessing that there might be some way to say to not touch a thing in the
current scope, but I’m not sure how to specify it.

I also adapted the main program based on my experience with the code
below, and suddenly the finalizers were called. So it’s the same type of
problem.

So there’s a problem, and it’s avoidable. I know the “what”, but don’t
know the “why”. There is some subtlety I’m missing. Most interesting.

Cheers,
Garth

Garthy_D · February 16, 2013, 3:31pm

On Sat, Feb 16, 2013 at 12:08 PM, Garthy D wrote:

> Excellent thinking. I also thought it might be something along those lines
> too.

To make it crystal clear: the reason is that there is a closure
involved. The closure will hold on to the object referenced by “a” on
method entry - unless, as you discovered, that reference is cleared.

Just in case you don’t know, here’s what a closure does:

irb(main):015:0> def f; x=0; lambda { x+=1 } end
=> nil
irb(main):016:0> g = f
=> #<Proc:0x802b7778@(irb):15 (lambda)>
irb(main):017:0> g.call
=> 1
irb(main):018:0> g.call
=> 2
irb(main):019:0> g.call
=> 3

The closure captures the current scope, i.e. all local variables.
This includes method arguments and “self” - and hence all member
variables of self as well:

irb(main):023:0> def f; @x = 0; lambda { @x += 1 } end
=> nil
irb(main):024:0> g = f
=> #<Proc:0x8029f934@(irb):23 (lambda)>
irb(main):025:0> g.call
=> 1
irb(main):026:0> g.call
=> 2
irb(main):027:0> g.call
=> 3
irb(main):028:0> g.call
=> 4

> I tried various combinations as well: proc, Proc.new, I think a return
> from method(), calls to a separate object; but there was no impact on the
> result. If “a” isn’t cleared, the object is held.

I would be surprised if there was. lambda and Proc both create a
closure. The differences between lambda and Proc are in a different area.

> I’m guessing that there might be some way to say to not touch a thing
> in the current scope, but I’m not sure how to specify it.

There is no way to exclude that variable from the closure - other than
not passing it. But that would be pointless here.

> I also adapted the main program based on my experience with the code below,
> and suddenly the finalizers were called. So it’s the same type of problem.
>
> So there’s a problem, and it’s avoidable. I know the “what”, but don’t know
> the “why”. There is some subtlety I’m missing. Most interesting.

Now you should know.

Kind regards

robert

Garthy_D · February 17, 2013, 4:11am

Hi Robert,

Thank you very much, yet again. An excellent and incredibly informative
response, as always.

The hole in my understanding (and what I had begun to suspect was the
case, and I think you have identified) was pretty much here:

> The closure captures the current scope, i.e. all local variables.

Before I had encountered the problem, my understanding was that the
closure would capture any referenced variables similarly to a
function/method call. I wasn’t 100% sure of the mechanics, but I
believed that it “just happened”. What I didn’t understand was that it
did this by holding on to the entire scope- and just that one scope. The
distinction had not become apparent to me as my understanding did not
clash with what was actually happening when finalizers were not
involved. However, the differences that arise once finalizers enter the
picture are actually very significant.

I think I did not pick this up earlier because most of the discussion
online regarding Ruby finalizers either glosses over this point or flat
out misses it. There are plenty of warnings about not implicitly
including the object being finalized in the finalizer (by, say,
expecting to be able to call a method on the finalized object), but I’m
not sure I’ve seen a mention of the closure capturing the current scope
and the need to be careful with the variables visible in it. However, a
common solution seems to be to use a method to return the finalizer proc
itself, and I’d missed the distinction that by doing it this way, the
proc is created in a different scope from the one that actually sets the
finalizer. Thus the finalizer never even sees the value being finalized,
avoiding the problem nicely.
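
A rough sketch of that common solution, under a few assumptions: Resource, finalizer_for, the handle value and the log array are all hypothetical names chosen for illustration, and only ObjectSpace.define_finalizer, Proc#binding and Binding#eval are real Ruby API. The proc is built inside a class method, so its captured scope holds only the arguments passed in, and its self is the class rather than the instance.

```ruby
class Resource
  # Hypothetical factory. It runs as a class method, so the only things
  # the lambda can capture are handle, log, and the class itself.
  def self.finalizer_for(handle, log)
    lambda { |_object_id| log << handle }
  end

  def initialize(handle, log)
    @handle = handle
    # The proc is created in finalizer_for's scope, not in this one,
    # so neither the instance nor any local in this method is captured.
    ObjectSpace.define_finalizer(self, self.class.finalizer_for(handle, log))
  end
end

log = []
finalizer = Resource.finalizer_for(:socket_7, log)

# Inspect what the closure can see: its self and its visible locals.
puts finalizer.binding.eval("self").inspect
puts finalizer.binding.eval("local_variables").inspect

finalizer.call(0)   # simulate the GC invoking the finalizer
puts log.inspect
```

Because the object is created and returned by ordinary code while the finalizer lives in its own scope, this also covers the caching case, where the value has to be returned and so cannot be cleared first.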

The first example you have given makes it completely clear what is
happening. Based on my previous understanding, I would be unsure of what
the output from “g.call” would be. My first two guesses would probably
have been an exception, or possibly one or zero, but at that point I’d
be questioning if my understanding was actually correct. Knowing what I
know now, the answer is obvious, even trivial. Note that if the example
didn’t use an integer, but an object, it would have fit in with my
previous understanding. Only by being an immediate value did the flaw in
my previous understanding become apparent.
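
To illustrate why the integer case is the telling one, here is a small sketch (counter_pair is a hypothetical helper, not something from the thread). Two lambdas created in the same scope share the same variable, so a rebinding done through one is visible through the other. If the closure had only copied the value 0, the second lambda would keep returning 0.

```ruby
# Hypothetical helper: returns two lambdas created in the same scope.
def counter_pair
  x = 0
  increment = lambda { x += 1 }  # rebinds the local variable x
  read      = lambda { x }       # reads the very same variable
  [increment, read]
end

increment, read = counter_pair
increment.call
increment.call
puts read.call   # prints 2
```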

Thank you for taking the time to put together yet another superb post for
the list. I am frequently in awe at the level of detailed knowledge you
have in some of the more complex mechanics in Ruby. I hope that people
encountering similar issues can also stumble across it, so that the post
ends up helping considerably more people than just myself.

Cheers,
Garth
