`finalize' method?

dasch · November 25, 2005, 2:21pm

Yeah, it’s me again, your friendly neighbourhood power-suggester!

I’m wondering why there isn’t a `finalize’ method classes can define, that will be called immediately before instances of that class are garbage collected? I'm aware of ObjectSpace.define_finalizer, but it seems un-OO, and it doesn't seem to work when being called from within the instance methods. It also seems as if the Proc sent to `define_finalizer’ isn’t called until after the object has been destroyed.

class Klass
  def finalize
    # finalize something
  end
end

A use case (the best I can come up with at the moment):

class Contact
  def initialize(filename)
    @filename = filename
    parse_xml_file(@filename)
  end

  def finalize
    save_as_xml(@filename)
  end
end

Cheers,
Daniel
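
On the point that ObjectSpace.define_finalizer does not seem to work from inside instance methods: a likely cause is that a proc created in an instance method captures `self`, so the object stays reachable and is never collected. A minimal sketch, assuming a hypothetical `Contact.finalizer_for` helper and a log array standing in for `save_as_xml`, builds the proc in a class method so that it holds only the data it needs:

```ruby
class Contact
  # Hypothetical helper: builds the finalizer in class scope, so the proc
  # closes over `filename` and `log` only, never over the instance.
  def self.finalizer_for(filename, log)
    proc { |object_id| log << "saving #{filename} (object #{object_id})" }
  end

  def initialize(filename, log = [])
    @filename = filename
    # The proc is later called with the object's id, not the object itself.
    ObjectSpace.define_finalizer(self, self.class.finalizer_for(filename, log))
  end
end

log = []
Contact.new("contacts.xml", log)

# When (or whether) the garbage collector runs the finalizer is not
# predictable, so this only shows what the proc does when it is invoked.
Contact.finalizer_for("contacts.xml", log).call(42)
puts log.last
```

Because the finalizer only ever sees an object id, anything it needs (here the file name) has to be handed to it up front.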

dasch · November 25, 2005, 2:25pm

dasch wrote:

Yeah, it’s me again, your friendly neighbourhood power-suggester!

I’m wondering why there isn’t a `finalize’ method classes can define, that will be called immediately before instances of that class are garbage collected? I'm aware of ObjectSpace.define_finalizer, but it seems un-OO, and it doesn't seem to work when being called from within the instance methods. It also seems as if the Proc sent to `define_finalizer’ isn’t called until after the object has been destroyed.

That’s correct, it prevents problems like the following code:

class Klass
  def finalize
    $global_reference = self
  end
end

dasch · November 25, 2005, 2:49pm

Jim W. wrote:

That’s correct, it prevents problems like the following code:

class Klass
  def finalize
    $global_reference = self
  end
end

In that case I’d expect $global_reference to be nil after the `finalize’
method is done executing, since the Klass instance has already been
deemed ready for destruction.

Cheers,
Daniel

dasch · November 25, 2005, 4:30pm

Jim W. wrote:

references may be created deep inside other methods invoked by finalize.
proc { name }
end
end

k = NamedObject.new("Billy")
p = k.finalize # Here p is a proc that has a binding that includes
               # a "finalized" object.

The point is that the current mechanism avoids all these questions by
only calling the finalizer /after/ the object has been destroyed.

I just think it’s weird that PHP has destructors and Ruby doesn’t

PHP: Constructors and Destructors - Manual

I’m not really into C, neither am I familiar with the Ruby
implementation, but don’t all variables just reference an object? So
somewhere, an object is stored, and each variable referencing it really
only holds that object’s id. I imagine the `finalize’ method could be
run, and afterwards, the object could be replaced with nil, maybe
retaining the old object id. That way any variables that point to the
object will be nil.

Now, I don’t know if that’s how it works, but it sounds logical to me,
speaking as a guy who’s never taken any programming lessons or
programmed/scripted using other languages than PHP, JavaScript and
Ruby…

Cheers,
Daniel

dasch · November 25, 2005, 4:02pm

dasch wrote:

In that case I’d expect $global_reference to be nil after the `finalize’
method is done executing, since the Klass instance has already been
deemed ready for destruction.

“after the ‘finalize’ method is done” => Tricky.

How would this be implemented? Seems to me Ruby would have to scan all
of the current image to determine if any new references were created
during the execution of the finalize method. Especially since the
references may be created deep inside other methods invoked by finalize.

And it might be more subtle than just checking for variable bindings.
Consider:

class NamedObject
  attr_reader :name

  def initialize(name)
    @name = name
  end

  def finalize
    proc { name }
  end
end

k = NamedObject.new("Billy")
p = k.finalize # Here p is a proc that has a binding that includes
               # a "finalized" object.

The point is that the current mechanism avoids all these questions by
only calling the finalizer /after/ the object has been destroyed.
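
The `proc { name }` case can be checked directly. In this sketch (Ruby 2.2 or later is assumed for `Binding#receiver`, and the variable name `callback` is just for illustration), the proc's binding holds the very object that `finalize` was called on, so the object stays reachable for as long as the proc does:

```ruby
class NamedObject
  attr_reader :name

  def initialize(name)
    @name = name
  end

  def finalize
    proc { name } # the block closes over self
  end
end

k = NamedObject.new("Billy")
callback = k.finalize

# The proc keeps a reference to k through its binding.
puts callback.binding.receiver.equal?(k)
puts callback.call
```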

dasch · November 25, 2005, 4:38pm

On Sat, 26 Nov 2005, Daniel S. wrote:

I just think it’s weird that PHP has destructors and Ruby doesn’t

PHP: Constructors and Destructors - Manual

I find this tricky too: the only model I can think of is that where
you need destructors, your constructor destructs, and you get at
the object with a block, in the style of open("file"){|f| …}; if
you see what I mean.

I’m not really into C, neither am I familiar with the Ruby implementation, but
don’t all variables just reference an object? So somewhere, an object is
stored, and each variable referencing it really only holds that object’s id. I
imagine the `finalize’ method could be run, and afterwards, the object could
be replaced with nil, maybe retaining the old object id. That way any
variables that point to the object will be nil.

But there is only one nil in the system, so it can’t have many ids.

Now, I don’t know if that’s how it works, but it sounds logical to me,
speaking as a guy who’s never taken any programming lessons or
programmed/scripted using other languages than PHP, JavaScript and Ruby…

Cheers,
Daniel

    Hugh

dasch · November 25, 2005, 5:11pm

Hugh S. wrote:

On Sat, 26 Nov 2005, Daniel S. wrote:

I just think it’s weird that PHP has destructors and Ruby doesn’t

PHP: Constructors and Destructors - Manual

I find this tricky too: the only model I can think of is that where
you need destructors, your constructor destructs, and you get at
the object with a block, in the style of open("file"){|f| …}; if
you see what I mean.

No, not really. I really can’t stress enough that I’m a noob…

I’m not really into C, neither am I familiar with the Ruby implementation, but
don’t all variables just reference an object? So somewhere, an object is
stored, and each variable referencing it really only holds that object’s id. I
imagine the `finalize’ method could be run, and afterwards, the object could
be replaced with nil, maybe retaining the old object id. That way any
variables that point to the object will be nil.

But there is only one nil in the system, so it can’t have many ids.

Could the object then be replaced by a reference to the nil object?
Again, I’m talking theoretically…

Cheers,
Daniel

dasch · November 25, 2005, 5:15pm

On 11/25/05, Daniel S. wrote:

the object with a block, in the style of open(“file”){|f| …}; if
object’s id. I

Cheers,
Daniel

Matz has also said that define_finalizer was put in ObjectSpace on
purpose, to discourage the use of finalizers. Finalizers are kind of
weird; I would recommend adding a clean-up method to any object you
think “needs” a finalizer. At least then the object still exists and is
in a known state. Try not to think of them as destructors (in the C++
sense), because they aren’t.

dasch · November 25, 2005, 5:31pm

On Sat, 26 Nov 2005, Daniel S. wrote:

you see what I mean.

No, not really. I really can’t stress enough that I’m a noob…

The braces, curly brackets, are a block. The CS people will tell
you it is a closure. What it means is: In the current scope (local
variables defined, etc) execute open(“file”) to open the file for
reading, and then [when inside open it effectively calls yield] you
will have a parameter which we’ll call f, and we’ll do something
with it in the local scope until the closing brace of the block. On
the closing brace, open will close the file for us. The oft’ used
each method behaves in a similar way.
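
The same shape works for user-defined classes. A minimal sketch, with a hypothetical `Contact.open` class method and an in-memory flag standing in for the XML saving from the first post:

```ruby
class Contact
  attr_reader :saved

  # Hypothetical block form, modelled on File.open: yield the object, then
  # clean up in `ensure` so the save happens even if the block raises.
  def self.open(filename)
    contact = new(filename)
    begin
      yield contact
    ensure
      contact.save
    end
  end

  def initialize(filename)
    @filename = filename
    @saved = false
  end

  def save
    @saved = true # stand-in for save_as_xml(@filename)
  end
end

kept = nil
Contact.open("contacts.xml") { |c| kept = c }
puts kept.saved
```

The clean-up runs at a point the caller can see (the end of the block) rather than whenever the garbage collector gets around to it.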

For an intro:
http://www.rubycentral.com/book/tut_containers.html#S2

For info about using them with constructors:
http://www.rubycentral.com/articles/insteval.html

    [...]

imagine the `finalize’ method could be run, and afterwards, the object could
be replaced with nil, maybe retaining the old object id. That way any
variables that point to the object will be nil.

But there is only one nil in the system, so it can’t have many ids.

Could the object then be replaced by a reference to the nil object? Again, I’m
talking theoretically…

Ruby doesn’t have references, otherwise it could.

Cheers,
Daniel

    Hugh

dasch · November 25, 2005, 5:31pm

Jim W. writes:

That’s correct, it prevents problems like the following code:

class Klass
  def finalize
    $global_reference = self
  end
end

What’s the problem if this method ran on the first GC once and then
the object just stayed around? I.e. have a three-pass GC, mark, run,
sweep?

dasch · November 25, 2005, 5:51pm

On Sat, 26 Nov 2005, Daniel S. wrote:

Oh, sorry, I misunderstood you: You mean something like this:

class Klass
  def initialize
    # setup usual stuff, then:
    yield self
    finalize
  end

  def finalize
    # do something
    # like releasing resources, yes.
  end
end

Sorry, I’m kinda groggy

Not that groggy, that’s exactly what I meant.

Cheers,
Daniel

    Hugh

dasch · November 25, 2005, 5:51pm

Hugh S. wrote:

the object with a block, in the style of open(“file”){|f| …}; if
each method behaves in a similar way.
Oh, sorry, I misunderstood you: You mean something like this:

class Klass
  def initialize
    yield self
    finalize
  end

  def finalize
    # do something
  end
end

Sorry, I’m kinda groggy

Cheers,
Daniel

dasch · November 25, 2005, 6:00pm

chneukirchen wrote:

What’s the problem if this method ran on the first GC once and then
the object just stayed around? I.e. have a three-pass GC, mark, run,
sweep?

There are two problems:

(1) Detecting that the finalizer has created a new reference to an object
scheduled for collection is problematic. I’m thinking this would
require a new mark pass to detect.

(2) What is the state of an object that has been finalized, but sticks
around afterwards? Is it valid to call methods on it? After all,
supposedly all of its resources have been finalized at this point. And
when it becomes eligible for collection again, does the finalizer need
to run again or not?

Most languages that allow finalizers explicitly state that creating new
references during finalization is not allowed and depend upon the
programmer to follow that rule. Ruby handles it by making it impossible
for the programmer to break that rule (i.e. the finalizer doesn’t get a
reference to the object, making it difficult to create a reference to
it).

– Jim W.

dasch · November 25, 2005, 8:01pm

method is done executing, since the Klass instance has already been
deemed ready for destruction.

This would force the interpreter to keep a back reference for all
objects to variables which keep references to them, not very efficient
in regard of cpu and memory. Java solves this problem by reviving those
objects, which seems at least for me very odd.

So generally you should avoid finalizers, because most of the time you
are only interested in running some code before the application exits.

This should help in most cases:

BEGIN { $exit_listeners = [] }

def add_exit_listener(object)
  $exit_listeners << object
end

END { $exit_listeners.each {|listener| listener.on_exit } }

Alternatively you may declare an END block inside your initialize
method.

dasch · November 25, 2005, 7:20pm

Jim W. writes:

chneukirchen wrote:

What’s the problem if this method ran on the first GC once and then
the object just stayed around? I.e. have a three-pass GC, mark, run,
sweep?

There are two problems:

(1) Detecting that the finalizer has created a new reference to an object
scheduled for collection is problematic. I’m thinking this would
require a new mark pass to detect.

No, this is not required by my scheme. Objects which have a finalize
method will require two sweeps to actually get removed (that’s not that
bad, the second one will follow soon if the first one wasn’t successful).
You need to keep track of the objects whose finalizer has been run,
yeah. I’m sure we can use a bit left over somewhere for that…

(2) What is the state of an object that has been finalized, but sticks
around afterwards? Is it valid to call methods on it? After all,
supposedly all of its resources have been finalized at this point. And
when it becomes eligible for collection again, does the finalizer need
to run again or not?

It is unreclaimed. Yes. Yes. No. Yes.
You can keep track of disposal using an instance variable.
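
Tracking disposal with an instance variable might look like the following sketch. The class name `Handle`, the `@disposed` flag, and the `use` method are assumptions for illustration; `finalize` is an ordinary method here, called explicitly rather than by the garbage collector:

```ruby
class Handle
  attr_reader :finalize_count

  def initialize
    @disposed = false
    @finalize_count = 0
  end

  def disposed?
    @disposed
  end

  # Safe to call more than once: the clean-up body runs only the first time.
  def finalize
    return if @disposed
    @disposed = true
    @finalize_count += 1
  end

  # Methods that need live resources can check the flag and refuse to run.
  def use
    raise "already finalized" if @disposed
    :ok
  end
end

h = Handle.new
h.use
h.finalize
h.finalize
puts h.finalize_count
```

Whether methods should refuse to run after disposal, as `use` does here, is a design choice; the flag only makes the state visible.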

Most languages that allow finalizers explicitly state that creating new
references during finalization is not allowed and depend upon the
programmer to follow that rule. Ruby handles it by making it impossible
for the programmer to break that rule (i.e. the finalizer doesn’t get a
reference to the object, making it difficult to create a reference to
it).

That is a valid option, but I think Ruby can do better at little cost.

dasch · November 25, 2005, 10:18pm

Hugh S. wrote:

Daniel

    Hugh

What would you call such an approach? Encapsulating? I use this often
enough that I think it would fit lovely in a module:

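module Encapsulatable
  def self.included(klass)
    super
    klass.module_eval do
      def self.new(*args)
        # Call the original Class#new via super; a bare new(*args) here
        # would call this same method again and recurse forever.
        obj = super(*args)
        yield obj
        obj.finalize if obj.respond_to? :finalize
      end
    end
  end
end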

Cheers,
Daniel

dasch · November 26, 2005, 1:25pm

“D” == Daniel S. writes:

D> What would you call such an approach? Encapsulating? I use this often
D> enough that I think it would fit lovely in a module:

A bad approach, at least for me.

D> def self.new(*args)
D> yield obj = new(*args)
D> obj.finalize if obj.respond_to? :finalize
D> end

There is a convention in ruby: the method ::new must return an object
of the class. You must change the name of the method if it returns nil,
or another object.

Guy Decoux

dasch · November 26, 2005, 2:33pm

ts wrote:

D> end

There is a convention in ruby : the method ::new must return an object of
the class. You must change the name of the method if it returns nil, or
another object.

Roger. That’s not that hard to do:

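module Encapsulatable
  def self.included(klass)
    super
    # metaclass_eval is not part of core Ruby; singleton_class.class_eval
    # is the built-in way to evaluate a block in the class's metaclass.
    klass.singleton_class.class_eval do
      private :new

      def encapsulate(*args)
        yield obj = new(*args)
        obj.finalize if obj.respond_to? :finalize
      end
    end
  end
end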

Note that this doesn’t prevent the user from calling methods on the
object after its `finalize’ method has been called.

class Klass
  include Encapsulatable

  def initialize
    puts "initializing..."
  end

  def foo
    puts "called method `foo'"
  end

  def finalize
    puts "finalizing..."
  end
end

ref = nil
Klass.encapsulate { |obj| obj.foo; ref = obj }
ref.foo

But then again, this ain’t Java. If you want to, you could wreak havoc
in seconds. But we’re all friends, right?

Cheers,
Daniel

dasch · November 30, 2005, 4:26pm

On 11/30/05, Peter C. Verhage wrote:

Hi,

How about making self a WeakRef in the finalize method?
http://www.ruby-doc.org/stdlib/libdoc/weakref/rdoc/

Or just say in the docs that any references to self will become
invalid after finalize is called. I don’t think doing this is any
less safe than using #object_id and ObjectSpace::_id2ref (same type of
error if you use this on an object_id already GCed). Right now if you
are using ObjectSpace::define_finalizer, you are likely using
ObjectSpace::_id2ref somewhere (you probably have some data-structure
holding object_id’s).

Maybe this method could be called #_finalize to say that it isn’t safe
(like ObjectSpace::_id2ref).

dasch · November 30, 2005, 3:45pm

Hi,

How about making self a WeakRef in the finalize method?
http://www.ruby-doc.org/stdlib/libdoc/weakref/rdoc/

Regards,

Peter
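
For reference, this is roughly how the standard library's WeakRef behaves. How it would be wired into a `finalize' method is left open, since the thread does not settle that; the sketch only shows the part that is deterministic, namely that the weak reference delegates to the object while a strong reference still exists. The class `Klass` and its `foo` method are borrowed from the earlier example.

```ruby
require "weakref"

class Klass
  def foo
    "called method `foo'"
  end
end

obj = Klass.new
weak = WeakRef.new(obj)

# While `obj` is still strongly referenced, the WeakRef delegates to it.
puts weak.weakref_alive?
puts weak.foo

# Once the object has been collected, calls through the WeakRef raise
# WeakRef::RefError; when that happens depends on the garbage collector.
```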