Ruby Forum Ruby-core > Virtual classes and 'real' classes -- why?

Posted by John Lam (CLR) (Guest)
on 28.03.2007 21:45
(Received via mailing list)
I was wondering if someone could help me understand why there's a 
parallel class hierarchy - there are the 'real' class objects, which hold 
onto instance methods, and the 'virtual' class objects, which hold onto 
class methods.

I can see how this design can make method lookup somewhat faster since 
it avoids having to do a test and branch operation before method 
dispatch.

But is there some other non-obvious corner case that drove this design? 
In other words, if there were an implementation that had a dictionary 
for instance methods and class methods on the same class object, would 
this break something?

Thanks,
-John
Posted by MenTaLguY (Guest)
on 28.03.2007 22:43
(Received via mailing list)
On Thu, 29 Mar 2007 04:44:16 +0900, "John Lam (CLR)" 
<jflam@microsoft.com> wrote:
> I was wondering if someone could help me understand why there's a parallel
> class hierarchy - there are the 'real' class objects, which hold onto
> instance methods, and the 'virtual' class objects, which hold onto class
> methods.

Every (non-value-typed) object in Ruby has an accompanying "virtual" 
class, permitting "one-off" ("singleton") methods to be attached to it. 
As calls to "class methods" are simply regular method calls that happen 
to have a class as the receiver, it was probably simplest not to make 
classes a special case in this regard.

These virtual classes are in fact visible to Ruby code, are most 
commonly referred to as "singleton classes" or (more descriptively) 
"eigenclasses", and can be extracted via a construct like:

 class Object
   def eigenclass
     class << self
       self
     end
   end
 end

(one would then call obj.eigenclass to get obj's eigenclass)
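
As a rough illustration of how that helper might be used (a minimal sketch only; the objects `first` and `second` and the method `shout` are made up for the example, and the symbol output assumes Ruby 1.9 or later):

```ruby
class Object
  # Same helper as above: returns the receiver's singleton class.
  def eigenclass
    class << self
      self
    end
  end
end

first  = Object.new
second = Object.new

# A "one-off" method attached to a single object only.
def first.shout
  "HELLO!"
end

p first.shout                                # => "HELLO!"
p first.eigenclass.instance_methods(false)   # => [:shout]
p second.respond_to?(:shout)                 # => false
p first.class                                # => Object
```

The method lives in the eigenclass of `first`, so `second` is unaffected and `first.class` still reports Object.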

There is one slight difference in method lookup between class objects 
and other objects, though.  While for most objects, the search order 
could be obtained via:

 class Object
   def method_search_order
     [ self.eigenclass ] + self.class.ancestors
   end
 end

...classes also include their ancestor classes' "eigenclasses" in the 
search:

 class Class
   def method_search_order
     ancestors.select { |m| Class === m }.map { |c| c.eigenclass } +
       self.class.ancestors
   end
 end
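
One way this might be observed from Ruby code (a sketch, assuming Ruby 1.9 or later, where the eigenclasses of classes are linked in their own superclass chain; `Base`, `Derived` and `create` are made-up names):

```ruby
class Base
  def self.create
    "made by #{name}"
  end
end

class Derived < Base
end

base_eigen    = (class << Base; self; end)
derived_eigen = (class << Derived; self; end)

# Derived has no class methods of its own, yet the call succeeds
# because lookup continues into Base's eigenclass.
p Derived.singleton_methods(false)              # => []
p Derived.create                                # => "made by Derived"
p derived_eigen.superclass.equal?(base_eigen)   # => true
```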

-mental
Posted by John Lam (CLR) (Guest)
on 29.03.2007 01:31
(Received via mailing list)
Thanks for sharing the eigenclass hack.

More interesting, though, is where the new method comes from when 
doing A.new. We believe that once we've looked at all of the ancestor 
classes' eigenclasses for new, that we then hit the Class and Module 
classes in the list of eigenclasses. Does that make sense as well? It's 
a bit strange because you start off looking through the eigenclass 
hierarchy for new, and then switch to the class hierarchy at the very 
end for Class, Module and Object.

Is this how things are actually implemented? (BTW I'm not lazy here - we 
cannot look for ourselves).

Also, as long as we respect this lookup mechanism, does anyone foresee 
any problems in merging the identities of the eigenclass and class 
objects into the same object (other folks who have code that explicitly 
depends on the behavior of the eigenclass hack -- not sure what this 
really means at this time though).

Thanks,
-John
Posted by Gary Wright (Guest)
on 29.03.2007 04:52
(Received via mailing list)
On Mar 28, 2007, at 4:42 PM, MenTaLguY wrote:
> ...classes also include their ancestor classes' "eigenclasses" in  
> the search:
>
>  class Class
>    def method_search_order
>      ancestors.select { |m| Class === m }.map { |c| c.eigenclass } +
>        self.class.ancestors
>    end
>  end


I noticed that in 1.9 class eigenclasses are arranged in a superclass
hierarchy but this is not reflected in the ancestors array.  Is this
intentional?

class Object
   def eigenclass
     (class <<self; self; end)
   end
end

p Array.eigenclass.superclass    # #<Class:Object>
p Array.eigenclass.ancestors     # [Class, Module, Object, Kernel, BasicObject]

p Class.eigenclass.ancestors     # [Class, Module, Object, Kernel, BasicObject]

klass = Class.eigenclass
while klass
   p klass
   klass = klass.superclass
end

# output from loop

#<Class:Class>
#<Class:Module>
#<Class:Object>
#<Class:BasicObject>
Class
Module
Object
BasicObject
TypeError: uninitialized class

It seems that BasicObject doesn't respond to #superclass.
Posted by Charles Oliver Nutter (Guest)
on 29.03.2007 07:07
(Received via mailing list)
John Lam (CLR) wrote:
> Thanks for sharing the eigenclass hack.
> 
> More interesting, though, is where the new method comes from when doing A.new. We believe that once we've looked at all of the ancestor classes' eigenclasses for new, that we then hit the Class and Module classes in the list of eigenclasses. Does that make sense as well? It's a bit strange because you start off looking through the eigenclass hierarchy for new, and then switch to the class hierarchy at the very end for Class, Module and Object.
> 
> Is this how things are actually implemented? (BTW I'm not lazy here - we cannot look for ourselves).
> 
> Also, as long as we respect this lookup mechanism, does anyone foresee any problems in merging the identities of the eigenclass and class objects into the same object (other folks who have code that explicitly depends on the behavior of the eigenclass hack -- not sure what this really means at this time though).

Don't get confused now...class methods are not defined on the
"eigenclass", they're instance methods on the class object. The class
object is often also referred to as the object's metaclass, though this
terminology can be confusing.

So if you're calling Array.new, it's looking at the Array class object
to call one of its instance methods...which are in turn located in a
dictionary on Array's metaclass.

Now I believe you could simply have two dictionaries on Array and you
would support the appropriate lookup semantics for class vs instance
methods. But would you be able to support all the following scenarios?

class << Array; def foo; "hello"; end; end; Array.foo => "hello"

def Array.foo; "hello"; end; Array.foo => "hello"

class Array
   class << self
     def foo
       "hello"
     end
   end

   def self.bar
     "goodbye"
   end
end
Array.foo => "hello"
Array.bar => "goodbye"

This is just a sampling of metaclass weirdness...when you get into
Class.new(...) tricks and marshalling, it gets more complicated.

And trust me, you *do* need to support all of them if you want
nontrivial apps to run.
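
For reference, the definition styles above can be checked side by side. This is only a sketch: it uses a throwaway class named `Sample` (a made-up name) instead of Array, and the symbol output assumes Ruby 1.9 or later.

```ruby
class Sample
  class << self
    def foo
      "hello"
    end
  end

  def self.bar
    "goodbye"
  end
end

class << Sample
  def baz
    "again"
  end
end

def Sample.qux
  "once more"
end

eigen = (class << Sample; self; end)

# All four styles store the method in the same place.
p eigen.instance_methods(false).sort     # => [:bar, :baz, :foo, :qux]
p Sample.singleton_methods(false).sort   # => [:bar, :baz, :foo, :qux]
p Sample.instance_methods(false)         # => []
```

Whatever layout an implementation chooses internally, these four spellings need to remain indistinguishable to the caller.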

- Charlie
Posted by Charles Oliver Nutter (Guest)
on 29.03.2007 07:10
(Received via mailing list)
Charles Oliver Nutter wrote:
>>
> "eigenclass", they're instance methods on the class object. The class 
> object is often also referred to as the object's metaclass, though this 
> terminology can be confusing.

Well, let me alter that a bit...apparently there are a few definitions
of eigenclass floating around, that are sometimes compatible and
sometimes incompatible with metaclass. If the code below is expected to
return the eigenclass, then the "Array metaclass" in my previous mail
would be the "Array eigenclass"...

class Array
   def eigen
     class << self; self; end
   end
end

So I believe what I've traditionally called "the class's metaclass" is
now being referred to as "the class's eigenclass".

*sigh*

- Charlie
Posted by MenTaLguY (Guest)
on 29.03.2007 07:10
(Received via mailing list)
On Thu, 2007-03-29 at 08:30 +0900, John Lam (CLR) wrote:
> More interesting, though, is where the new method comes from
> when doing A.new. We believe that once we've looked at all of the
> ancestor classes' eigenclasses for new, that we then hit the Class and
> Module classes in the list of eigenclasses. Does that make sense as
> well? It's a bit strange because you start off looking through the
> eigenclass hierarchy for new, and then switch to the class hierarchy
> at the very end for Class, Module and Object.

I don't see it being particularly strange -- once we exhaust the
applicable eigenclasses, we start looking at the class hierarchy for the
class's _class_ (Class), just as we do for any other object.  The only
difference is that for class objects we consider more than just the
object's own eigenclass, including also the class's ancestors (not the
class's class's ancestors).
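
A small sketch of how that could be checked, assuming a Ruby where `Method#owner` is available (1.8.7 or later, as far as I know); `Plain`, `Custom` and `Child` are made-up names:

```ruby
class Plain
end

class Custom
  # A class-level override of new; super continues up the eigenclass
  # chain and eventually reaches Class#new.
  def self.new(*args)
    super
  end
end

class Child < Custom
end

custom_eigen = (class << Custom; self; end)

p Plain.method(:new).owner                        # => Class
p Child.method(:new).owner.equal?(custom_eigen)   # => true
p Child.new.class                                 # => Child
```

With no override anywhere in the eigenclass chain, `new` is found as an ordinary instance method of Class; with an override on an ancestor, it is found in that ancestor's eigenclass first.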

> Is this how things are actually implemented? (BTW I'm not lazy here -
> we cannot look for ourselves).

More or less.

> Also, as long as we respect this lookup mechanism, does anyone foresee
> any problems in merging the identities of the eigenclass and class
> objects into the same object (other folks who have code that
> explicitly depends on the behavior of the eigenclass hack -- not sure
> what this really means at this time though).

The class << self ; self ; end idiom is somewhat heavily employed in
Ruby metaprogramming, so I'd recommend against breaking it.

However, it doesn't really matter how you implement things, so long as
class << obj always gets you a context where self is an object which
behaves like the "virtual class" of obj.  There's no reason it couldn't
be e.g. some sort of lazily created proxy object rather than an actual
virtual class.  But I think always having explicit virtual classes will
give you a cleaner implementation.

Note that virtual classes also have their own virtual classes (ad
infinitum...); this works because Ruby creates virtual classes lazily,
deferring creation until the first singleton method is added to an
object.
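
The nesting can be seen with something like the following (a sketch; the variable names are made up, and the lazy creation itself is not directly observable from Ruby code):

```ruby
obj = Object.new

level1 = (class << obj; self; end)      # virtual class of obj
level2 = (class << level1; self; end)   # virtual class of level1
level3 = (class << level2; self; end)   # and so on

p level1.equal?(level2)                      # => false
p level2.equal?(level3)                      # => false
p level1.equal?((class << obj; self; end))   # => true, same object each time

# A singleton method on a virtual class lands one level up.
def level1.describe
  "singleton method on an eigenclass"
end

p level2.instance_methods(false)             # => [:describe]
```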

-mental
Posted by Charles Oliver Nutter (Guest)
on 29.03.2007 08:02
(Received via mailing list)
MenTaLguY wrote:
> applicable eigenclasses, we start looking at the class hierarchy for the
> class's _class_ (Class), just as we do for any other object.  The only
> difference is that for class objects we consider more than just the
> object's own eigenclass, including also the class's ancestors (not the
> class's class's ancestors).
> 
>> Is this how things are actually implemented? (BTW I'm not lazy here -
>> we cannot look for ourselves).
> 
> More or less.

Ditto in JRuby. It wasn't always this way, but by pragmatically fixing
what broke as we ran more and more code, we've arrived at basically the
same place. And then recently a community member went through and tied
up all the loose ends. A great amount of work on the
metaclass/class/eigenclass/singleton class magic was required to get
Rails running, for example, and it's employed heavily in a number of
Ruby's own standard libraries.

- Charlie
Posted by John Lam (CLR) (Guest)
on 29.03.2007 18:34
(Received via mailing list)
>      def foo
>        "hello"
>      end
>    end
>
>    def self.bar
>      "goodbye"
>    end
> end
> Array.foo => "hello"
> Array.bar => "goodbye"

I believe that a single class object that contains an instance method 
dictionary and a class method dictionary should handle all of these 
cases. The implementation is a bit trickier though. Thanks for sending 
along these scenarios though, it helps me think through these cases in 
more detail.
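
To make the idea concrete, here is a toy model of such a single-object layout, written in Ruby purely for illustration. Everything in it (`ToyClass`, `find_class_method`, the stand-in for Class) is hypothetical and does not describe MRI, JRuby or any other real implementation; it only covers the lookup order discussed in this thread, not `class << self; self; end` or eigenclasses of eigenclasses.

```ruby
# Hypothetical toy model: one object per class, two method dictionaries.
class ToyClass
  attr_reader :name, :superclass, :instance_methods, :class_methods

  def initialize(name, superclass = nil)
    @name = name
    @superclass = superclass
    @instance_methods = {}   # dictionary for instance methods
    @class_methods = {}      # dictionary for class methods
  end

  # Lookup for a call whose receiver is this class object: search the
  # class-method dictionaries up the superclass chain, then fall back to
  # the instance methods of the toy stand-in for Class.
  def find_class_method(selector, class_of_classes)
    klass = self
    while klass
      found = klass.class_methods[selector]
      return found if found
      klass = klass.superclass
    end
    class_of_classes.find_instance_method(selector)
  end

  # Ordinary lookup for instances: walk the superclass chain.
  def find_instance_method(selector)
    klass = self
    while klass
      found = klass.instance_methods[selector]
      return found if found
      klass = klass.superclass
    end
    nil
  end
end

toy_object = ToyClass.new("Object")
toy_class  = ToyClass.new("Class", toy_object)
toy_a      = ToyClass.new("A", toy_object)
toy_b      = ToyClass.new("B", toy_a)

toy_class.instance_methods[:new] = lambda { "Class#new" }
toy_a.class_methods[:create]     = lambda { "A.create" }

p toy_b.find_class_method(:create, toy_class).call   # => "A.create"
p toy_b.find_class_method(:new, toy_class).call      # => "Class#new"
p toy_b.find_class_method(:missing, toy_class)       # => nil
```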

> This is just a sampling of metaclass weirdness...when you get into
> Class.new(...) tricks and marshalling, it gets more complicated.

If you have any more scenarios of strange behavior that may force the 
parallel class object and virtual class object hierarchy, please pass 
them along. If I understand you correctly, JRuby essentially keeps the 
same object model for classes as MRI?

Thanks,
-John
Posted by Charles Oliver Nutter (Guest)
on 30.03.2007 03:30
(Received via mailing list)
John Lam (CLR) wrote:
> I believe that a single class object that contains an instance method dictionary and a class method dictionary should handle all of these cases. The implementation is a bit trickier though. Thanks for sending along these scenarios though, it helps me think through these cases in more detail.

I will try to think of a scenario that would break your proposed model.
Not to prove you wrong, but to prove to myself that I understand how it
works and to help you avoid heading down a wrong path too early. At
the moment, nothing else comes to mind.

> If you have any more scenarios of strange behavior that may force the parallel class object and virtual class object hierarchy, please pass them along. If I understand you correctly, JRuby essentially keeps the same object model for classes as MRI?

Yes, in theory. In practice, we have to fit what works well in Java, but
the end structure is pretty much the same.

- Charlie
Posted by Alexey Verkhovsky (Guest)
on 30.03.2007 06:26
(Received via mailing list)
On 3/28/07, John Lam (CLR) <jflam@microsoft.com> wrote:
>
> Is this how things are actually implemented? (BTW I'm not lazy here - we
> cannot look for ourselves).


Sorry for being curious. Does this phrase mean that you are reimplementing
Ruby (a complex language with no formal specification), and you are not
allowed to look at the source code of the "reference implementation"?
Posted by John Lam (CLR) (Guest)
on 30.03.2007 18:44
(Received via mailing list)
Now that would be crazy now, wouldn't it? :)

Posted by Ola Bini (Guest)
on 30.03.2007 18:53
(Received via mailing list)
John Lam (CLR) wrote:
> Now that would be crazy now, wouldn't it? :)
>
>   

Is that a yes or a no? =P

--
 Ola Bini (http://ola-bini.blogspot.com)
 JvYAML, RbYAML, JRuby and Jatha contributor
 System Developer, Karolinska Institutet (http://www.ki.se)
 OLogix Consulting (http://www.ologix.com)

 "Yields falsehood when quined" yields falsehood when quined.