#include is not working when re-opening the module

Detlef_R · March 14, 2014, 3:15pm

module Bar ;end
class A; include Bar ;end
module Test
  def talk
    p "hello"
  end
end
module Bar
  include Test
end

ob = A.new
A.talk

hello.rb:13:in `<main>': undefined method `talk' for A:Class (NoMethodError)

my-ruby · March 14, 2014, 3:47pm

Yes, this is a leak of the implementation.

The problem is that a class linearizes its ancestors internally. When you
include a module in a class, the module itself and, recursively, all its
ancestors become ancestors of the class in a flat list. (Proxies to them,
actually.)

That way, when a method is resolved, MRI only follows super pointers; it
does not traverse the actual tree of ancestors at runtime.

In addition to that, modules do not keep track of the places where they
have been included. So when you reopen a module as in your example, MRI is
not able to go to all existing ancestor chains to update them. Those chains
become kind of a stale cache.

Charles Nutter showed a while back that JRuby is ready to implement those
semantics, and Matz told me he would be willing to revise it in MRI as
long as there was no impact on performance.

my-ruby · March 14, 2014, 4:01pm

Xavier N. wrote in post #1139855:

> Yes, this is a leak of the implementation.

It means re-opening a module can be risky. What would be a possible
work-around?

> That way, when a method is resolved MRI only follows super pointers, it
> does not traverse the actual tree of ancestors at runtime.

What does “super pointers” mean?

> In addition to that, modules do not keep track of the places where they
> have been included. So when you reopen a module as in your example, MRI is
> not able to go to all existing ancestor chains to update them. Those chains
> become kinda a stale cache.

Thanks for sharing this information.


my-ruby · March 14, 2014, 4:18pm

Note: in my original post, A.talk is a typo. It should be ob.talk.

But if I reopen a module and add methods to it, those work fine. The
problem only affects the newly included module.

module Bar ;end
class A; include Bar ;end
module Test
  def talk
    p "hello"
  end
end
module Bar
  include Test
  def quack ; p 12 ;end
end

ob = A.new
ob.quack # => 12, works as expected
ob.talk
# => `<main>': undefined method `talk' for #<A:0x1963cc8> (NoMethodError)

my-ruby · March 14, 2014, 5:48pm

On Fri, Mar 14, 2014 at 4:01 PM, Arup R. wrote:

> Xavier N. wrote in post #1139855:
>
> > Yes, this is a leak of the implementation.
>
> It means re-opening a module can be risky. What would be a possible
> work-around?

Reopening a module to include another module doesn’t work well with the
semantics of the language, as you are discovering. I learned that the hard
way while extracting prototype-rails from Rails back in the day, after
hitting my head against a wall a few times over something that didn’t work
as expected.

> > That way, when a method is resolved MRI only follows super pointers, it
> > does not traverse the actual tree of ancestors at runtime.
>
> What does “super pointers” mean?

Let me explain. Let’s say we have these modules:

module N
  def x
    :x
  end
end

module M
  include N
end

With those definitions N is an ancestor of M, right? Now let’s define

class C
  include M
end

When you invoke #x on an instance of C:

C.new.x

after checking C itself, the method dispatch algorithm conceptually looks
into the first ancestor, M, fails, and then recurses into the ancestors of
M, where it reaches N. Found, dispatch.
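
A quick way to see where the method is actually found is to ask Ruby for its owner. This is just a minimal sketch reusing N, M and C from above; it shows the observable result, not the internal steps:

```ruby
module N
  def x
    :x
  end
end

module M
  include N
end

class C
  include M
end

p C.new.x                      # => :x
p C.instance_method(:x).owner  # => N, the module that defines x
p C.ancestors.take(3)          # => [C, M, N]
```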

If we add a new module reopening M:

module O
  def y
    :y
  end
end

# reopen, assume M, N and C exist as above
module M
  include O
end

the same algorithm should in theory be able to dispatch C.new.y. When
recursing into M, O would now show up. But it does not actually work that
way, and that is because of the implementation, not because it shouldn’t.
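
Here is the whole scenario as one script, in case you want to try it on your own interpreter. It is only a sketch: on the MRI discussed in this thread the last two lines print false, but as far as I know Ruby 3.0 changed Module#include so that it also reaches classes that had already included the receiver, so I have left those two lines without an expected value.

```ruby
module N
  def x; :x; end
end

module M
  include N
end

class C
  include M
end

module O
  def y; :y; end
end

# Reopen M after C has already included it.
module M
  include O
end

p RUBY_VERSION
p M.ancestors              # => [M, O, N], M itself always sees O
p C.ancestors.include?(O)  # depends on the Ruby version, see above
p C.new.respond_to?(:y)    # always agrees with the previous line
```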

The problem is that when class C was defined, the ancestor chains of the
included modules were flattened, resulting in a flat list:

C.ancestors # => [C, M, N, Object, Kernel, BasicObject]

See N there? Not only are the directly included modules present, but they
are unfolded. Well, that list can actually be found in the implementation
of MRI. The tree is not recursed and unfolded each time ancestors is
called; the flat list is stored as is.

If you reopen C to include another module, the list gets updated, but if
you reopen M as we did, O is not injected into the list of C. In MRI, M
indeed has no idea it was included in C.
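
The first half of that statement is easy to check. A minimal sketch, where P is a hypothetical extra module that is not part of the example above:

```ruby
module N; end
module M; include N; end
class C; include M; end

module P; end  # hypothetical module, only for this illustration

# Reopening the class itself does update its own ancestor list.
class C
  include P
end

p C.ancestors.first(4)  # => [C, P, M, N]
```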

Let’s go for “super pointers” now.

Following super pointers means that the elements of the list are actually
chained by a “super” reference; the chain is not an array but a linked
list, so to speak.

That is, “super” of C is M, “super” of M is N, “super” of N is Object,
etc.

Since the “super” of (for example) N depends on the ancestor chain in which
it is included (a different class D including M could have ancestors between
N and Object), MRI actually has an indirection: the chain consists of proxy
modules, each with a reference to the original module (for dispatching
stuff) and its own “super” reference to its parent in that particular
chain.

my-ruby · March 14, 2014, 7:36pm

On Fri, Mar 14, 2014 at 7:16 PM, Arup R. wrote:

> Xavier N. wrote in post #1139863:
>
> > stuff), and its own “super” reference to its parent in that particular
> > chain.
>
> I think you shared some really important things that I had never
> read. For me this went into never-ending recursion. Can you
> explain the above part a bit more?

Sure.

Let’s consider these three modules:

module Y
end

module X
  include Y
end

module Z
end

And these classes:

class C
  include X
end
C.ancestors # => [C, X, Y, Object, Kernel, BasicObject]

class D
  include Z
  include X
end
D.ancestors # => [D, X, Y, Z, Object, Kernel, BasicObject]

So, in the ancestor chain of C “super” of Y is Object, while in D “super”
of Y is Z. How’s that possible? There is only one Z and one “super” slot in
Z!

The thing is, MRI installs proxies in the ancestor chain instead of the
actual modules. That is an implementation detail, hidden from the programmer.

So, when a method is dispatched, the super pointer takes the algorithm to a
proxy module that checks for the method in the actual, real module it is
proxying. If that module does not have the method, the super pointer of the
proxy takes the algorithm to the next proxy up, and so on.
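
To make the proxy idea concrete, here is a toy model in plain Ruby. It is only a sketch of the idea, not how MRI is written: Proxy, build_chain and lookup are names I made up, and the methods from_y and from_z are added to Y and Z just so there is something to look up.

```ruby
# Hypothetical model: one proxy per entry in a class's ancestor chain.
Proxy = Struct.new(:target, :super_ptr)

# Turn a flat ancestor list into a linked list of proxies.
def build_chain(modules)
  modules.reverse.inject(nil) { |parent, mod| Proxy.new(mod, parent) }
end

# Follow super pointers until a proxied module defines the method itself.
def lookup(proxy, name)
  while proxy
    return proxy.target if proxy.target.instance_methods(false).include?(name)
    proxy = proxy.super_ptr
  end
  nil
end

module Y; def from_y; :y; end; end
module X; include Y; end
module Z; def from_z; :z; end; end
class C; include X; end
class D; include Z; include X; end

c_chain = build_chain(C.ancestors)
d_chain = build_chain(D.ancestors)

y_in_c = c_chain.super_ptr.super_ptr  # C -> X -> Y
y_in_d = d_chain.super_ptr.super_ptr  # D -> X -> Y

p y_in_c.target.equal?(y_in_d.target)  # => true, same real module Y
p y_in_c.super_ptr.target              # => Object
p y_in_d.super_ptr.target              # => Z
p lookup(d_chain, :from_z)             # => Z
```

Each chain gets its own proxy for Y, so the two “super” references can differ even though there is only one real Y.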

Nevertheless, this is implementation; the key observation in this thread is
that those ancestor chains are not updated if ancestors change in included
modules (and beyond).

Does that answer your question?

my-ruby · March 14, 2014, 7:41pm

On Fri, Mar 14, 2014 at 7:35 PM, Xavier N. wrote:

> So, in the ancestor chain of C “super” of Y is Object, while in D “super”
> of Y is Z. How’s that possible? There is only one Z and one “super” slot in
> Z!

s/Z/Y/g in that last sentence.

my-ruby · March 14, 2014, 7:16pm

Xavier N. wrote in post #1139863:

> On Fri, Mar 14, 2014 at 4:01 PM, Arup R. wrote:

Well explained as usual. Thank you very much for such a detailed
explanation.

> Since the “super” of (for example) “N” depends on the ancestor chain it is
> included (a different class D including M could have ancestors between N
> and Object) MRI actually has an indirection, the chain consists of proxy
> modules that have a reference to the original module (for dispatching
> stuff), and its own “super” reference to its parent in that particular
> chain.

I think you shared some really important things that I had never
read/found/noticed anywhere. For me this went into never-ending
recursion. Can you explain the above part a bit more?

my-ruby · March 14, 2014, 10:47pm

On Fri, Mar 14, 2014 at 1:46 PM, Xavier N. wrote:

> semantics of the language as you are discovering. I learned that the hard
> way extracting prototype-rails from Rails back in the day, and after hitting
> my head against a wall a few times at something that didn’t work as
> expected.

Hi Xavier,

Great explanation!

As a “work-around”, based on your explanation, I tried the below…

Reopening the class A and “reincluding” the module.

class A
  include Bar
end

And everything related to Bar (not above) is “refreshed”.
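
For anyone who wants to try it, this is the original example with the re-include added at the end. It is just a sketch of the steps described in this thread, nothing more:

```ruby
module Bar; end
class A; include Bar; end

module Test
  def talk
    p "hello"
  end
end

# Reopen Bar after A has already included it.
module Bar
  include Test
end

# Work-around: include Bar again so A picks up Bar's new ancestors.
class A
  include Bar
end

p A.ancestors.include?(Test)  # => true
A.new.talk                    # prints "hello"
```

Note that every class that included Bar before the reopening would need the same treatment.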

Abinoam Jr.

my-ruby · March 14, 2014, 8:41pm

Xavier N. wrote in post #1139879: