Dynamically extending modules once they have been included


#1

It doesn’t seem possible to dynamically extend modules once they have
already been included into a particular class. By “dynamically
extend”, I mean including a module within another module. Directly
opening up the class and adding methods works fine. See this example:
http://pastie.org/418192 Also, if you “re-include” the original
module into your classes, it will pick up the extension module’s
methods.
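In case the pastie link goes stale, here is a minimal reconstruction of the effect (module and class names are invented for illustration; the “no effect” part is the behaviour of the Rubies under discussion in this thread):

```ruby
module Greeting
  def hello; "hello" end
end

class Person
  include Greeting
end

module Politeness
  def please; "please" end
end

# Mixing the extension into the already-included module...
Greeting.send(:include, Politeness)
# ...is not seen by Person on the Rubies discussed in this thread
# (Ruby 3.0 eventually made module inclusion dynamic).

# "Re-including" the original module picks the extension up on any version:
Person.send(:include, Greeting)
Person.new.please  # => "please"
```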

So why doesn’t including a module into another module affect classes
that have already included that first module?

The only alternative I’ve found is to loop through ObjectSpace
(filtering on classes), find out which classes’ ancestors include
the module I want to extend, and then include my extension module
straight into the class.

Thanks for any help/understanding!


#2

On Mar 16, 2009, at 17:12 , removed_email_address@domain.invalid wrote:

So why doesn’t including a module into another module affect classes
that have already included that first module?

because including a module affects the ancestors, not the
class/object’s method dictionary:
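(The post presumably continued with a snippet; this is a reconstruction of the point being made, not the original code:)

```ruby
module M
  def foo; "M#foo" end
end

class C
  include M
end

# #include inserts M into C's ancestor chain...
C.ancestors.include?(M)    # => true
# ...but copies nothing into C's own method dictionary:
C.instance_methods(false)  # => []
# Method lookup finds #foo by walking the ancestors:
C.new.foo                  # => "M#foo"
```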


#3

I get it now… the ancestors list of the module is copied over to the
including class at “include time”. If the module’s ancestry is
changed afterwards, this will not be reflected. This seems to me to
inhibit some “meta-programmability” (if that’s a word), as you can’t
mix in modules to already-included modules and have the effect take
place in whatever has already included the module. In other words,
it’s not consistent with the ability to dynamically change
(add/change/delete) methods of the module directly at runtime.

Has no one else encountered/wrestled with this quirk? It seems to me a
great way to organize code - a mixin for a mixin.

class X; include M; end
=> X

X.ancestors
=> [X, M, Object, Kernel]


#4

2009/3/17 Synth removed_email_address@domain.invalid

great way to organize code - a mixin for a mixin.

This is very surprising; I always thought the ancestor tree was
inspected at method call time, and I’ve done a fair amount of poking
at Ruby’s object system as I implemented it in JavaScript a while
ago. I can only assume this is a performance hack, since walking the
ancestor tree is expensive and you sure don’t want to be doing it for
every method call, so it seems it’s cached and refreshed when
#include is called. Though, it’s only refreshed from the module
you’ve included - note the absence of K in the final line:

module M; end
=> nil

class C; include M; end
=> C

C.ancestors
=> [C, M, Object, Kernel, BasicObject]

module K; end
=> nil

M.send :include, K
=> M

C.ancestors
=> [C, M, Object, Kernel, BasicObject]

module Z; end
=> nil

C.send :include, Z
=> C

C.ancestors
=> [C, Z, M, Object, Kernel, BasicObject]

IMO this could be done better and preserve a bit more dynamism. In my
JavaScript implementation, #include fully refreshes the method table
for a class (it would include K in the above example) and caches
included methods on the class itself to avoid tree lookups at method
call time. If you call super(), the ancestor tree is always traversed
to find ancestor methods, which is pretty expensive and makes super()
much slower than normal method calls.

Would be interested to hear other people’s thoughts on this. It’s
kind of an edge case, but my opinion is that this behaviour is
contrary to much of Ruby’s dynamism, and is inconsistent: if you add
methods to M, they become available to C, so why not new ancestors?
Also, it means the ancestry tree looks different depending on who you
ask. Having said that, experience tells me that “fixing” it would
introduce a serious performance overhead for the whole language.


#5

On 17.03.2009 11:11, James C. wrote:

not consistent with the ability to dynamically change(add/change/
delete) methods of the module directly at runtime.

This is very surprising, I always thought the ancestor tree was inspected at
method call time,

That’s true, but the point made was that it is built on inclusion
time.

IMO this could be done better and preserve a bit more dynamism.

Would be interested to hear other people’s thoughts on this. It’s kind of an
edge case, but my opinion is that this behaviour is contrary to much of
Ruby’s dynamism, and is inconsistent: if you add methods to M, they become
available to C, so why not new ancestors? Also, it means the ancestry tree
looks different depending on who you ask. Having said that, experience tells
me that “fixing” it would introduce a serious performance overhead for the
whole language.

More than that: existing code might break. There are good reasons not
to “fix” this, because otherwise a change to a module would have a
side effect on a totally different class. I cannot remember having
seen this discussed in recent years (which does not necessarily mean
anything), but I doubt that many people see this as a limitation.

My question for Pete would be: why do you need this and do you need this
frequently? Maybe there is a design pattern issue - i.e. a programming
problem that can be solved differently - maybe even better.

Btw, you can create a “fix” with some metaprogramming. With methods
Module#included and Class#inherited it should be possible to reinclude
the changed module in all classes that have it included.

Given that, the infrequency of this issue surfacing, and the unknown
risk of change, I vote for “no change”.

Kind regards

robert


#6

This is very surprising, I always thought the ancestor tree was
inspected at method call time,

That’s true, but the point made was that it is built on inclusion time.

Sorry, I should have been more clear. By using the word ‘tree’ I was
implying that the whole ancestor tree is walked at method call time.
What actually happens is that it is walked and flattened to an array
at include time, and this flat array is used to look up methods at
call time. At least, this is what it looks like – I’ve no idea what
the underlying VM is doing.

Given that, the infrequency of this issue surfacing, and the unknown
risk of change, I vote for “no change”.

For sure, the Ruby ecosystem is too big to know how changing this
might affect existing code. I’m really asking because I want to know
which option people consider more elegant, and also because I
maintain my own Ruby-in-JavaScript object system which has nothing
like Ruby’s user base, so I’m more free to tinker with it.

By the way, I just noticed this:

module M; end
=> nil

class A; end
=> nil

class B < A; end
=> nil

B.ancestors
=> [B, A, Object, Kernel]

A.send :include, M
=> A

B.ancestors
=> [B, A, M, Object, Kernel]

That is, adding a module to an ancestor class affects the
descendants. This could be seen as an inconsistency in Ruby’s design,
depending on how you think about classes and modules and how they are
implemented. In my implementation everything inheritance-related is
handled using modules, so I would need to add a bunch of special
cases to handle some of the asymmetries pointed out here.


#7

On 17.03.2009 12:33, James C. wrote:

A.send :include, M
=> A

B.ancestors
=> [B, A, M, Object, Kernel]

That is, adding a module to an ancestor class affects the descendants. This
could be seen as an inconsistency in Ruby’s design, depending on how you
think about classes and modules and how they are implemented. In my
implementation everything inheritance-related is handled using modules so I
would need to add a bunch of special cases to handle some of the asymmetries
pointed out here.

Good point! Still the question remains: how often is this used, and
who would benefit from that change (and at what price)?

Cheers

robert


#8

On Mar 16, 8:12 pm, removed_email_address@domain.invalid wrote:

It doesn’t seem possible to dynamically extend modules once they have
already been included into a particular class. By dynamically extend,
I mean to include a module within another module.

This is known as the Module Include Problem (and variations thereof)
and has been around for, well, ever. The issue is not that “fixing”
it wouldn’t be a good thing; it’s just that implementing a fix has
proven too problematic.

T.


#9

In my case, I suppose, I was being a bit lazy, but I could say I was
being extensible too. I didn’t really need the dynamism that this use
case points to. To be specific, I am building an extension to the
Runt gem. Runt is a Ruby gem that models Martin F.'s Temporal
Expression design pattern. The way it’s currently coded, you have
concrete expression classes like DIWeek (Day in Week) or REYear
(Range Each Year), and some of these classes mix in a TExpr module.
The set of expression classes is finite. In my extension to Runt, I
need to add some behavior to the TExpr module, which I do with
another mixin. I didn’t want to have to worry about which classes mix
in TExpr; I just wanted to modify (aka extend) the TExpr module like
you would any ordinary class and have it be reflected in all classes
that include this module. The alternative (which is what I’m ending
up doing) is to loop through all the classes of Runt, check the
ancestors for TExpr, and if it’s there, include my extension module
straight into the class. This approach seems yucky to me and not in
the spirit of Ruby, or really of OO design.

You can certainly argue with me about whether extending a module with
another is appropriate in the above case. However, the issue remains
that this behavior is not consistent with the rest of Ruby. I point
again to the fact that you can change the behavior of a module
directly and it will be reflected in all including classes - however
NOT if you change behavior by including another module.

Also, I don’t necessarily (keyword here) agree with the philosophical
point of needing ‘a lot’ of use cases to implement a feature. This is
a contentious issue, I know, but I think the ability to dynamically
alter code at runtime speaks for itself. Combine that with the fact
that you can modularize this dynamic code, and it seems common sense
to me. I mean, that’s the beauty of Ruby and the whole point of
mixins: you can bundle up code that you can inject into objects at
runtime. But why can’t you do this if the target of your injection is
another module? I’m definitely surprised this is not more talked
about (and criticized :) )

As far as changing this behavior goes, I’d be curious to see some
current Ruby code that would break if the ancestor tree were dynamic
enough to accomplish this. I think a weighted number of use cases
that would break should be compared with a weighted number of use
cases that would be enabled (and I personally think “modular
metaprogrammability” has a huge amount of gravity! :) )


#10

The set of expression classes is finite. In my extension to Runt, I
need to add some behavior to the TExpr module, which I do with
another mixin. I didn’t want to have to worry about which classes mix
in TExpr; I just wanted to modify (aka extend) the TExpr module like
you would any ordinary class and have it be reflected in all classes
that include this module. The alternative (which is what I’m ending
up doing) is to loop through all the classes of Runt, check the
ancestors for TExpr, and if it’s there, include my extension module
straight into the class. This approach seems yucky to me and not in
the spirit of Ruby, or really of OO design.

Yes, this is very yucky and you shouldn’t have to do it.
Unfortunately I can’t think of any elegant way to do this without
inspecting the class tree using your method or ObjectSpace (you can’t
get the descendants of a module directly, so I use ObjectSpace, which
is kinda expensive).

You can certainly argue with me about whether extending a module with
another is appropriate in the above case. However, the issue remains
that this behavior is not consistent with the rest of Ruby. I point
again to the fact that you can change the behavior of a module
directly and it will be reflected in all including classes - however
NOT if you change behavior by including another module.

You ought to be able to just say you want to change all the objects
of such-and-such a type, whether that type is a module or a class.
You could of course put your new methods straight into TExpr, which
would work but could cause debugging and other monkey-patching
issues. It’s up to you whether you consider this a problem.
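For contrast, a sketch of that direct monkey-patching route (TExpr and DIWeek stand in for the real Runt constants; the definitions here are invented for illustration):

```ruby
module TExpr; end

class DIWeek
  include TExpr
end

expr = DIWeek.new

# Reopening the module adds to its method table, which lookup consults
# at call time - so existing includers (and instances) see it at once:
module TExpr
  def describe; "a temporal expression" end
end

expr.describe  # => "a temporal expression"
```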

Also, I don’t necessarily (keyword here) agree with the philosophical
point of needing ‘a lot’ of use cases to implement a feature.

I am very much speculating here, but I’d be surprised if this weren’t
a case not of implementing something as such, but rather of removing
special cases. One way to implement Ruby’s object system is to
construct the whole thing out of modules (a module is an object that
has one method table and zero or more ‘parent’ modules); then things
like parent-child inheritance and singleton classes can be
implemented on top of that. This ends up being quite elegant, and I
suspect this is a performance issue.

Far as I know, Ruby finds methods by getting a list of an object’s
ancestors and extracting all implementations of the method from those
modules. Walking up a series of parent classes takes linear time, so
it’s not expensive for classes to see stuff added to their parents,
but walking and flattening a multiple inheritance tree could be
exponential time, so it makes sense to cache ancestor trees when
things like #include are called, effectively blocking a class from
seeing changes to its mixins’ ancestry.

If anyone knows how Module/Class are really implemented in MRI I’d
love to know more about it (also: I really need to learn C).
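That lookup order can be observed from Ruby itself; a small sketch using Method#owner, which reports the module in the flattened ancestor list that supplied the implementation:

```ruby
module M
  def foo; "M#foo" end
end

class C
  include M
  def bar; "C#bar" end
end

c = C.new

# The first module in the ancestor list whose method table defines
# the name wins:
c.method(:foo).owner  # => M
c.method(:bar).owner  # => C

# A hand-rolled version of the same search:
C.ancestors.find { |mod| mod.instance_methods(false).include?(:foo) }  # => M
```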


#11

Far as I know, Ruby finds methods by getting a list of an object’s ancestors
and extracting all implementations of the method from those modules. Walking
up a series of parent classes takes linear time so it’s not expensive for
classes to see stuff added to their parents, but walking and flattening a
multiple inheritance tree could be exponential time so it makes sense to
cache ancestor trees when things like #include are called, effectively
blocking a class from seeing changes to its mixins’ ancestry.

Indeed, if every module had pointers to other included modules, I
definitely see how you’d end up with a very expensive object graph to
traverse to find method definitions. Using the wonders of Ruby, I
found a (rudimentary) way you can easily implement this, though (see
below). It’s just a more generalized version of what I have above. If
the only consideration is performance, I don’t think it’s that much
of a hit (depending on your ObjectSpace); plus, when you do any
metaprogramming I don’t think performance is a major concern. In this
case, any performance hit will occur only once at “include time”, and
I can’t really envision injecting modules of code at any great rate -
although maybe this is just my short-sightedness :) Anyway, here’s
the workaround:

class Module
  def included(base)
    return unless base.class == Module
    ObjectSpace.each_object(Class) { |o|
      next unless o.ancestors.include?(base)
      o.send(:include, self)
    }
  end
end

A simple benchmark on my system with irb and the example taken from
Eigenclass’s page:

require 'benchmark'
class Module
  def included(base)
    return unless base.class == Module
    ObjectSpace.each_object(Class) { |o|
      next unless o.ancestors.include?(base)
      o.send(:include, self)
    }
  end
end
Benchmark.realtime {
  module A; end
  class C; include A end
  module B; def foo; "B#foo" end end
  module A; include B end
  class D; include A end
}
=> 0.00185704231262207

not too shabby, no?


#12

On Mar 17, 12:28 pm, trans removed_email_address@domain.invalid wrote:

T.

Thanks T for the name - first Google search:
http://eigenclass.org/hiki/The+double+inclusion+problem
They also say this might be fixed in Ruby 2?? :D


#13

On Tue, Mar 17, 2009 at 1:12 AM, removed_email_address@domain.invalid wrote:

The only alternative I’ve find is to loop through the ObjectSpace
(filtering on classes) and find out which class’s ancestors include
the module I want to extend and then include my extension module
straight into the class.

Below is code taken from my FOSDEM presentation. It tries to tackle
the problem you are describing. It is not well-tested or anything, but
it did pass the examples I used in the presentation. Just execute this
code before any of your module definitions and see if it helps.

Peter

class Module

  def included_in
    @included_in ||= {}
  end

  def included(m)
    unless included_in[m]
      included_in[m] = true
      m.update_inclusions
    end
  end

  def extended(m)
    included(class << m; self; end)
  end

  def update_inclusions
    included_in.each do |m, _|
      m.send(:include, self)
      m.update_inclusions
    end
  end

end
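A quick usage sketch (the patch from the post is repeated so the snippet is self-contained; TExpr, DIWeek and Extension are invented names):

```ruby
# Peter's patch: track which modules/classes have included each module,
# and re-include the module into all of them whenever it gains a mixin.
class Module
  def included_in
    @included_in ||= {}
  end

  def included(m)
    unless included_in[m]
      included_in[m] = true
      m.update_inclusions
    end
  end

  def extended(m)
    included(class << m; self; end)
  end

  def update_inclusions
    included_in.each do |m, _|
      m.send(:include, self)
      m.update_inclusions
    end
  end
end

module TExpr; end

class DIWeek
  include TExpr
end

module Extension
  def extended_behaviour; :works end
end

# Including Extension into TExpr re-includes TExpr everywhere it was
# already mixed in, so DIWeek picks the new method up:
TExpr.send(:include, Extension)
DIWeek.new.extended_behaviour  # => :works
```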


#14

2009/3/17 Synth removed_email_address@domain.invalid

although maybe this is just my short-sightedness :) Anyway, here’s

Cool. You’ll probably want something to handle extended() as well,
for where modules have been mixed into singleton classes. Calling
object.extend(module) is effectively the same as:

class << object
  include module
end

except that a different callback is called.
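Concretely, with a named module (since `module` itself is a Ruby keyword, the snippet above is only pseudocode):

```ruby
module Loud
  def shout; "HEY" end
end

a = Object.new
a.extend(Loud)      # fires the Module#extended callback

b = Object.new
class << b          # fires the Module#included callback
  include Loud
end

# Both objects end up with Loud in their singleton class's ancestry:
a.shout  # => "HEY"
b.shout  # => "HEY"
(class << a; self; end).ancestors.include?(Loud)  # => true
```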

For the curious, here’s my implementation of the Ruby object model in
JavaScript:
http://github.com/jcoglan/js.class/tree/master/source/core/

It’s reasonably documented; every module maintains both a list of its
mixins and a list of its descendants, as I need to propagate new
methods down the inheritance chain. Each class has a module where it
stores all its methods. Whenever an #include takes place, I run the
tree and cache all the resulting methods on the class itself; I could
probably do this for ancestry as well without running into Ruby’s
problems, since I can easily get modules to notify their descendants
when they are modified so I can update all the cached method/ancestor
tables.