Why is overloading invalid in ruby

tedforte · January 29, 2011, 2:45am

Adam P. wrote:

On Fri, Jan 28, 2011 at 5:13 PM, Charles Oliver N.
[email protected] wrote:

And it would be even cooler if Ruby supported some form of type-driven
pattern matching, so you could have different method bodies for
different input types rather than checking them over and over again.
But that’s a post for another day

Not very quackish.

But it’s used all over the core and standard libraries.

I tried implementing my own version of Enumerable#inject, which has no
fewer than four overloads and is almost impossible to implement in
Ruby because of that:

Enumerable[$A]#inject($B) {|$B, $A| $B } → $B
Enumerable[$A]#inject     {|$A, $A| $A } → $A
Enumerable[$A]#inject($B, Symbol)        → $B
Enumerable[$A]#inject(Symbol)            → $A

(For obvious reasons, Ruby doesn’t have a standardized syntax for
talking about types, so I made one up. I hope you get what I mean.)

For example, the Rubinius implementation of Enumerable#inject contains
about 8 lines of actual code (ignoring comments, blank lines and lines
consisting only of ‘end’ or ‘}’). Two implement the actual behavior of
inject, the other six basically implement an ad hoc,
informally-specified, bug-ridden, slow implementation of half of
argument-based dispatch.
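
To make the ratio concrete, here is a rough sketch of what such an implementation tends to look like. It is not the Rubinius code; the names my_inject and NO_VALUE are made up for illustration, and edge cases of the real inject are ignored. Most of the method sorts out which of the four call forms was used, and only the last few lines fold the collection.

```ruby
# Hypothetical sentinel meaning "no initial value was passed".
NO_VALUE = Object.new

# Hypothetical stand-in for Enumerable#inject, written as a plain method.
def my_inject(enum, *args, &block)
  # Hand-written dispatch on argument count and block presence.
  case args.size
  when 0
    initial, op = NO_VALUE, nil          # inject { |a, b| ... }
  when 1
    if block
      initial, op = args[0], nil         # inject(init) { |a, b| ... }
    else
      initial, op = NO_VALUE, args[0]    # inject(:sym)
    end
  when 2
    initial, op = args                   # inject(init, :sym)
  else
    raise ArgumentError, "wrong number of arguments (#{args.size} for 0..2)"
  end
  block = ->(acc, x) { acc.send(op, x) } if op
  raise ArgumentError, "no block or operator given" unless block

  # The actual fold.
  acc = initial
  enum.each do |x|
    acc = acc.equal?(NO_VALUE) ? x : block.call(acc, x)
  end
  acc.equal?(NO_VALUE) ? nil : acc
end
```

With real argument-based dispatch, each of the four branches would be its own small method body.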

In fact, almost every method in kernel/common/enumerable.rb in
Rubinius contains at least one line that does not actually contribute
anything to the logic of the method but instead changes the behavior
of the method in some way, shape or form based on the number or class
of the arguments.

And I’ve seen that in other code as well, not just Rubinius’s or my
own, and not just in code that tries to replicate stdlib behavior.

Several projects I know chose YARD over RDoc as their documentation
tool, because it allows them to at least document their overloads
easily, even if writing them is cumbersome.

Personally, I would enjoy being able to dispatch on arguments in
addition to receivers. Martin Odersky hinted that he is interested in
adding argument-based dispatch to Scala. It will be interesting to see
what he comes up with, although he also said that it’s a very hard
problem (because of its interactions with overloading) and will take
many years if it happens at all.

Note that I avoided the use of the term “overloading” and used
“argument-based dispatch” instead. I’m pretty sure that’s what the OP
meant, since overloading happens statically and simply doesn’t make
sense in Ruby.

jwm

tedforte · January 28, 2011, 8:54pm

On Fri, Jan 28, 2011 at 11:41 AM, Adam P. [email protected]
wrote:

On Fri, Jan 28, 2011 at 5:13 PM, Charles Oliver N.
[email protected] wrote:

And it would be even cooler if Ruby supported some form of type-driven
pattern matching, so you could have different method bodies for
different input types rather than checking them over and over again.
But that’s a post for another day

Not very quackish.

Neither are case/when statements like this:

case obj
when String …
when Regexp …
end

But sometimes that’s what you need to do.
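
A small runnable version of that pattern, as a sketch only. The method name first_position is invented for the example; it relies on the fact that case/when matches classes through Class#===.

```ruby
# Hypothetical helper: find where `pattern` first occurs in `text`.
def first_position(text, pattern)
  case pattern
  when String then text.index(pattern)   # plain substring search
  when Regexp then text =~ pattern       # regular-expression search
  else raise TypeError, "expected String or Regexp, got #{pattern.class}"
  end
end

first_position("hello", "lo")   # substring form
first_position("hello", /l+/)   # regexp form
```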

Charlie

tedforte · January 29, 2011, 7:38am

Whenever the overloading topic comes up, I wonder about what the
future holds regarding true named arguments. Does anyone know?

Specifically, MacRuby has an extension to the usual hash parameters
that seems pretty complicated to me, but allows it to interact fairly
painlessly with Objective-C*. It basically allows you to define a
method with named parameters. It’s bewildering because a method
invocation like “person.setFirstName('Gloria', { :lastName => 'Alvarez' })”
actually calls a different method from
“person.setFirstName('Gloria', lastName: 'Alvarez')” (the former is
the regular Ruby hash argument type; the latter uses the
MacRuby-specific keyed arguments. If no keyed-argument version of a
method exists, it will just translate it into a hash-argument form.)

Anyway, I wonder if there has been any talk between MacRuby and the
other Ruby implementations on getting this sort of keyed arguments
standardized.

* Objective-C, like Smalltalk, has methods with multi-part names,
where some of the arguments go inside the method name; e.g. you have a
method beginSheetModalForWindow: modalDelegate: didEndSelector:
contextInfo: where each colon marks a place where an argument should
go. Generally the word (or few words) right before the colon indicates
what the argument is supposed to be, semantically.

tedforte · January 29, 2011, 8:21am

http://rubyworks.github.com/platypus/

tedforte · January 30, 2011, 10:54pm

2011/1/29 Jörg W Mittag [email protected]:

Anyway, I wonder if there has been any talk between MacRuby and the
other Ruby implementations on getting this sort of keyed arguments
standardized.

AFAIK, Laurent S. did consult with matz to make sure that
MacRuby’s extensions would be forward compatible.

I don’t believe that’s the case, and if I remember right Matz actually
expressed concern that MacRuby was adding syntax that might later
conflict with MRI.

So, while there does not exist a design, let alone an implementation
of named arguments for Ruby 2.0, it seems to be clear that whatever
design they come up with, will have to be compatible with MacRuby.

MacRuby has taken the risk of future syntax being incompatible. I
don’t think their decision to add syntax no other Ruby impl supports
should limit future design of Ruby proper.

FWIW, I understand the justification for the MacRuby syntax (objc
interop), but it’s pretty clear to me that adding incompatible syntax
puts MacRuby on its own wrt future standard syntax changes.

I’ve considered adding syntax to JRuby for some things (like to allow
static dispatch against Java objects, for perf) but in every case I’ve
only considered options that would be forward-compatible (like
comment-based annotation of types, etc).

Charlie

tedforte · January 29, 2011, 12:45pm

Eric C. wrote:

Whenever the overloading topic comes up, I wonder about what the
future holds regarding true named arguments. Does anyone know?

AFAIK, they are on the wishlist for Ruby 2.0, but I don’t think there
has been any commitment made. There’s no implementation that I know
of, not even a design.

Anyway, I wonder if there has been any talk between MacRuby and the
other Ruby implementations on getting this sort of keyed arguments
standardized.

AFAIK, Laurent S. did consult with matz to make sure that
MacRuby’s extensions would be forward compatible.

So, while there does not exist a design, let alone an implementation
of named arguments for Ruby 2.0, it seems to be clear that whatever
design they come up with, will have to be compatible with MacRuby.

jwm

tedforte · January 31, 2011, 11:17am

So far I don’t believe I’ve heard anyone make the obvious case against
method overloading – that unless done very carefully indeed, it makes
the code much, much more difficult to read. One method does one job is
the sane way to go, thanks.

Or, indeed, the practical case - at the very heart of Ruby is the idea
of duck typing. Duck typing rules out method overloading, because
parameters would have to have set types before you could have a
signature. Presumably no-one is suggesting that we should have fixed
typing in Ruby?

tedforte · January 30, 2011, 10:57pm

On Sat, Jan 29, 2011 at 12:37 AM, Eric C.
[email protected] wrote:

Specifically, MacRuby has an extension to the usual hash parameters
that seems pretty complicated to me, but allows it to interact fairly
painlessly with Objective-C*. It basically allows you to define a
method with named parameters. It’s bewildering because a method
invocation like “person.setFirstName('Gloria', { :lastName => 'Alvarez' })”
actually calls a different method from
“person.setFirstName('Gloria', lastName: 'Alvarez')” (the former is
the regular Ruby hash argument type; the latter uses the
MacRuby-specific keyed arguments. If no keyed-argument version of a
method exists, it will just translate it into a hash-argument form.)

Laurent can correct me if I’m wrong, but I don’t think ObjC’s syntax
qualifies as true named arguments. In MacRuby/ObjC, a method call like

foo(bar:1, baz:2)

Is just a way of saying something like

foo_with_bar_with_baz(1, 2)

The method name and the argument names actually resolve to different
method bodies internally. In addition, the order is absolutely
crucial. The following two calls are not the same and will not end up
in the same target method:

foo(bar: 1, baz: 2)
foo(baz: 2, bar: 1)

Named arguments would allow you to call the same method with
different groups of names in any order.
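
For contrast, here is a sketch of the plain-Ruby idiom of a trailing options hash, which behaves the way named arguments are described above. The method foo and its defaults are made up for the example; this is ordinary Ruby, not MacRuby's keyed-argument dispatch.

```ruby
# Hypothetical method emulating named arguments with an options hash.
def foo(options = {})
  bar = options.fetch(:bar, 0)   # default used when :bar is omitted
  baz = options.fetch(:baz, 0)   # default used when :baz is omitted
  "bar=#{bar} baz=#{baz}"
end

foo(bar: 1, baz: 2)   # same method ...
foo(baz: 2, bar: 1)   # ... regardless of the order of the names
foo(baz: 2)           # and a subset of the names is fine too
```

All three calls land in the same method body, which is exactly what the ObjC-style selectors do not do.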

Charlie

tedforte · January 31, 2011, 4:00pm

On Jan 31, 2011, at 5:16 AM, Shadowfirebird wrote:

Or, indeed, the practical case - at the very heart of Ruby is the idea of duck
typing. Duck typing rules out method overloading, because parameters would have
to have set types before you could have a signature. Presumably no-one is
suggesting that we should have fixed typing in Ruby?

I think the original poster provided an example of overloading based on
the number of parameters but not their type.
Even restricting yourself to overloading by arity is a bit problematic
in Ruby because the arity still has to be determined (in some cases)
dynamically:

args = [1,2]
foo(*args) # two arguments
args << 3
foo(*args) # three arguments
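
A runnable version of the same point, assuming a stand-in foo that just reports how many arguments it received. The call site is textually identical both times, yet the argument count differs at run time.

```ruby
# Hypothetical stand-in for foo: report the number of arguments received.
def foo(*args)
  "called with #{args.size} argument(s)"
end

args = [1, 2]
first_call = foo(*args)    # two arguments at this moment
args << 3
second_call = foo(*args)   # same source text, three arguments now
```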

Gary W.

tedforte · January 31, 2011, 11:47am

On Mon, Jan 31, 2011 at 11:16 AM, Shadowfirebird
[email protected] wrote:

So far I don’t believe I’ve heard anyone make the obvious case against method
overloading – that unless done very carefully indeed, it makes the code much,
much more difficult to read. One method does one job is the sane way to go,
thanks.

I had tried to make the point here:

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/377075

Or, indeed, the practical case - at the very heart of Ruby is the idea of duck
typing. Duck typing rules out method overloading, because parameters would have to
have set types before you could have a signature. Presumably no-one is suggesting
that we should have fixed typing in Ruby?

Erm, actually there are people who believe typing should change in
Ruby to support static typing features. There does not seem to be
much support for this in the community though. Obviously ducks feel
more at home in our community pond than metal skeletons.

Kind regards

robert

tedforte · January 31, 2011, 8:27pm

On Mon, Jan 31, 2011 at 9:00 AM, Gary W. [email protected] wrote:

args << 3
foo(*args) # three arguments

Actually, the arity of a callsite is always calculated in Ruby, to know
whether to throw an ArgumentError (the “3 for 0” sort of error)
against the method you are calling. Overloading based on arity does
not seem like such a bad idea to me, given the common arity-parsing
idioms people write by hand in the first few lines of their methods.
Implementing arity-based overloads would get rid of most of the code
we put at the top of methods and perform that logic in the Ruby
implementation itself (in MRI in C vs in Ruby).
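
As an illustration of the kind of hand-written arity parsing meant here, a minimal sketch follows. The method countdown and its two call forms are invented for the example.

```ruby
# Hypothetical method that accepts either (from) or (from, to).
def countdown(*args)
  # Arity parsing done by hand at the top of the method.
  case args.size
  when 1
    from, to = args[0], 0
  when 2
    from, to = args
  else
    raise ArgumentError, "wrong number of arguments (#{args.size} for 1..2)"
  end

  # The method's real work.
  from.downto(to).to_a
end

countdown(3)      # one-argument form, counts down to 0
countdown(5, 3)   # two-argument form
```

With arity-based overloads, the case statement would disappear and each form would be a separate definition.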

Stylistically, I think the biggest issue is not realizing there are n
overloads and then implementing fewer than n overloads in an overridden
class.

-Tom

tedforte · January 31, 2011, 8:49pm

On Mon, Jan 31, 2011 at 1:27 PM, Thomas E Enebo [email protected]
wrote:

logic in the Ruby implementation itself (in MRI in C vs in Ruby).

I’d also add that in JRuby, if the source arity matches the target
(non-rest, non-optional) arity, we do no calculation at all (for
arities up to 3). So if Ruby supported multiple overloads, it would
basically just work like our core arity-split methods do right now and
automatically route to the correct body.

Stylistically, I think the biggest issue is not realizing there are n
overloads and then implementing fewer than n overloads in an overridden
class.

An excellent point. This bites overload-supporting static languages
very frequently too.

Charlie

tedforte · January 31, 2011, 6:38pm

Charles Oliver N. wrote:

MacRuby-specific keyed arguments. If no keyed-argument version of a
method exists, it will just translate it into a hash-argument form.)

Laurent can correct me if I’m wrong, but I don’t think ObjC’s syntax
qualifies as true named arguments.

Yes. Objective-C inherits its syntax from Smalltalk, which just has
plain old boring standard positional arguments like every other
language, like C, like Ruby minus optional and splat arguments.

The only peculiarity is that the arguments get written between the
subroutine name instead of at the end. So, if you have a method named
foo:bar:baz: which takes three arguments, the way you would call it in
pretty much every other language is

foo:bar:baz:(1, 2, 3)

whereas in Smalltalk it’s

foo: 1 bar: 2 baz: 3

This allows you to achieve some nice readability with clever method
naming, i.e. instead of

dictionary.add(1, 2) # which one's the key and which is the value?

you get

dictionary at: 1 put: 2.

But fundamentally, these are still positional arguments. I could, for
example, do this in Ruby:

dictionary.at_first_put_second(1, 2)

In MacRuby/ObjC, a method call like

foo(bar:1, baz:2)

Is just a way of saying something like

foo_with_bar_with_baz(1, 2)

More precisely, your example (roughly) translates to the following
snippet of Smalltalk:

temp := Dictionary new.
temp at: #bar put: 1.   "No dictionary literals in Smalltalk"

foo: temp baz: 2.

i.e. in Ruby:

send(:'foo:baz:', { :bar => 1 }, 2)

A better example would be

foo(1, bar: 2, baz: 3)

which translates to

foo: 1 bar: 2 baz: 3.

or

send(:'foo:bar:baz:', 1, 2, 3)
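
The send form can be tried in plain Ruby, since define_method accepts any symbol as a method name. This is only a sketch of the naming idea, with a made-up class Demo; it says nothing about how MacRuby implements selectors internally.

```ruby
class Demo
  # A method name containing colons cannot be written with `def`,
  # but define_method accepts any symbol.
  define_method(:'foo:bar:baz:') { |a, b, c| [a, b, c] }
end

demo = Demo.new
demo.send(:'foo:bar:baz:', 1, 2, 3)     # plain positional arguments
demo.respond_to?(:'foo:baz:bar:')       # a different name, so a different method
```

The second line shows why reordering the keywords matters: the order is part of the method's name.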

The method name and the argument names actually resolve to different
method bodies internally. In addition, the order is absolutely
crucial. The following two calls are not the same and will not end up
in the same target method:

foo(bar: 1, baz: 2)
foo(baz: 2, bar: 1)

Correct.

The only reason why you can say

condition ifTrue: [doThis] ifFalse: [doThat].
condition ifFalse: [doThat] ifTrue: [doThis].
condition ifTrue: [doThis]; ifFalse: [doThat].

is because TrueClass and FalseClass define four methods

ifTrue:ifFalse:
ifFalse:ifTrue:
ifTrue:
ifFalse:

With named arguments (and optional arguments), you would have just one
method:

def if(then=->{}, else=->{})
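
The line above is pseudocode (then and else are reserved words and cannot be parameter names). A runnable sketch of the same idea, using the keyword-argument syntax that Ruby gained later in 2.0 and the invented names choose, if_true and if_false:

```ruby
# Hypothetical helper: one method covers all four Smalltalk selectors.
def choose(condition, if_true: -> {}, if_false: -> {})
  condition ? if_true.call : if_false.call
end

choose(true,  if_true: -> { :yes }, if_false: -> { :no })   # both branches
choose(false, if_false: -> { :no }, if_true: -> { :yes })   # any order
choose(true,  if_false: -> { :no })                         # one branch only
```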

jwm

tedforte · February 1, 2011, 4:19pm

Gary W. wrote in post #978791:

On Jan 31, 2011, at 2:47 PM, Charles Oliver N. wrote:

On Mon, Jan 31, 2011 at 1:27 PM, Thomas E Enebo [email protected] wrote:

On Mon, Jan 31, 2011 at 9:00 AM, Gary W. [email protected] wrote:

snip

Seems like you can get pretty far though with just a little
meta-programming with no special language support:

snip

That’s neat. Thanks!

tedforte · February 1, 2011, 4:36pm

On Mon, Jan 31, 2011 at 9:10 PM, Gary W. [email protected] wrote:

Maybe I’m overlooking something but I wasn’t suggesting that you don’t need to
calculate arity but instead was pointing out that you can’t just calculate it at
parse time but sometimes need to calculate it at call time.

foo(1,2) # the call arity can be computed at parse time
foo(*a) # the call arity must be computed at call time

So in an ‘overload-based-on-arity’ scheme there might be some optimization
opportunities for some call sites but not for every call site.

Yeah, that is certainly true in the general sense even in existing
Ruby semantics. In your first example, we know that it is always a
two-arg call (at parse time) so we can go through a two-arity call
path and not ‘box’ the call parameters into an array. In the second
case we might be able to know arity at parse time if we know ‘a’ is a
literal array. If we don’t then the optimizations we can do get
more limited.

You are correct that adding arity to the mix would make the
optimizations more complex, but I think it depends on the case and
what you get by supporting it. For example, in the second case if we
made our callsite cache look up all the methods named foo and cache
them at the site, then dispatch to the appropriate one, we would have
quite a bit more complicated callsite cache (since we would need to
invalidate it if any arity version changed), but this would dispatch
much faster than doing things like you are showing in the example
later in your email (actually much faster than any pure-Ruby logic for
arity resolution). In the case where there was only one arity it
would behave more or less like it does currently. The main change
would be that we would need to invalidate on any same-named method.
[Note: This is just one way this could be done, and callsite
invalidation would be about the same as what it is now since it would
invalidate based on name. We could do it name+arity. We could do it
both ways even (perhaps name+arity for first case and only name for
second).]

With all this said, it seems like a good idea to me, but OTOH no
feature is without its caveats. I am more worried about Ruby
programming style than performance, and though it seems to pass a smell
test for me… I don’t think I am in the ‘yeah, let’s do it’ camp yet.
I am probably in the ‘someone should play with this’ camp.

tedforte · February 1, 2011, 4:10am

On Jan 31, 2011, at 2:47 PM, Charles Oliver N. wrote:

On Mon, Jan 31, 2011 at 1:27 PM, Thomas E Enebo [email protected] wrote:

On Mon, Jan 31, 2011 at 9:00 AM, Gary W. [email protected] wrote:

I think the original poster provided an example of overloading based on the
number of parameters but not their type.
Even restricting yourself to overloading by arity is a bit problematic in Ruby
because the arity still has to be determined (in some cases) dynamically:

Actually, arity of callsite is always calculated in Ruby to know if
you should throw an ArgumentError (3 for 0 specified sort of errors)

Maybe I’m overlooking something but I wasn’t suggesting that you don’t
need to calculate arity but instead was pointing out that you can’t just
calculate it at parse time but sometimes need to calculate it at call
time.

foo(1,2) # the call arity can be computed at parse time
foo(*a) # the call arity must be computed at call time

So in an ‘overload-based-on-arity’ scheme there might be some
optimization opportunities for some call sites but not for every call
site.

Seems like you can get pretty far though with just a little
meta-programming with no special language support:

module Arity
  # Defines a front-end method `base` that forwards to base_0, base_1, ...
  # depending on how many arguments it was called with.
  def overload(base)
    define_method(base) do |*args|
      argc = args.size
      method_name = "#{base}_#{argc}"
      if respond_to?(method_name)
        send(method_name, *args)
      else
        raise ArgumentError, "wrong number of arguments (no method for #{argc} arguments)"
      end
    end
  end
end

class A
  extend Arity

  overload :foo

  def foo_0
    puts "foo with no arguments"
  end

  def foo_1(arg1)
    puts "foo with 1 argument: #{arg1.inspect}"
  end

end

A.new.foo # dispatches to A#foo_0
A.new.foo(1) # dispatches to A#foo_1
A.new.foo(1,2) # ArgumentError

Gary W.