Opinions on RCR for dup on immutable classes


#1

I’m about to submit this RCR and would like to get some opinions on it in
order to make my decision.
The questions are:

  1. Should I split this RCR up for a) and b)?
  2. Should I only submit a single one, if so, which one?
  3. Should I submit this RCR at all?
  4. Opinions, suggestions, ideas… everything welcome :)

Abstract:
Change behaviour of dup in immutable classes.

Problem:
dup on immutable classes (NilClass, FalseClass, TrueClass, Fixnum, Symbol)
raises an Exception.

Proposal:
a) Remove the dup method from NilClass, FalseClass, TrueClass, Fixnum, Symbol
b) Let dup in NilClass, FalseClass, TrueClass, Fixnum, Symbol return self

Analysis:
This may be a minor glitch in the Ruby API, but with the advent of Ruby 2 it
might be worth changing.
a) should break existing code only in rare circumstances: dup already raises
an exception for the classes in question, so code would only break if a rescue
failed because of the changed exception class. It would restore the behaviour
that a class only implements methods it can actually execute, which also means
testing via respond_to? is possible.
b) would let the immutability of the classes be an implementation detail
(which would be consistent with the behaviour of other immutable classes,
e.g. Time). It shouldn’t break existing code, as returning self is the usual
fallback.

Implementation:
a)
[NilClass, FalseClass, TrueClass, Fixnum, Symbol].each { |klass|
  # undef_method only takes method names, so no block is passed here
  klass.send(:undef_method, :dup)
}
b)
[NilClass, FalseClass, TrueClass, Fixnum, Symbol].each { |klass|
  klass.send(:define_method, :dup) { self }
}
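
To compare the two options without patching core classes, here is a minimal sketch on two stand-in classes. NoDupToken and SelfDupToken are made-up names for illustration and are not part of Ruby or of the proposal; the sketch only shows how each option changes what respond_to? and dup report.

```ruby
# Hypothetical stand-ins for an immutable core class; the names are illustrative only.
class NoDupToken
  # Option a): remove dup so the class no longer claims to support it.
  undef_method :dup
end

class SelfDupToken
  # Option b): dup hands back the receiver itself.
  define_method(:dup) { self }
end

no_dup   = NoDupToken.new
self_dup = SelfDupToken.new

puts no_dup.respond_to?(:dup)        # false
puts self_dup.respond_to?(:dup)      # true
puts self_dup.dup.equal?(self_dup)   # true

begin
  no_dup.dup
rescue NoMethodError => e
  # With a) the failure is a missing method rather than an error raised inside dup.
  puts "dup failed with #{e.class}"
end
```

With a) a caller can test respond_to?(:dup) before copying; with b) the caller never notices that no copy was made.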

My regards


#2

On Feb 15, 4:46 pm, Stefan R. removed_email_address@domain.invalid wrote:

  1. Should I split this RCR up for a) and b)?
  2. Should I only submit a single one, if so, which one?

You should only submit one. The discussion of which choice is better
is appropriate on the mailing list, IMO. When you feel that a
consensus has been reached, submit that one as an RCR.

Personally, I would lean towards option (b).

Problem:
dup on immutable classes (NilClass, FalseClass, TrueClass, Fixnum,
Symbol)
raises an Exception.

That’s a statement of fact, but doesn’t explain why it’s a problem,
or (important for an RCR) why it needs to be fixed in the core of the
language.

What is the use case that is prevented by the problem? What does it
make inconvenient? Why should it be changed?


#3

Hi,

In message “Re: Opinions on RCR for dup on immutable classes”
on Fri, 16 Feb 2007 08:55:12 +0900, “Phrogz” removed_email_address@domain.invalid
writes:

|That’s a statement of fact, but doesn’t explain why it’s a problem,
|or (important for an RCR) why it needs to be fixed in the core of the
|language.
|
|What is the use case that is prevented by the problem? What does it
|make inconvenient? Why should it be changed?

Seconded. It’s pretty trivial for us core developers to make dup for
immutable objects return themselves, but I don’t understand why
it is needed. I assume obj.object_id != obj.dup.object_id, and see no
good enough reason to break the assumption.

          matz.

#4

On Fri, 16 Feb 2007, Gregory B. wrote:

I also don’t think something like 5.dup has semantic meaning.

unlike 42.dup, which is patently obvious.

-a


#5

On 2/16/07, Yukihiro M. removed_email_address@domain.invalid wrote:

|make inconvenient? Why should it be changed?

Seconded. It’s pretty trivial for us core developers to make dup for
immutable objects return themselves, but I don’t understand why
it is needed. I assume obj.object_id != obj.dup.object_id, and see no
good enough reason to break the assumption.

+1. If I am trying to dup an object, sometimes it’s inconvenient that
it complains, but most of the time I want to know about it.

I also don’t think something like 5.dup has semantic meaning.


#6

On 2/16/07, Stefan R. removed_email_address@domain.invalid wrote:

That’s why I have 2 suggestions. The issue arises when you don’t know
what you dup. [...] To me it seems that it should either be an
implementation detail that those classes can’t really be duped (ie.
return self) or expose that transparently (not implement dup).

a = 3
=> 3

b = a.dup rescue a
=> 3

b
=> 3

What’s so bad about that?


#7

Yukihiro M. wrote:

Hi,

In message “Re: Opinions on RCR for dup on immutable classes”
on Fri, 16 Feb 2007 08:55:12 +0900, “Phrogz” removed_email_address@domain.invalid
writes:

|That’s a statement of fact, but doesn’t explain why it’s a problem,
|or (important for an RCR) why it needs to be fixed in the core of the
|language.
|
|What is the use case that is prevented by the problem? What does it
|make inconvenient? Why should it be changed?

Seconded. It’s pretty trivial for us core developers to make dup for
immutable objects return themselves, but I don’t understand why
it is needed. I assume obj.object_id != obj.dup.object_id, and see no
good enough reason to break the assumption.

          matz.

That’s why I have 2 suggestions. The issue arises when you don’t know
what you dup.
Under the assumption that obj.object_id should always be !=
obj.dup.object_id, the first suggestion of removing dup from those
classes would be more appropriate.
To me it seems inconsistent to implement a method that only raises an
Exception. That makes it impossible to check whether your object
actually implements dup (e.g. via respond_to?). To me it seems that it
should either be an implementation detail that those classes can’t
really be duped (i.e. return self) or that fact should be exposed
transparently (not implement dup).
That of course is my personal opinion and I’m posting this on the ML to
see if it is only me :)

My regards.


#8

On 2/16/07, Gregory B. removed_email_address@domain.invalid wrote:

When you use #dup not knowing about this complication, never dup a
Fixnum in testing, but happen to dup one in production - e.g. in a
library released to the public.

Ruby is always about principle-of-least-surprise and this behavior
really surprised me when I first read about it on-list (I’ve stayed
away from object copies in MY code, because it all seems so hacky and
arcane to me… Turns out it was a good idea)


#9

On 2/16/07, Gregory B. removed_email_address@domain.invalid wrote:

What’s so bad about that?

I would ask the question: Why do you dup a?
The answer is, I presume, because I will modify the duplicate and I
do not want to modify the original.
ok so let us follow our code

b= a.dup rescue a

b.mutating_method
what will happen now?

Cheers
Robert
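
Following the code from the question above with an Integer gives a small sketch like this one. It assumes nothing beyond core Ruby: whether dup raises (as described in this thread) or simply returns the receiver, b ends up as 3, and Integer offers no in-place mutators, so later operations rebind b instead of changing a.

```ruby
a = 3
b = a.dup rescue a   # parsed as: b = (a.dup rescue a)

b += 1               # builds a new Integer and rebinds b; a is not touched

puts a               # 3
puts b               # 4
```

The risk discussed in the thread therefore does not come from immutable values themselves, but from mutable objects that also fail to dup.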


#10

Hi –

On Sat, 17 Feb 2007, Phrogz wrote:

Having said that, I personally would prefer for 3.dup to ‘just work’,
returning an object that is equivalent to the original. Just because
the new instance happens to be the same doesn’t mean that it’s bad -
as an immutable object, the only way to tell is via object_id, anyhow.

It doesn’t mean it’s bad, but it does mean that it isn’t a duplicate
:) At least, I wouldn’t think that dup is the right name for a
method that might actually return the receiver.

David


#11

From: SonOfLilit [mailto:removed_email_address@domain.invalid]
Sent: Friday, February 16, 2007 10:48 AM

Ruby is always about principle-of-least-surprise and this behavior
really surprised me when I first read about it on-list (I’ve stayed
away from object copies in MY code, because it all seems so hacky and
arcane to me… Turns out it was a good idea)

Actually, Ruby is about the principle-of-least-surprise-for-matz.

I used to think POLS applied to everyone when I started with Ruby.
Many people, including Matz, pointed out that it’s impossible to make
things unsurprising to everyone, because people will be surprised by
different things.

Just because something is surprising to you doesn’t mean that it
should be changed in Ruby. You are advised (as was I) not to use POLS
as justification for why something should be changed.

Having said that, I personally would prefer for 3.dup to ‘just work’,
returning an object that is equivalent to the original. Just because
the new instance happens to be the same doesn’t mean that it’s bad -
as an immutable object, the only way to tell is via object_id, anyhow.


#12

On Feb 16, 11:43 am, “Robert D.” removed_email_address@domain.invalid wrote:

I would ask the question: Why do you dup a?
The answer is, I presume, because I will modify the duplicate and I
do not want to modify the original.
ok so let us follow our code

b= a.dup rescue a

b.mutating_method
what will happen now?

I’m not sure what your point is, Robert. Is it:

a) See, there’s no point in duplicating an immutable object, since the
only reason to #dup is so you can mutate the instance later, and
anything you tried to do to mutate it would fail later on.

or is it

b) You should never ever write “a.dup rescue a” because it will give
you strange problems if a might refer to a mutable object that doesn’t
respond to #dup.

?


#13

On Feb 16, 11:53 am, removed_email_address@domain.invalid wrote:

On Sat, 17 Feb 2007, Phrogz wrote:


Having said that, I personally would prefer for 3.dup to ‘just work’,
returning an object that is equivalent to the original. Just because
the new instance happens to be the same doesn’t mean that it’s bad -
as an immutable object, the only way to tell is via object_id, anyhow.

It doesn’t mean it’s bad, but it does mean that it isn’t a duplicate
:) At least, I wouldn’t think that dup is the right name for a
method that might actually return the receiver.

My perspective is:
If a duplicate of an immutable object is indistinguishable from the
original, then you would have no idea if #dup was returning an
internal duplicate or the same instance. And you wouldn’t care (except
that you’d probably prefer it to conserve resources internally and
return the same instance).

The determining question for me, then, is:
How indistinguishable would two distinct instances of these immutable
objects be? The only case where code might break is if you had:
b = a.dup
h = { a.object_id => 1, b.object_id=> 2 }
and you were very unhappy if you ended up running over the key. I’ve
never personally used object_id for that purpose (or any other than
debugging), so I’m not sure how likely that is. I just use objects
themselves as hash keys, in which case equivalent instances already
behave the same (even of mutable objects!):

irb(main):001:0> a = []; b=a.dup
=> []
irb(main):002:0> p a.object_id, b.object_id
23327420
23327410
irb(main):003:0> h = {a=>1, b=>2}
=> {[]=>2}

irb(main):013:0> a,b = 1234567890,1234567890
=> [1234567890, 1234567890]
irb(main):014:0> p a.class, a.object_id, b.class, b.object_id
Bignum
23346300
Bignum
23346270
=> nil
irb(main):015:0> h = {a=>1,b=>2}
=> {1234567890=>2}
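
The same point can be written as a short script. This is only a sketch of the behaviour shown in the irb session: hash keys are matched by eql? and hash, not by object_id, so equal but distinct objects collapse into one entry unless identity is requested explicitly. Hash#compare_by_identity is available in later Ruby versions than the one used in this thread.

```ruby
a = []
b = a.dup

puts a.equal?(b)        # false: two distinct objects
puts a.eql?(b)          # true: equal content, same hash value

# Keyed by the objects themselves: b overwrites the entry stored under a.
by_value = { a => 1 }
by_value[b] = 2
puts by_value.size      # 1

# Keyed by object_id: the two objects stay apart.
by_object_id = { a.object_id => 1 }
by_object_id[b.object_id] = 2
puts by_object_id.size  # 2

# Keyed by identity without handling object_id manually.
by_identity = {}.compare_by_identity
by_identity[a] = 1
by_identity[b] = 2
puts by_identity.size   # 2
```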


#14

On Feb 16, 2007, at 1:50 PM, Phrogz wrote:

Having said that, I personally would prefer for 3.dup to ‘just work’,
returning an object that is equivalent to the original. Just because
the new instance happens to be the same doesn’t mean that it’s bad -
as an immutable object, the only way to tell is via object_id, anyhow.

I’ve come across this situation when writing generic code to do a
‘deep copy’. Instead of changing the semantics of #dup, why not have
some other method that means ‘make a copy if you can but if the object
has immutable semantics then return a reference to self’.

As for a name for such a method, how about Kernel#another ?

I’m not sure how #dup, #clone, and #another should be related. Perhaps
#another should use #clone instead of #dup? Maybe there should be
another
version of #another that uses clone semantics?

Gary W.
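
A rough sketch of what such a method could look like, written as a plain helper instead of a Kernel method. Everything here is hypothetical: the name another comes from the post above, while IMMUTABLE_CLASSES and its contents are an assumption made for illustration, not an agreed list.

```ruby
# Hypothetical helper sketching the proposed "another"; it is not part of Ruby.
# Which classes count as immutable is an assumption made for this example.
IMMUTABLE_CLASSES = [NilClass, TrueClass, FalseClass, Integer, Float, Symbol].freeze

def another(obj)
  # Values with immutable semantics are handed back unchanged.
  return obj if IMMUTABLE_CLASSES.any? { |klass| obj.is_a?(klass) }
  # Everything else gets a shallow copy via dup.
  obj.dup
end

text = "text"
copy = another(text)

puts copy.equal?(text)              # false: the String was copied
puts copy == text                   # true: same content
puts another(:name).equal?(:name)   # true: the Symbol is returned as-is
puts another(nil).nil?              # true
```

A clone-based variant would only differ in the last line of the method, which is where the question about #dup versus #clone semantics would have to be settled.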


#15

Hi –

On Sat, 17 Feb 2007, Phrogz wrote:

method that might actually return the receiver.
[...]
objects be? The only case where code might break is if you had:
[...]
irb(main):015:0> h = {a=>1,b=>2}
=> {1234567890=>2}

You could conceivably have a case where an object can’t be dup’d, but
is mutable, like a singleton class. If dup returns self, then you
could end up changing the object when you didn’t want to. It’s
probably not an everyday problem… but I still don’t like the idea of
having to remember that “dup” means “dup or self”.

David
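
One way to picture this case is the sketch below. It assumes that "singleton" refers to an instance of a class that includes the standard Singleton module, whose dup raises TypeError; that reading is an assumption, and Registry is a made-up class for illustration.

```ruby
require 'singleton'

# Hypothetical mutable object that refuses to be duplicated.
class Registry
  include Singleton
  attr_reader :entries

  def initialize
    @entries = []
  end
end

original = Registry.instance

# Singleton#dup raises TypeError, so the rescue hands back the receiver itself.
copy = original.dup rescue original

copy.entries << :added          # meant to change only the "copy"

puts copy.equal?(original)      # true: no copy was made
puts original.entries.inspect   # [:added]: the original was modified
```

The fallback hides the failed dup, so the caller ends up mutating the very object it wanted to protect.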


#16

Hi –

On Sat, 17 Feb 2007, Gary W. wrote:

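Instead of changing the semantics of #dup, why not have
some other method that means ‘make a copy if you can but if the object
has immutable semantics then return a reference to self’.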

As for a name for such a method, how about Kernel#another ?

But self isn’t other than self.

I’m not sure how #dup, #clone, and #another should be related. Perhaps
#another should use #clone instead of #dup? Maybe there should be another
version of #another that uses clone semantics?

#another_another? :)

David


#17

On 2/16/07, Phrogz removed_email_address@domain.invalid wrote:

I’m not sure what your point is, Robert. Is it:

a) See, there’s no point in duplicating an immutable object, since the
only reason to #dup is so you can mutate the instance later, and
anything you tried to do to mutate it would fail later on.

Exactly.
But I did not say that it was a bad idiom; IMHO it is just not as
useful an idiom as it might look at first.
AAMOF I was about to say at first: Hey, this idiom is great, so no RCR
is needed.
And then I thought… - hey, that can happen ;)
and I realized that these are strange cases: calling dup to protect an
object from mutation, and wanting that to work on the fly for immutable
objects too. I am sure there are use cases, but few.

My conclusion was that this is not an idiom worth a CR at all.

or is it

b) You should never ever write “a.dup rescue a” because it will give
you strange problems if a might refer to a mutable object that doesn’t
respond to #dup.

Never say never ;) but when you write
a.dup rescue a
you should be aware of that potential danger.

a.dup
will not really alert the user enough of that, I feel.

?

Strangely enough I fail to see the difference between a) and b)

Cheers
Robert


#18

On Feb 16, 12:30 pm, removed_email_address@domain.invalid wrote:

You could conceivably have a case where an object can’t be dup’d, but
is mutable, like a singleton class. If dup returns self, then you
could end up changing the object when you didn’t want to. It’s
probably not an everyday problem… but I still don’t like the idea of
having to remember that “dup” means “dup or self”.

Yes, but that would only be a problem if you did:
class Object
  alias old_dup dup
  def dup
    self.old_dup rescue self
  end
end

I would certainly not like that, either.

The OP’s proposal, however, was to modify the behavior only for a
particular set of built-in immutable objects:
[NilClass, FalseClass, TrueClass, Fixnum, Symbol]

(Though, if you ‘fix’ Fixnum like this, you’ll need to ‘fix’ Bignum
and Float and others as well to support duping.)


#19

On Sat, 17 Feb 2007 removed_email_address@domain.invalid wrote:

You could conceivably have a case where an object can’t be dup’d, but is
mutable, like a singleton class. If dup returns self, then you could end up
changing the object when you didn’t want to. It’s probably not an everyday
problem… but I still don’t like the idea of having to remember that “dup”
means “dup or self”.

you already do though?

harp:~ > cat a.rb
# crappy
h = { :point => 'we already have' }
h2 = h.dup
h2[:point] << ' that problem'
p h[:point]

# crappy
h = { :point => 'we already have' }
h2 = h.clone
h2[:point] << ' that problem'
p h[:point]

# sledge hammer
h = { :point => 'an imperfect solution' }
mcp = lambda{|obj| Marshal.load(Marshal.dump(obj))}
h2 = mcp[ h ]
h2[:point] << ' is to use marshal'
p h[:point]

harp:~ > ruby a.rb
"we already have that problem"
"we already have that problem"
"an imperfect solution"

-a


#20

On Sat, 17 Feb 2007 removed_email_address@domain.invalid wrote:

I don’t consider returning a shallow copy the same as returning self.
It’s not the same object as the receiver of the “dup” message.

you may not. neither do i. nevertheless it is quite true that one must
understand completely the rhs in

d = obj.dup

in order to use d effectively.

-a