Forum: Ruby dirty ranges

unknown (Guest)
on 2006-02-20 14:55
(Received via mailing list)
I'm a new Ruby user (currently at page 68 of Programming Ruby !) and
having found something weird, I wonder if either a) "you have already
found a bug, report it" or b) "yeah, yeah, we all know that this is
a bit weird, but it is not a problem in practice".

It seems that you can do destructive operations on the minimum element
of a range, but not on the maximum element (well, you can, but it
does not have any effect):

irb(main):001:0> rng="a".."z"
=> "a".."z"
irb(main):002:0> rng.min[0]="b"
=> "b"
irb(main):003:0> rng.max[0]="y"
=> "y"
irb(main):004:0> rng
=> "b".."z"

This just doesn't seem right.

Actually this was the second thing I found that doesn't seem right.
The first was that the first element is shared when you convert
a range into an array (again, the last one is different):

irb(main):005:0> arr=rng.to_a
=> ["b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n",
"o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z"]
irb(main):006:0> rng.min[0]="c"
=> "c"
irb(main):007:0> rng.max[0]="x"
=> "x"
irb(main):008:0> arr
=> ["c", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n",
"o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z"]

This is really dirty, but at least this was to be expected from the
specifications ("All you need to be able to make ranges is 'succ' and
'<=>'" -- there is no talk of a deep copy as a requirement.)
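
For anyone reproducing this on a newer interpreter: String ranges may not behave exactly as in the session above, because iteration details have differed between Ruby versions. Here is a minimal sketch that uses a hypothetical Letter class (not from this thread) providing only 'succ' and '<=>'. It assumes MRI's generic Range#each, which starts from the begin object and then calls succ:

```ruby
# Hypothetical mutable element type: only `succ` and `<=>` are defined,
# which is all Range asks for.
class Letter
  attr_accessor :char

  def initialize(char)
    @char = char
  end

  def succ
    Letter.new(char.succ)
  end

  def <=>(other)
    char <=> other.char
  end
end

def letter_range
  Letter.new("a")..Letter.new("e")
end

rng = letter_range
arr = rng.to_a

puts arr.first.equal?(rng.begin)  # is the first element the begin object?
puts arr.last.equal?(rng.end)     # is the last element the end object?

arr.first.char = "c"              # mutate the array element...
puts rng.begin.char               # ...and look at the range's begin
```

Under that assumption the first element is the stored begin object, while the last one is a fresh object produced by succ, which matches the asymmetry described above.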

Comments ?

Dirk van Deun
David Vallner (Guest)
on 2006-02-21 02:21
(Received via mailing list)
On Monday, 20 February 2006 14:53, Dirk van Deun wrote:
> => "a".."z"
> The first was that the first element is shared when you convert
> => ["c", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o",
> "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z"]
>
> This is really dirty, but at least this was to be expected from the
> specifications ("All you need to be able to make ranges is 'succ' and
> '<=>'" -- there is no talk of a deep copy as a requirement.)
>

Range code seems to enumerate, and to determine the maximum, by
generating all successors that are less than or equal to (or, for an
exclusive range, strictly less than) the Range endpoint. For these
examples to "work", it would also have to check whether the last
generated element is equal to the endpoint, and return the endpoint
object itself if it was.

That said, I think this situation is similar to the one with String
keys in a hash. Where immutable objects would be enforced otherwise,
Ruby gives you the responsibility for if you choose to change them.
Maybe this behaviour should instead be documented as with the Hash
class, if it isn't already; I can't imagine where this could only be
worked around with a noticeable kludge.
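
One detail worth checking on the Hash side: for String keys specifically, Hash#[]= stores a frozen copy of an unfrozen key, so later edits to the caller's string do not reach the stored key. A short sketch (the method and variable names are only illustrative):

```ruby
def store_with_string_key(key)
  table = {}
  table[key] = :value   # an unfrozen String key is copied and frozen here
  table
end

key = String.new("name")
table = store_with_string_key(key)
stored = table.keys.first

puts stored.frozen?       # the stored key is frozen
puts stored.equal?(key)   # false: it is not the caller's object
key << "!"                # mutating the original string...
puts table.key?("name")   # ...leaves the entry reachable under "name"
```

Other mutable keys, such as Arrays, are stored as they are, and the Hash documentation asks you to call rehash after changing them. Range endpoints are closer to that second case.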

David Vallner
George Ogata (Guest)
on 2006-02-21 04:40
(Received via mailing list)
dvandeun@vub.spam-me-not.ac.be (Dirk van Deun) writes:

> irb(main):004:0> rng
> => "b".."z"

You can modify the endpoints if you reference them using #begin and
#end:

irb(main):001:0> r = 'a'..'z'
=> "a".."z"
irb(main):002:0> r.begin << '!'
=> "a!"
irb(main):003:0> r.end << '!'
=> "z!"
irb(main):004:0> r
=> "a!".."z!"

#begin and #end are direct accessors to the endpoint objects, whereas
#max is calculated, taking into account end-closedness.  It should not
be surprising, then, that #max and #end return different objects.
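
A small sketch of the accessor/calculated distinction, using an exclusive range so that the difference shows up in the values and not only in object identity (endpoint_report is a made-up helper name):

```ruby
# Collect what the accessors and the calculated maximum report.
def endpoint_report(range)
  { first: range.begin, last: range.end, max: range.max }
end

closed = endpoint_report("a".."e")
open   = endpoint_report("a"..."e")

puts closed[:last], closed[:max]   # "e" and "e"
puts open[:last], open[:max]       # "e" and "d"
```

#end reports the stored endpoint either way; #max has to account for the excluded end.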

But I suggest not modifying range endpoints at all; Ranges themselves
are immutable, so changing their value indirectly like that is kinda
going against the grain.  And of course, if you ever modify your #end
while iterating over the range, it's nasal demons.
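
If you would rather have Ruby enforce that advice, one option is to build the range from frozen copies of the endpoints. This is only a sketch: frozen_range and try_append are hypothetical helpers, and it assumes the endpoints respond to dup, as Strings do.

```ruby
# Build a range whose endpoints are frozen private copies.
def frozen_range(first, last)
  (first.dup.freeze)..(last.dup.freeze)
end

# Try a destructive append and report what happened.
def try_append(str)
  str << "!"
  :mutated
rescue RuntimeError   # FrozenError is a RuntimeError subclass on Ruby 2.5+
  :refused
end

r = frozen_range("a", "z")
puts try_append(r.begin)   # refused
puts r.inspect             # "a".."z"
```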
David Vallner (Guest)
on 2006-02-21 05:44
(Received via mailing list)
On Tuesday, 21 February 2006 04:38, George Ogata wrote:
> But I suggest not modifying range endpoints at all; Ranges themselves
> are immutable, so changing their value indirectly like that is kinda
> going against the grain.  And of course, if you ever modify your #end
> while iterating over the range, it's nasal demons.
>

Pffft. Doesn't even flinch.

ruby <<EOF
rng1 = ("a".."g")
rng1.each { |char|
  if char == "d"
    rng1.end[0] = "j"
  end
  puts char
}
rng2 = ("a".."j")
rng2.each { |char|
  if char == "g"
    rng2.end[0] = "d"
  end
  puts char
}
EOF

Outputs:

a
b
c
d
e
f
g
a
b
c
d
e
f
g
h
i
j


Of course, I have absolutely NO idea at all why, and don't particularly
feel like reading Ruby core source.

David Vallner
Daniel Nugent (Guest)
on 2006-02-21 07:42
(Received via mailing list)
Hrmmm.. the second example that Dirk gives is pretty ugly.  I should
think that you could simply call clone on the beginning value of the
range and get a more desirable result.  I don't know if a deep-copy
would be required, and if it were, it's only a
Marshal.load(Marshal.dump(obj)) away, right?
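
A rough sketch of the difference between the two, with a hypothetical deep_copy helper wrapping the Marshal round trip. For a plain String endpoint clone is enough; the deep copy only matters once the element itself holds other mutable objects:

```ruby
def deep_copy(obj)
  Marshal.load(Marshal.dump(obj))
end

rng = String.new("a")..String.new("z")

first_copy = rng.begin.clone   # a String is flat, so clone is enough here
first_copy << "!"
puts rng.inspect               # "a".."z", the range is untouched

nested  = [[String.new("a")], [String.new("z")]]
shallow = nested.clone         # inner arrays and strings are still shared
deep    = deep_copy(nested)    # nothing is shared with the original

shallow[0][0] << "!"
puts nested.inspect            # [["a!"], ["z"]]
puts deep.inspect              # [["a"], ["z"]]
```

Note that Marshal cannot dump everything (Procs and IO objects, for instance), so it is not a universal answer either.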
Robert Klemme (Guest)
on 2006-02-21 10:56
(Received via mailing list)
David Vallner <david@vallner.net> wrote:

> That said, I think this situation is similar to the one with String
> keys in a hash. Where immutable objects would be enforced otherwise,
> Ruby gives you the responsibility for if you choose to change them.
> Maybe this behaviour should instead be documented as with the Hash
> class, if it isn't already; I can't imagine where this could only be
> worked around with a noticeable kludge.

Having said that I can't see where modifying range members like this is
actually needed.  IMHO it's a bad idea to do so.

Kind regards

    robert
unknown (Guest)
on 2006-02-21 13:26
(Received via mailing list)
For the people who remarked (quite sensibly, of course) that you just
shouldn't do that, tinker with the endpoints of a range: it works
the other way too, of course:

irb(main):001:0> rng="a".."z"
=> "a".."z"
irb(main):002:0> arr=rng.to_a
=> ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m",
"n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z"]
irb(main):003:0> arr[0][0]="z"
=> "z"
irb(main):004:0> rng
=> "z".."z"

Something like this may be more likely to be a real problem.

The begin/end versus min/max I obviously didn't know, but then again,
if max is an ad-hoc constructed object, shouldn't min be too, if
only for symmetry ?

Hacker-Dirk likes Ruby, but Computer-Scientist-Dirk tends to be wary
of systems with irregularities like these...

Dirk van Deun
David Vallner (Guest)
on 2006-02-22 01:12
(Received via mailing list)
On Tuesday, 21 February 2006 13:23, Dirk van Deun wrote:
> => "z"
> irb(main):004:0> rng
> => "z".."z"
>

Commandment <number> of not causing obscure bugs: thou shalt not clobber
shared data without due reason.

I can't imagine why I'd model any functionality with in-place
modification of something two levels deep in a data structure I have
as input. The whole approach is bound to cause problems sooner or
later if you really don't know what you're doing; in this specific
case it's just a bit more visible because it's obvious that the
resulting Range object makes little sense.

> The begin/end versus min/max I obviously didn't know, but then again,
> if max is an ad-hoc constructed object, shouldn't min be too, if
> only for symmetry ?
>

How'd you do it? The only requirements Range places on its beginning
point are comparability and generating successors (if enumerating the
members).

No cloneability is mentioned anywhere, even if it would happen to help
in the case of Strings. For example, the problem can't appear with
Fixnums, and you can't dup or clone those - IIRC
o.object_id != o.dup.object_id must hold true for the operation to be
correct, same for clone.
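
A sketch of how a copying Range could cope with both kinds of element. copy_or_self is a hypothetical helper, and the rescue is there on the assumption that older interpreters raise TypeError for dup on immediates, while (as far as I know) Ruby 2.4 and later simply return the receiver:

```ruby
# Returns a copy when the object supports it, otherwise the object itself.
def copy_or_self(obj)
  obj.dup
rescue TypeError      # older Rubies raise here for Integer, Symbol, nil
  obj
end

str = String.new("a")
puts copy_or_self(str).equal?(str)   # false: a String yields a distinct copy
puts copy_or_self(42).equal?(42)     # true: an Integer comes back as itself
```

Either way an immediate value never gets a distinct copy, so the object_id inequality cannot be a general requirement on range elements.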

A lose / lose situation basically, but I prefer the currently used
option that puts things in my hands.

David Vallner
Tony Mobily (Guest)
on 2006-02-22 03:24
(Received via mailing list)
Hi,

I followed the range discussion. Before now, I had honestly thought:
c'mon, it's not _that_ much of a problem!
Then I saw:

> irb(main):001:0> rng="a".."z"
> => "a".."z"
> irb(main):002:0> arr=rng.to_a
> => ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m",
> "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z"]
> irb(main):003:0> arr[0][0]="z"
> => "z"
> irb(main):004:0> rng
> => "z".."z"

OUCH!!!
I agree 1000000% with this statement:

> Something like this may be more likely to be a real problem.

Oh yes.

Merc.
George Ogata (Guest)
on 2006-02-22 12:55
(Received via mailing list)
David Vallner <david@vallner.net> writes:

> rng1 = ("a".."g")
>   end
> e
> i
> j
>
>
> Of course, I have absolutely NO idea at all why, and don't particularly feel
> like reading Ruby core source.
>
> David Vallner

You're right; modifying #end is okay.  Not documented, though.  Also
note that modifying #begin can break a loop.  Whether or not this is
counterintuitive depends on the person, I guess.

g@crash:~$ irb
irb(main):001:0> r = 'a'..'z'
=> "a".."z"
irb(main):002:0> r.each{|s| s << '!'; puts s}
a!
=> "a!".."z"

This is another face of Dirk/Tony's concern.  Perhaps the first
element should be cloned for symmetry and safety.  But how: #dup,
#clone, something else...?  You'd also be adding a new requirement
that non-immediate range elements must be copyable.  Hmmm...
unknown (Guest)
on 2006-02-22 15:24
(Received via mailing list)
: This is another face of Dirk/Tony's concern.  Perhaps the first
: element should be cloned for symmetry and safety.  But how: #dup,
: #clone, something else...?  You'd also be adding a new requirement
: that non-immediate range elements must be copyable.  Hmmm...

The requirement could be weakened a bit, because you do not need to
clone range elements immediately.  The begin and end could stay
uncloned; and cloning could be delayed until a min is asked for; so
that the min would be an ad hoc calculated value like the max.

Methods like to_a and each would then need to use min and max, not
begin and end, but they probably already do.  "Safe" methods,
like ===, could use begin and end in their implementation, so that
ranges of non-copyable elements would still be possible and
useful.  (But only to be used in "safe" circumstances.)
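
To make the idea concrete, here is a sketch written as an external helper, not as a change to Range itself. CopyingRange and its methods are hypothetical names, and the sketch assumes the elements respond to dup:

```ruby
# Hypothetical helpers, not part of Ruby's Range API.
module CopyingRange
  # Ad hoc minimum: a copy made only when it is asked for.
  def self.min(range)
    first = range.min
    first && first.dup
  end

  # Traversal that hands out copies, never the stored endpoints.
  def self.each_copy(range)
    return enum_for(:each_copy, range) unless block_given?
    range.each { |element| yield element.dup }
  end
end

rng = String.new("a")..String.new("c")

CopyingRange.min(rng) << "!"    # mutates a copy only
seen = CopyingRange.each_copy(rng).map { |s| s << "?" }

puts seen.inspect   # ["a?", "b?", "c?"]
puts rng.inspect    # "a".."c"
```

begin and end are left alone here, so include? and === keep working on elements that cannot be copied.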

Of course, the weakened solution would not prevent the following
from happening, but this is really "asking for it":

irb(main):001:0> a="a"
=> "a"
irb(main):002:0> z="z"
=> "z"
irb(main):003:0> rng=a..z
=> "a".."z"
irb(main):004:0> a[0]="b"
=> "b"
irb(main):005:0> rng
=> "b".."z"

Accidents that happen indirectly via innocuous-looking to_a and each
calls would be prevented.

Dirk van Deun
George Ogata (Guest)
on 2006-02-22 18:41
(Received via mailing list)
dvandeun@vub.spam-me-not.ac.be (Dirk van Deun) writes:

> : This is another face of Dirk/Tony's concern.  Perhaps the first
> : element should be cloned for symmetry and safety.  But how: #dup,
> : #clone, something else...?  You'd also be adding a new requirement
> : that non-immediate range elements must be copyable.  Hmmm...
>
> The requirement could be weakened a bit, because you do not need to
> clone range elements immediately.  The begin and end could stay
> uncloned; and cloning could be delayed until a min is asked for; so
> that the min would be an ad hoc calculated value like the max.

That's how I figured you would do it.  One of the most common
operations to do on a Range, though, is to traverse it, so it'd be a
limited-use range if you use non-copyable objects.  Perhaps I
should've said "requirement for traversal" -- you could still use
include? and friends.