Bizarre Range behavior

Can someone please explain this behavior in ruby (1.8.6p111):

("2"..."8").to_a
=> ["2", "3", "4", "5", "6", "7"]

("2".."8").to_a
=> ["2", "3", "4", "5", "6", "7", "8"]

("2".."9").to_a
=> ["2", "3", "4", "5", "6", "7", "8", "9"]

("2".."10").to_a
=> []

("2".."11").to_a
=> []

("1".."11").to_a
=> ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11"]

Cheers,
Scott

On Aug 4, 1:47 pm, Scott B. wrote:

("2".."11").to_a
=> []
("1".."11").to_a
=> ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11"]

It gets better.

>> ("100".."11").to_a
=> ["100"]

It seems you’re running into not so much strange Range behavior as
strange String behavior in certain numeric circumstances. Or maybe a
combination of strange Range and String behavior. If you want the
ranges to make more sense, use actual numbers.

>> (2..11).to_a
=> [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

If you want strings in the result, you can get that with a little bit
of work.

>> (2..11).to_a.map { |x| x.to_s }
=> ["2", "3", "4", "5", "6", "7", "8", "9", "10", "11"]

On Wed, 5 Aug 2009, Scott B. wrote:

("2".."11").to_a
=> []

("1".."11").to_a
=> ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11"]

It’s because you’re using strings: "11" comes before "2", hence the
failure. It’s an invalid range, just as (11..2) is invalid.

Matt

Matt, that doesn’t explain why "1".."11" works and "2".."11" doesn’t
work.

Scott

Ah, I should clarify that. If Ruby interprets "11" as the integer 11
for "1".."11", then why doesn’t it do the same for "2".."11"?

Scott

On Wed, 5 Aug 2009, Scott B. wrote:

Matt, that doesn’t explain why "1".."11" works and "2".."11" doesn’t
work.

irb(main):015:0> "1" < "11"
=> true
irb(main):016:0> "2" < "11"
=> false

irb(main):021:0> "11" < "2"
=> true

This is true because the range is built by comparing strings: Ruby
compares the two strings character by character and stops at the first
position where they differ, so "2" vs. "11" is decided by "2" > "1".
If one string is a prefix of the other, the shorter one sorts first,
which is why "1" < "11".
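
To make the difference concrete, here is a minimal sketch that puts the string comparison next to the integer comparison for the endpoints discussed in this thread. The names `pairs` and `rows` are only illustrative.

```ruby
# String#<=> compares character by character; Integer#<=> compares magnitude.
pairs = [["1", "11"], ["2", "11"], ["100", "11"]]

rows = pairs.map do |a, b|
  { :pair => [a, b], :as_strings => (a <=> b), :as_integers => (a.to_i <=> b.to_i) }
end

rows.each do |row|
  puts "#{row[:pair].inspect}: strings #{row[:as_strings]}, integers #{row[:as_integers]}"
end
```

A result of -1 means the left value sorts first and 1 means it sorts last, so "2" sorts after "11" as a string even though 2 is less than 11 as an integer.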

Try this for an example of how the expansion is occurring:

("a".."cat").to_a

(I’m only putting a portion of it here)

=> ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n",
"o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "aa", "ab",
...
"caa", "cab", "cac", "cad", "cae", "caf", "cag", "cah", "cai", "caj",
"cak", "cal", "cam", "can", "cao", "cap", "caq", "car", "cas", "cat"]

The expansion works through strings of length 1 first, then strings of
length 2, and so on. Here’s another example (with an attempt at an
explanation):

irb(main):019:0> ("11".."2").to_a
=> ["11"]

As we’ve seen before, "11" < "2", so it’s a part of the range, but then
it stops: the next value, "12", is longer than "2", so we’re done.

Matt

On Aug 4, 2009, at 3:15 PM, Matthew K. Williams wrote:

It’s because you’re using strings: "11" comes before "2", hence
the failure. It’s an invalid range, just as (11..2) is invalid.

Matt

Well, it certainly isn’t invalid. You can easily have a Range where
the end is less than the begin value.

irb> r = 3..-1
=> 3..-1
irb> r.to_a
=> []
irb> "hello"[r]
=> "lo"

-Rob

Rob B. http://agileconsultingllc.com

On Wed, 5 Aug 2009, Rob B. wrote:

It gets better.

("100".."11").to_a
=> ["100"]

Now, that one is odd. I’d have predicted a result of:
=> ["100", "101", "102", "103", "104", "105", "106", "107", "108", "109"]
on the basis of starting with "100" and applying #succ until the value
was > "11".

It’s doing a comparison of the strings, and it has to do with the
length of the string. "100" is longer than "11", but it also happens to
compare as less, character by character (and, based on #succ, it’s "less").

In order to find the range, it’s going to compare the two strings:

  • it compares the strings to determine whether the beginning is less
    than the end

  • it then uses #succ to try to expand the range, but since "100" has
    more characters than "11", it stops…

Hope I’ve not muddied it too much…

Matt

On Aug 4, 2009, at 3:04 PM, Yossef M. wrote:

("2".."11").to_a
=> []
("1".."11").to_a
=> ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11"]

Well, you need to think about String#succ when the Range endpoints are
String.
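
As a quick illustration of what String#succ does at the boundaries that matter for this thread, here is a small sketch. The `samples` and `successors` names are only illustrative.

```ruby
# String#succ increments the rightmost alphanumeric character and carries
# to the left, growing the string when the leftmost character overflows.
samples = ["8", "9", "19", "99", "z", "az", "zz"]

successors = samples.map { |s| [s, s.succ] }

successors.each { |s, nxt| puts "#{s.inspect}.succ => #{nxt.inspect}" }
```

The carry is what makes "9".succ produce the longer string "10", which is where string ordering and succ ordering stop agreeing.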

It gets better.

("100".."11").to_a
=> ["100"]

Now, that one is odd. I’d have predicted a result of:
=> ["100", "101", "102", "103", "104", "105", "106", "107", "108", "109"]
on the basis of starting with "100" and applying #succ until the value
was > "11", like this loop does:

a = []
v = "100"
loop do
  break if v > "11"
  a << v
  v = v.succ
end
p a

This loop produced the "right" result for "2".."11" (namely an empty
array) so the actual result defies (my) explanation.

If you want strings in the result, you can get that with a little bit
of work.

(2..11).to_a.map { |x| x.to_s }
=> ["2", "3", "4", "5", "6", "7", "8", "9", "10", "11"]


-yossef

Of course, you can also do things like:

("a".."ah").to_a
=> ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m",
"n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "aa",
"ab", "ac", "ad", "ae", "af", "ag", "ah"]

Which might help label your spreadsheet columns.
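
For the spreadsheet case, a minimal sketch using uppercase endpoints could look like this. The `labels` name is illustrative, and the endpoint "AH" is just the uppercase form of the example above.

```ruby
# Spreadsheet-style column labels: "A" through "Z", then "AA", "AB", ...
labels = ("A".."AH").to_a

puts labels.length          # number of columns covered
puts labels[25..27].inspect # the labels around the "Z" to "AA" rollover
```

This works because "Z".succ carries over to "AA", the same way "z".succ gives "aa".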

-Rob


On Wed, 5 Aug 2009, Rob B. wrote:

Well, it certainly isn’t invalid. You can easily have a Range where the end
is less than the begin value.

irb> r = 3..-1
=> 3..-1
irb> r.to_a
=> []
irb> "hello"[r]
=> "lo"

I guess the code for substring treats it differently than #to_a – just
taking the bounds. Huh. That’s pretty interesting. Learn something
every day. Makes sense when I stop to think about it, though.

Just don’t try "hello"[3,-1]…
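
The three cases can be put side by side in a short sketch. The variable names are illustrative only.

```ruby
r = 3..-1

as_array  = r.to_a          # nothing to enumerate: 3 is already past -1
as_slice  = "hello"[r]      # the bounds are used as indices; -1 is the last character
as_length = "hello"[3, -1]  # the second argument is a length, and a negative length gives nil

p as_array
p as_slice
p as_length
```

So the same Range object is empty when enumerated but still meaningful as a pair of bounds, while the two-argument form treats -1 as a length rather than an index.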

I need to read the rdocs more often…
Matt

On Aug 4, 2009, at 3:45 PM, Matthew K. Williams wrote:

  • It then uses #succ to try to expand the range, but since "100" has
    more characters than "11", it stops…

Hope I’ve not muddied it too much…

Matt

Well, Range#to_a is actually Enumerable#to_a, which uses Range#each,
defined in range.c.

After checking that the beginning of the range responds to :succ, and
whether it is a Fixnum (which is special-cased), it finds that
Range#begin is a String:

    else if (TYPE(beg) == T_STRING) {
        VALUE args[5];
        long iter[2];

        args[0] = beg;
        args[1] = end;
        args[2] = range;
        iter[0] = 1;
        iter[1] = 1;
        rb_iterate(str_step, (VALUE)args, step_i, (VALUE)iter);
    }

str_step calls rb_str_upto defined in string.c

VALUE
rb_str_upto(VALUE beg, VALUE end, int excl)
{
    VALUE current, after_end;
    ID succ = rb_intern("succ");
    int n;

    StringValue(end);
    n = rb_str_cmp(beg, end);
    if (n > 0 || (excl && n == 0)) return beg;
    after_end = rb_funcall(end, succ, 0, 0);
    current = beg;
    while (!rb_str_equal(current, after_end)) {
        rb_yield(current);
        if (!excl && rb_str_equal(current, end)) break;
        current = rb_funcall(current, succ, 0, 0);
        StringValue(current);
        if (excl && rb_str_equal(current, end)) break;
        StringValue(current);
        if (RSTRING_LEN(current) > RSTRING_LEN(end) || RSTRING_LEN(current) == 0)
            break;
    }

    return beg;
}

Now, not having read a lot of Ruby’s C code, I’m not sure what some
bits are for (like calling StringValue(current) so much), but it does
ultimately behave almost like Matt said. The difference being that
the rb_yield(current) has already happened once before the length
check (RSTRING_LEN(current) > RSTRING_LEN(end)). I think the
RSTRING_LEN(current)==0 is there to catch "".succ == "", but that just
means that (""..any).to_a is [""] and yet ("".."").to_a is [] (because
after_end will be "" and the loop is never entered).
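
To experiment with this logic without a 1.8.6 interpreter, the C loop quoted above can be transliterated into plain Ruby. This is only a sketch under assumptions: `legacy_str_upto` is a hypothetical helper name, it collects values into an array instead of yielding, and it uses String#size where the C code uses RSTRING_LEN, which matches only for single-byte characters.

```ruby
# Hypothetical helper: a pure-Ruby sketch of the rb_str_upto loop quoted above.
# It returns the values that the C code would yield.
def legacy_str_upto(first, last, exclusive = false)
  result = []
  n = first <=> last
  # Begin sorts after end (or equals it in an exclusive range): yield nothing.
  return result if n > 0 || (exclusive && n == 0)

  after_end = last.succ
  current = first
  until current == after_end
    result << current # the value is yielded before any length check
    break if !exclusive && current == last
    current = current.succ
    break if exclusive && current == last
    # Stop once succ has produced a longer string than the end (or an empty one).
    break if current.size > last.size || current.empty?
  end
  result
end

p legacy_str_upto("2", "11")       # begin sorts after end
p legacy_str_upto("1", "11")
p legacy_str_upto("100", "11")     # first value yielded, then the length check stops it
p legacy_str_upto("2", "20", true) # exclusive range
```

The third call shows the point made above: "100" is collected once before the length check ends the loop.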

So it’s the odd situation that String is given some special treatment
and has the unusual property that there are strings a,b such that:
a < b && a.length > b.length

Knowing this, here’s an even more bizarre-looking example:

irb> "19".succ
=> "20"
irb> ("2".."19").to_a
=> []
irb> ("2"..."20").to_a
=> ["2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16", "17", "18", "19"]

-Rob


Yukihiro M. writes:

Any opinion? I already made a patch for trunk.

I vote against. If people want numeric ranges, it’s their job to use
numerics, not magically convert stringy numbers into actual numbers.
This isn’t Perl after all.

Regards,

Dan

Yukihiro M. wrote:

What if I sprinkle more magic to the language and change String#upto
to generate numerical sequences when all characters in edges are
digits, so that

irb> ("2".."19").to_a
=> ["2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16", "17", "18", "19"]
irb> ("2"..."20").to_a
=> ["2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16", "17", "18", "19", "20"]

Any opinion?

-1 for added complexity with little benefit

2009/8/5 Brian C.:

Any opinion?

-1 for added complexity with little benefit

I’m also against this. I prefer explicit type conversions here: changing
behaviour because a string happens to look like a number will likely
cause more problems than it solves. In fact this kind of thing shows up
in JavaScript and it usually masks bugs where the developer has failed
to properly handle user input.

If this change were to go ahead, I’d also argue for changing String#+ to
recognise numbers, which might also mean changing Numeric#+ for
symmetry.

Hi,

In message “Re: Bizarre Range behavior”
on Wed, 5 Aug 2009 05:32:37 +0900, Rob B. writes:

|Knowing this, here’s an even more bizarre-looking example:
|
|irb> "19".succ
|=> "20"
|irb> ("2".."19").to_a
|=> []
|irb> ("2"..."20").to_a
|=> ["2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
|"14", "15", "16", "17", "18", "19"]

What if I sprinkle more magic to the language and change String#upto
to generate numerical sequences when all characters in edges are
digits, so that

irb> ("2".."19").to_a
=> ["2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16", "17", "18", "19"]
irb> ("2"..."20").to_a
=> ["2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16", "17", "18", "19", "20"]

Any opinion? I already made a patch for trunk.

          matz.

2009/8/5 Daniel B.:

Any opinion? I already made a patch for trunk.

I vote against. If people want numeric ranges, it’s their job to use
numerics, not magically convert stringy numbers into actual numbers. This
isn’t Perl after all.

I strongly agree. Typing .to_i isn’t too hard and it makes clear what
is intended.
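
A minimal sketch of that explicit conversion, wrapped in a hypothetical helper named `numeric_string_range` (the name and the use of Kernel#Integer with base 10 are assumptions made for illustration, not something proposed in this thread):

```ruby
# Hypothetical helper: convert the endpoints to integers, build a numeric
# range, then convert each value back to a string.
# Integer(str, 10) raises ArgumentError for input that is not a decimal number.
def numeric_string_range(first, last)
  (Integer(first, 10)..Integer(last, 10)).map { |n| n.to_s }
end

result = numeric_string_range("2", "11")
p result
```

Using Integer() rather than String#to_i is a design choice here: to_i silently turns "abc" into 0, while Integer() fails loudly on malformed input.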

Kind regards

robert

Hi,

In message “Re: Bizarre Range behavior”
on Wed, 5 Aug 2009 20:14:45 +0900, Brian C. writes:

|Yukihiro M. wrote:
|> What if I sprinkle more magic to the language and change String#upto
|> to generate numerical sequences when all characters in edges are
|> digits,

|-1 for added complexity with little benefit

I admit it’s a Perl-ish dark art, but String#upto and String#succ,
which a range of strings calls, already have astonishingly Perl-ish
complexity (and I don’t think we can remove the magic), and I prefer
hard-boiled magic to the current half-cooked one. Actually, I think this
magic is the one I should have invented and added years ago.

          matz.

Hi –

If you make this change, how would you then accomplish the old
version? In other words, if you wanted:

"2".."19"

to obey ASCII/character code logic, would there still be a way?

That would be my concern: that right now you can do either (even
though one is a little harder), but the change might make it
impossible (??) to do the one we can do now.

David

On Aug 5, 2009, at 1:27 AM, Daniel B. wrote:

Yukihiro M. writes:

Any opinion? I already made a patch for trunk.

I vote against. If people want numeric ranges, it’s their job to use
numerics, not magically convert stringy numbers into actual numbers.
This isn’t Perl after all.

I agree. I’m against the change.

James Edward G. II

On Aug 5, 2009, at 12:55 AM, Yukihiro M. wrote:

irb> ("2".."19").to_a
=> ["2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16", "17", "18", "19"]
irb> ("2"..."20").to_a
=> ["2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16", "17", "18", "19", "20"]

Any opinion? I already made a patch for trunk.

          matz.

I’d actually prefer less magic here. (And I hope that your second
example with "2"..."20" wouldn’t include "20" since the range excludes
its end.)

The surprising aspect is that Range#to_a can give an array that
has just the #begin. String#upto perhaps needs more magic to know that
#succ will eventually work ("9".upto("10") as well as
"cat".upto("bird")).

-Rob
