Possible bug with Range#include?

bbiker · June 9, 2007, 4:44am

Given the following range: ("A".."IV")

irb(main):001:0> ("A".."IV").include?("A") => true

irb(main):002:0> ("A".."IV").include?("I") => true

irb(main):003:0> ("A".."IV").include?("J") => false

irb(main):004:0> ("A".."IV").include?("Z") => false

irb(main):004:0> ("A".."IV").include?("AA") => true

irb(main):005:0> ("A".."IV").include?("IV") => true

Note that letters J through Z return false. It appears that comparisons
are made only against the first and last values of the range.

irb(main):001:0> "Z" <= "IV" => false

This is correct because letters J through Z come after I when
alphabetized.
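The results above can be reproduced with a plain comparison against the two endpoints. This is only a sketch: the helper name between_endpoints? is hypothetical, and it assumes (as the irb session suggests, not as a statement about Ruby's internals) that Range#include? on this Ruby version did nothing more than compare against the first and last values.

```ruby
# Hypothetical helper: true when value sorts between the two endpoints.
# It never looks at the values "inside" the range.
def between_endpoints?(range, value)
  range.begin <= value && value <= range.end
end

range = "A".."IV"

%w[A I J Z AA IV].each do |s|
  puts "#{s.inspect}: #{between_endpoints?(range, s)}"
end
```

"J" and "Z" fail the second comparison (they sort after "IV"), which matches the false results in the session.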

When the range is converted into an array:

irb(main):003:0> r = ("A".."IV") => "A".."IV"
irb(main):004:0> r_array = r.to_a
=> ["A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M",
"N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z",
"AA", "AB", "AC", "AD", "AE", "AF", "AG", "AH", "AI", "AJ", "AK", "AL", "AM",
"AN", "AO", "AP", "AQ", "AR", "AS", "AT", "AU", "AV", "AW", "AX", "AY", "AZ",
"BA", "BB", "BC", "BD", "BE", "BF", "BG", "BH", "BI", "BJ", "BK", "BL", "BM",
"BN", "BO", "BP", "BQ", "BR", "BS", "BT", "BU", "BV", "BW", "BX", "BY", "BZ",
"CA", "CB", "CC", "CD", "CE", "CF", "CG", "CH", "CI", "CJ", "CK", "CL", "CM",
"CN", "CO", "CP", "CQ", "CR", "CS", "CT", "CU", "CV", "CW", "CX", "CY", "CZ",
"DA", "DB", "DC", "DD", "DE", "DF", "DG", "DH", "DI", "DJ", "DK", "DL", "DM",
"DN", "DO", "DP", "DQ", "DR", "DS", "DT", "DU", "DV", "DW", "DX", "DY", "DZ",
"EA", "EB", "EC", "ED", "EE", "EF", "EG", "EH", "EI", "EJ", "EK", "EL", "EM",
"EN", "EO", "EP", "EQ", "ER", "ES", "ET", "EU", "EV", "EW", "EX", "EY", "EZ",
"FA", "FB", "FC", "FD", "FE", "FF", "FG", "FH", "FI", "FJ", "FK", "FL", "FM",
"FN", "FO", "FP", "FQ", "FR", "FS", "FT", "FU", "FV", "FW", "FX", "FY", "FZ",
"GA", "GB", "GC", "GD", "GE", "GF", "GG", "GH", "GI", "GJ", "GK", "GL", "GM",
"GN", "GO", "GP", "GQ", "GR", "GS", "GT", "GU", "GV", "GW", "GX", "GY", "GZ",
"HA", "HB", "HC", "HD", "HE", "HF", "HG", "HH", "HI", "HJ", "HK", "HL", "HM",
"HN", "HO", "HP", "HQ", "HR", "HS", "HT", "HU", "HV", "HW", "HX", "HY", "HZ",
"IA", "IB", "IC", "ID", "IE", "IF", "IG", "IH", "II", "IJ", "IK", "IL", "IM",
"IN", "IO", "IP", "IQ", "IR", "IS", "IT", "IU", "IV"]

irb(main):006:0> r_array.include?("J") => true
irb(main):007:0> r_array.include?("Z") => true

irb(main):008:0> r_array.sort!
=> ["A", "AA", "AB", "AC", "AD", "AE", "AF", "AG", "AH", "AI", "AJ", "AK", "AL", "AM",
"AN", "AO", "AP", "AQ", "AR", "AS", "AT", "AU", "AV", "AW", "AX", "AY", "AZ",
"B", "BA", "BB", "BC", "BD", "BE", "BF", "BG", "BH", "BI", "BJ", "BK", "BL", "BM",
"BN", "BO", "BP", "BQ", "BR", "BS", "BT", "BU", "BV", "BW", "BX", "BY", "BZ",
"C", "CA", "CB", "CC", "CD", "CE", "CF", "CG", "CH", "CI", "CJ", "CK", "CL", "CM",
"CN", "CO", "CP", "CQ", "CR", "CS", "CT", "CU", "CV", "CW", "CX", "CY", "CZ",
"D", "DA", "DB", "DC", "DD", "DE", "DF", "DG", "DH", "DI", "DJ", "DK", "DL", "DM",
"DN", "DO", "DP", "DQ", "DR", "DS", "DT", "DU", "DV", "DW", "DX", "DY", "DZ",
"E", "EA", "EB", "EC", "ED", "EE", "EF", "EG", "EH", "EI", "EJ", "EK", "EL", "EM",
"EN", "EO", "EP", "EQ", "ER", "ES", "ET", "EU", "EV", "EW", "EX", "EY", "EZ",
"F", "FA", "FB", "FC", "FD", "FE", "FF", "FG", "FH", "FI", "FJ", "FK", "FL", "FM",
"FN", "FO", "FP", "FQ", "FR", "FS", "FT", "FU", "FV", "FW", "FX", "FY", "FZ",
"G", "GA", "GB", "GC", "GD", "GE", "GF", "GG", "GH", "GI", "GJ", "GK", "GL", "GM",
"GN", "GO", "GP", "GQ", "GR", "GS", "GT", "GU", "GV", "GW", "GX", "GY", "GZ",
"H", "HA", "HB", "HC", "HD", "HE", "HF", "HG", "HH", "HI", "HJ", "HK", "HL", "HM",
"HN", "HO", "HP", "HQ", "HR", "HS", "HT", "HU", "HV", "HW", "HX", "HY", "HZ",
"I", "IA", "IB", "IC", "ID", "IE", "IF", "IG", "IH", "II", "IJ", "IK", "IL", "IM",
"IN", "IO", "IP", "IQ", "IR", "IS", "IT", "IU", "IV",
"J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z"]

Note that when the array is sorted, the letters J through Z wind up at
the end of the array and not in succession order.

If this is in fact a bug, please forward this to the appropriate
mailing list. I am a newbie and have no idea how or where to submit
this.

Thank You


bbiker · June 9, 2007, 4:22pm

Hi,
definitely NOT a bug
read this plz:
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/168231

bbiker · June 9, 2007, 4:35pm

On Sat, Jun 09, 2007 at 11:20:16PM +0900, charon wrote:

Hi,
definitely NOT a bug
read this plz: http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/168231

Well, it not being a bug is arguable. It’s definitely not an
unintentional bug.

bbiker · June 9, 2007, 9:05pm

On Jun 9, 2007, at 10:34 AM, Logan C. wrote:

On Sat, Jun 09, 2007 at 11:20:16PM +0900, charon wrote:

Hi,
definitely NOT a bug
read this plz: http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/168231

Well, it not being a bug is arguable. It’s definitely not an
unintentional bug.

No, no argument at all. There is however conflicting information in
the Pickaxe that describes Range#member? as having different behavior
than #include? (although ri Range#member? shows that #===, #include?
and #member? are the same).

If the OP wants behavior like the Pickaxe (now incorrectly)
describes, you can always do:

class Range
  def member?(item)
    any? { |x| item == x }
  end
end

irb> ('A'..'IV').member?('J')
=> true
irb> (1..10).include?(5.5)
=> true
irb> (1..10).member?(5.5)
=> false

Although the performance for testing a “big” Range and an item close
to the #end might be undesirable.

Of course, that does beg the question of whether #include? (#===) or
#member? is used by libraries. Is there a common way to note which
of a set of names is used when there are aliases? It seems like if
one were to treat #begin, #include?, and #end as a set, then #first,
#member?, and #last would be a similar set (currently all aliases)
and a redefinition of #member? could be reasonably expected to
redefine #last to be the final value returned by #each. For a Range
where #exclude_end? is true, #last != #end would be justifiable.
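To make the #end versus "final value returned by #each" distinction concrete, here is a small sketch with an exclusive range. How #last itself behaves depends on the Ruby version, so the sketch deliberately avoids it and uses to_a.last as a stand-in for the last value that #each yields.

```ruby
r = (1...10)                 # three dots: the end value is excluded

puts r.exclude_end?          # the range knows its end is excluded
puts r.end                   # the endpoint exactly as written
puts r.to_a.last             # the final value actually yielded by #each
```

Here the endpoint is 10 while the last enumerated value is 9, which is the gap a redefined #last could reasonably expose.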

-Rob

Rob B. http://agileconsultingllc.com

bbiker · June 10, 2007, 11:52pm

Hi –

On Mon, 11 Jun 2007, bbiker wrote:

I am the OP. Subsequent to my post I also found that #include? would
return true even though an item is not included (not a member) in a
range.

irb(main):039:0> ('A'..'IV').include?('A:') => true # Note A: is NOT a member of 'A'..'IV'

Now IMHO, any method that produces incorrect results is buggy,
whether intentionally or not.

It’s a matter of ASCII sorting. “A:” sorts higher than “A” and lower
than “IV”, so it’s within that range.
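A short sketch of that ordering, for anyone who wants to check it in irb. Nothing here is specific to Range: it only uses String comparison and String#bytes, plus to_a to show what stepping through the range actually produces.

```ruby
# String comparison goes byte by byte.
puts "A:".bytes.inspect        # "A" is byte 65, ":" is byte 58

puts("A" <=> "A:")             # -1: "A" is a prefix of "A:", so it sorts first
puts("A:" <=> "IV")            # -1: "A" (65) sorts before "I" (73)

# Stepping from "A" to "IV" with String#succ never produces "A:".
enumerated = ("A".."IV").to_a
puts enumerated.include?("A:") # false
puts enumerated.size           # 256
```

So "A:" lies between the endpoints when compared as strings, yet it is not one of the 256 values the range steps through.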

Ranges can be converted into arrays (a little too easily, I sometimes
think), but fundamentally they’re not collections. So it’s possible
for this to be true:

range.include?(val)

while this isn’t true:

range.to_a.include?(val)

There are even ranges that can’t be represented as arrays at all
(ranges between floats, for example), and they still have the concept
of inclusion. The array thing is really just a convenience, offered
where possible but not meant to override the basic idea of the range.
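The float case can be sketched as follows. It uses Range#cover?, which was added to Ruby after this thread and is assumed here only as a convenient way to spell "compare against the endpoints"; the point is that inclusion works while conversion to an array does not.

```ruby
floats = (1.0..2.0)

# Inclusion is a meaningful question even though the range cannot be enumerated.
puts floats.cover?(1.5)

begin
  floats.to_a
rescue TypeError => e
  # Floats have no successor method, so the range cannot be stepped through.
  puts "cannot convert to an array: #{e.message}"
end
```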

David

bbiker · June 11, 2007, 2:49am

On Jun 10, 2007, at 5:40 PM, bbiker wrote:

My question is: which solution is better in terms of memory usage and
execution time?

Thank You.


Since the #any? method is provided by Enumerable, it uses the #each
method supplied by the underlying class (i.e., Range), so it does not
build an array first. I’d expect that your redefinition of
Range#include? is equivalent to the use of any? with a block that
evaluates x == item, as both will return as soon as a value matches.
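One way to convince yourself that any? walks the range through #each and stops early, rather than collecting everything first, is to try it on a range with no last element. This is a sketch, not a description of the implementation; the variable names are made up for the example.

```ruby
# If any? built an array of all elements first, this would never return,
# because the range has no upper bound.
unbounded = (1..Float::INFINITY)

seen = []
found = unbounded.any? do |x|
  seen << x        # record each element the block is handed
  x == 5
end

puts found
puts seen.inspect  # only the elements up to the first match
```

The block is called for 1 through 5 and then any? returns, so nothing beyond the match is ever generated.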

It does go directly to the question of behavior. Like David Black
said, Ranges are fundamentally not collections even though some kinds
of Ranges are easily converted to an Array.

-Rob

Rob B. http://agileconsultingllc.com

bbiker · June 10, 2007, 11:41pm

On Jun 9, 3:03 pm, Rob B. wrote:


I am the OP. Subsequent to my post I also found that #include? would
return true even though an item is not included (not a member) in a
range.

irb(main):039:0> ('A'..'IV').include?('A:') => true # Note A: is NOT a member of 'A'..'IV'

Now IMHO, any method that produces incorrect results is buggy,
whether intentionally or not.
I do not believe that anyone intentionally writes buggy code.

As I needed to find a reliable method to determine whether an item is
included in a range object, I tried several schemes none of which were
reliable. Finally, I tried the following:

rng = 'A'..'IV'
rng_arr = rng.to_a
rng_arr.include?('Z')

Now the last statement returns the correct answer.

Additional tests showed that it produces consistently correct answers.

But I felt that it was very inefficient in memory usage and execution.

So I redefined Range#include?. Prior to posting this message,
I read Rob B…

Here is my implementation.

class Range

  # my solution
  def include?(t)
    inc = false
    self.each do |m|
      if m == t
        inc = true
        break
      end
    end
    inc
  end

  # Rob B.'s solution
  def member?(item)
    any? { |x| item == x }
  end
end

rng = "A".."IV"
t = "I5"
range = "A".."IV"

puts range.include?("BZ")  # => true
puts range.member?("BZ")   # => true

puts range.include?("IW")  # => false
puts range.member?("IW")   # => false

puts range.include?("A:")  # => false
puts range.member?("A:")   # => false

puts range.include?("K")   # => true
puts range.member?("K")    # => true

As you can see, both solutions provide correct answers.
Rob B.'s solution is obviously simpler than my solution. So it
should be the preferred solution … Occam’s Razor.
However, I do not know the ‘behind the scenes magic’ that any? does
with regard to a Range object.

C:\Documents and Settings\Owner>qri any?

Enumerable#any?
enum.any? [{|obj| block } ] => true or false

 Passes each element of the collection to the given block. The
 method returns +true+ if the block ever returns a value other than
 +false+ or +nil+. If the block is not given, Ruby adds an implicit
 block of +{|obj| obj}+ (that is +any?+ will return +true+ if at
 least one of the collection members is not +false+ or +nil+.

    %w{ ant bear cat}.any? {|word| word.length >= 3}   #=> true
    %w{ ant bear cat}.any? {|word| word.length >= 4}   #=> true
    [ nil, true, 99 ].any?                             #=> true

Does it create an array to get the collection?
If yes, then my method would be the preferred one. The worst-case
scenario is when the test item is not included in the range.
Efficiency decreases as the item’s location moves toward the far end.
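The position dependence can be measured by counting how many elements are handed to the block before any? stops. The helper name visits_until_found is hypothetical and exists only for this sketch; it counts block calls, not time or memory.

```ruby
# Hypothetical helper: number of elements #each yields before any? stops.
def visits_until_found(range, item)
  visited = 0
  range.any? do |x|
    visited += 1
    x == item
  end
  visited
end

range = "A".."IV"

puts visits_until_found(range, "A")    # 1: match on the first element
puts visits_until_found(range, "IV")   # 256: match on the last element
puts visits_until_found(range, "A:")   # 256: no match, every element is visited
```

An item near the end costs as much as an item that is absent, since both require stepping through the whole range.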

My question is: which solution is better in terms of memory usage and
execution time?

Thank You.