Forum: Ruby Re: string range membership

warrenbrown (Guest)
on 2005-11-23 17:22
(Received via mailing list)
Ara,

>> ruby -v -e "p(('1'..'10').to_a)"
>> ruby 1.8.2 (2004-12-25) [i386-mswin32]
>> ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10"]
>>
>>    This shows a clear and unique mapping of the range
>> '1'..'10' into a set of strings.
>
> but where do '01', '001', and '0001' go?  they too,
> are in the set of strings.

    You completely lost me there.  '01' doesn't *go* anywhere.  That
string is not in the range '1'..'10', in the same way that 'x' is not in
the range 'a'..'n'.

    Don't let the fact that my example used strings that look like
numbers confuse the issue.  The issue is that a range of strings that
can be converted into a finite set, has a method to test for membership
in that range, that doesn't match values that are in the set.  Wow, that
sentence is even hard for *me* to follow.

    OK, let's take a different example to avoid all discussion of
integers and various string representations of them.

>ruby -v -e "p(('a'..'aa').to_a)"
ruby 1.8.2 (2004-12-25) [i386-mswin32]
["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n",
"o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "aa"]

    Here we have a string range that has 27 "members".  Now:

>ruby -e "p(('a'..'aa').member?('a'))"
true
>ruby -e "p(('a'..'aa').member?('b'))"
false
...
>ruby -e "p(('a'..'aa').member?('z'))"
false
>ruby -e "p(('a'..'aa').member?('aa'))"
true

    Can this really be called correct behavior of the member?() method?
I can't see any tenable argument to say that it is.
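
    For anyone reproducing this on a newer Ruby, here is a minimal
sketch of the same discrepancy.  It assumes Ruby 1.9 or later, where
Range#cover? exists and tests a value against the two endpoints only,
and it compares that with asking the array produced by to_a.  The
variable names are mine, chosen only for illustration.

```ruby
range = 'a'..'aa'
elements = range.to_a   # what the range enumerates via String#succ

probes = %w[a b z aa]

# Membership in the enumerated array.
enumerated = probes.map { |s| elements.include?(s) }

# Membership decided by comparing against the endpoints only.
by_endpoints = probes.map { |s| range.cover?(s) }

p elements.size   # 27
p enumerated      # [true, true, true, true]
p by_endpoints    # [true, false, false, true]
```

    The last line reproduces the true/false/false/true pattern from
the member?() calls above.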

    - Warren B.
ara.t.howard (Guest)
on 2005-11-23 18:43
(Received via mailing list)
On Thu, 24 Nov 2005, Warren B. wrote:

>    You completely lost me there.  '01' doesn't *go* anywhere.  That
> string is not in the range '1'..'10', in the same way the 'x' is not in
> the range 'a'..'n'.

says who?  ;-)  i may choose to define String#succ to do whatever i like -
including producing the values '01', '001', and '0001'.

my point is simply that you seem to be merging the notions of ranges and
sets.  the array that to_a produces from a range is determined by only a
few things:

   - the start and end points

   - the succ method of the start value and each successive succ value
     remember one could do this

       irb(main):003:0> class String; def succ; self == "1" ? 42 : super; end; end
       => nil
       irb(main):004:0> "1".succ
       => 42

   - the spaceship operator for each succ value called against the
     endpoint

because of this we cannot even safely call to_a on an arbitrary range;
for instance

   irb(main):002:0> (42.0 .. 1.0).to_a
   TypeError: can't iterate from Float
    from (irb):2:in `each'
    from (irb):2:in `to_a'
    from (irb):2
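
a minimal sketch of the three ingredients listed above, assuming a
made-up class (Tick is hypothetical, not anything in ruby itself) that
defines only succ and <=>.  that is enough for a range over it to be
enumerated, while a Float range has no succ to walk and so fails as
shown in the irb session:

```ruby
# Tick is a hypothetical class for illustration: it wraps an integer and
# defines just the two methods a range needs in order to enumerate it.
class Tick
  include Comparable
  attr_reader :n

  def initialize(n)
    @n = n
  end

  # next value in the sequence
  def succ
    Tick.new(n + 1)
  end

  # used to compare each succ value against the end point
  def <=>(other)
    n <=> other.n
  end
end

ticks = (Tick.new(1)..Tick.new(4)).to_a.map { |t| t.n }
p ticks   # [1, 2, 3, 4]

# Float defines <=> but not succ, so the range exists but cannot be walked.
float_error =
  begin
    (1.0..3.0).to_a
    nil
  rescue TypeError => e
    e.message
  end
p float_error
```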


in summary a range is nothing but a set of endpoints with some
abstract/duck-type-like methods that may or may not produce a set as a
__process__.  note that the set produced is not part of the range itself
and can be dynamically altered or even be made to produce a different set
each time:

   harp:~ > cat a.rb
   class Float
     def succ
       self + rand
     end
   end

   p((4.2 ... 42.0).to_a)

   harp:~ > ruby a.rb
   [4.2, 4.60303889967309, 5.57983848378295, 6.19446672151043,
6.92731328072508, 7.40446684874589, 7.79202463038348, 8.67552806421286,
9.42821837951244, 10.1988047216007, 11.1116769865281, 11.6169205995556,
11.9975653524073, 12.2256247650959, 12.8874200335378, 13.1557666607712,
13.6470070004444, 14.2172959192607, 15.0882979655236, 15.3487930162798,
15.9791460692026, 16.4321713791994, 17.0903318945661, 17.2967949864209,
18.2400722395741, 18.7286500286255, 19.7174743954199, 20.4528553779707,
20.953553149678, 21.0415866875269, 21.2924876748544, 22.2378099442685,
23.0076932295775, 23.0941582708386, 23.4748092012559, 23.5515124737304,
24.3463511761819, 24.6901201768951, 25.2541406207396, 26.0256212044938,
26.843159468986, 26.9579528629072, 27.01297383827, 27.7250436963749,
27.9017308958297, 28.1100643283236, 28.4480522935525, 28.6197629801695,
29.3756706791326, 29.9897540116082, 30.0057580759777, 30.7085039121469,
30.7510332074171, 30.9096299847723, 30.9314941316772, 31.3964098461468,
31.7312966347497, 32.2153802510432, 32.619498970957, 32.9731525439908,
33.3765950052407, 34.3397676884718, 35.1641816525327, 35.4891756054474,
36.2408178073905, 36.8733362068042, 37.6251560883057, 37.8047618263845,
37.8828752584342, 38.2001976403303, 38.9255502197319, 39.8027872575378,
40.0416710479264, 40.9954826039753, 41.4534375661544]


> ruby 1.8.2 (2004-12-25) [i386-mswin32]
> ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n",
> "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "aa"]
>
>    Here we have a string range that has 27 "members".  Now:

not quite - we have a string range that __produces__ 27 elements.  it does
not 'have' or 'contain' them.  it merely suggests this set as its current
thought on what that set might be.  this set definition may change - unlike
the endpoints of the range - and it is therefore not a property of the
range.

>    Can this really be called correct behavior of the member?() method?  I
>    can't see any tenable argument to say that it is.

the definition of membership may rely on endpoints only.  that explains it
perfectly.

   harp:~ > irb
   irb(main):001:0>  'z' < 'aa'
   => false

ergo - not in the set.  the confusion here is caused by exactly the reasons
i'm explaining - String#succ has been defined not to create a monotonically
increasing (<=>) sequence - but to produce the "next" string in an english
sense.  this is very useful for auto-generating names

   irb(main):004:0> "z99".succ
   => "aa00"

if this were a monotonically increasing set the output would be

   => "z9:"

but that sure isn't that useful - unless you want to try to use ranges as
sets.

the secret here is to simply re-define String#succ - not Range#member?.  if
String#succ did simple addition using base 256 arithmetic you'd be set.
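
a rough sketch of what such a byte-wise successor could look like,
written as a standalone helper rather than by patching String.  the name
byte_succ is my own invention, not a ruby built-in, and the sketch
assumes ruby 1.9 or later (for String#bytes and force_encoding):

```ruby
# byte_succ is a hypothetical helper, not part of Ruby: it treats the
# string as a number whose digits are its bytes and adds one to it.
def byte_succ(str)
  bytes = str.bytes.to_a
  i = bytes.length - 1
  # walk left past every byte that would overflow, resetting it to zero
  while i >= 0 && bytes[i] == 255
    bytes[i] = 0
    i -= 1
  end
  if i >= 0
    bytes[i] += 1        # ordinary case: bump the first byte that has room
  else
    bytes.unshift(1)     # every byte overflowed: grow by one leading byte
  end
  bytes.pack('C*').force_encoding(str.encoding)
end

builtin  = "z99".succ
bytewise = byte_succ("z99")

p builtin               # "aa00"
p "z99" <=> builtin     # 1, the built-in successor sorts *before* "z99"
p bytewise              # "z9:"
p "z99" <=> bytewise    # -1, the byte-wise successor sorts after it
```

note that the growth case (all bytes at 255) still breaks the ordering,
since the longer result starts with a small byte; the sketch only shows
the ordinary carry-free step discussed above.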

kind regards.

-a
discordantus (Guest)
on 2005-11-23 20:28
(Received via mailing list)
On 11/23/05, Ara.T.Howard <removed_email_address@domain.invalid> wrote:
> [...]
>
>    irb(main):004:0> "z99".succ
>    => "aa00"
>
> if this were a monotonically increasing set the output would be
>
>    => "z9:"
>
> but that sure isn't that useful - unless you want to try to use ranges as
> sets.
>
> the secret here is to simply re-define String#succ - not Range#member?.  if
> String#succ did simple addition using base 256 arithmetic you'd be set.

Or, perhaps, re-define String#<=>. Who says Strings have to be
compared as if they were an array of bytes? It might be more in line
with people's expectations if strings compared this way:

class String
  def <=>(other)
    # one regexp per chunk type: digits, uppercase run, lowercase run
    dig, up, low = *%w/ \d+ [[:upper:]]+ [[:lower:]]+ /.map{|r|/#{r}/}
    re = /#{up}|#{low}|#{dig}/

    me, you = scan(re), other.scan(re)
    # uncomparable unless same format
    return nil unless me.size == you.size
    # every chunk after the first must have the same width
    # (|| [] guards the empty-string case, where me[1..-1] is nil)
    return nil unless (me[1..-1] || []).zip(you[1..-1] || []).all?{|a,b|a.size == b.size}

    # test starting with most significant chunks
    me.zip(you) do |us, them|
      res = if us =~ dig and them =~ dig
        us.to_i <=> them.to_i
      elsif (us =~ up and them =~ up) or (us =~ low and them =~ low )
        us.to_i(36) <=> them.to_i(36)
      else
        # uncomparable
        nil
      end
      # if res.nil?, this chunk was uncomparable.
      # if res.zero?, these chunks were equal.
      return res if res.nil? or not res.zero?
    end
    return 0
  end
end
    ==>nil
('0a'..'10z').member? '5b2'
    ==>false
('0a0'..'10z9').member? '5b2'
    ==>true
('0a0'..'10z9').member? '5aa1'
    ==>false

I'm not saying this is the way it *should* be, just proposing another
possibility, for the sake of argument. Since Strings are text, some
people might expect them to be compared as text, instead of as a
series of byte values.
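
For anyone who wants to play with the idea without redefining
String#<=> for the whole program, here is a rough sketch of the same
chunk-wise comparison as standalone functions. The names chunk_compare
and chunk_between? are made up for this example, and it assumes Ruby
1.9 or later and ASCII letters in the base-36 step.

```ruby
CHUNK  = /\d+|[[:upper:]]+|[[:lower:]]+/
DIGITS = /\A\d+\z/
UPPER  = /\A[[:upper:]]+\z/
LOWER  = /\A[[:lower:]]+\z/

# Hypothetical helper: returns -1, 0 or 1, or nil when the two strings do
# not share the same chunk layout and so cannot be compared.
def chunk_compare(a, b)
  mine, theirs = a.scan(CHUNK), b.scan(CHUNK)
  return nil unless mine.size == theirs.size
  # every chunk after the first must have the same width
  return nil unless mine.drop(1).zip(theirs.drop(1)).all? { |x, y| x.size == y.size }

  # compare starting with the most significant chunk
  mine.zip(theirs).each do |x, y|
    res =
      if x =~ DIGITS && y =~ DIGITS
        x.to_i <=> y.to_i
      elsif (x =~ UPPER && y =~ UPPER) || (x =~ LOWER && y =~ LOWER)
        x.to_i(36) <=> y.to_i(36)
      end   # mixed chunk types leave res as nil (uncomparable)
    return res unless res == 0
  end
  0
end

# Hypothetical helper: endpoint-only membership using chunk_compare.
def chunk_between?(low, value, high)
  lower = chunk_compare(low, value)
  upper = chunk_compare(value, high)
  !lower.nil? && !upper.nil? && lower <= 0 && upper <= 0
end

p chunk_between?('0a', '5b2', '10z')     # false
p chunk_between?('0a0', '5b2', '10z9')   # true
p chunk_between?('0a0', '5aa1', '10z9')  # false
```

The three results match the member? calls above, with the rest of the
program still using the ordinary byte-wise String#<=>.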

That was almost as fun as a RubyQuiz! :)

cheers,
Mark