Forum: Ruby Re: string range membership

warrenbrown (Guest)
on 2005-11-28 16:17
(Received via mailing list)
Matz,

> #include? used for range check, #member? was for set
> membership.  But since they have same functionality
> in Enumerable, some claimed having different
> behaviors in Range was confusing.  I agreed.
>
> All we need is making up good names for each
> functionality.

    OK, I think I see why they were changed to be the same, but I really
don't understand the choice of functionality that was kept.  For
everything except Ranges, #include? and #member? check for set
membership.  In Ranges, #include? and #member? don't check for set
membership, they check for interval coverage instead.  This seems worse
than the original situation where at least #member? meant the same thing
everywhere.
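
To make the distinction concrete, here is a minimal Ruby sketch of the two checks being contrasted. The helper names covers? and enumerates? are made up for this illustration and are not core methods; the sketch also assumes an inclusive range with two real endpoints. The first helper only compares against the endpoints, the second walks the sequence that #succ produces.

```ruby
# Hypothetical helpers, named only for this illustration.

# Interval coverage: compare the value against the two endpoints.
def covers?(range, value)
  range.begin <= value && value <= range.end
end

# Set membership: enumerate the range via #succ and look for the value.
def enumerates?(range, value)
  range.to_a.include?(value)
end

digits = '1'..'10'
puts covers?(digits, '2')      # false: '2' sorts after '10' as a string
puts enumerates?(digits, '2')  # true: '1'.succ is '2'

letters = 'a'..'bb'
puts covers?(letters, 'aardvark')      # true: 'a' <= 'aardvark' <= 'bb'
puts enumerates?(letters, 'aardvark')  # false: succ never produces it
```

The two answers disagree in both directions, which is why the thread is looking for two separate names.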

    One other side note on the current names: "include" and "member" are
really opposite ideas.  A range includes a value, but a value is a
member of a range.  Having them mean exactly the same thing might also
be confusing.

    Anyway, could Range#include? and Range#member? be changed back to a
membership check and a new method be added to Range for interval
coverage, or would that break too much backwards compatibility?  Several
names come to mind for the new method: #between? (my personal favorite),
#betwixt? (kind of silly, but could be fun), #cover?, #surround?,
#bound?, #inside?, #within?, #in_range?, #in_interval?, #in?

    If the current behavior of the Range methods can't be changed, names
for membership checks (not including #member? - yuck!) could be:
#among?, #amid?, #amidst?, #component?, #constituent?, #part?, #has?,
#in?

    What do you think?

    - Warren Brown
matz (Guest)
on 2005-11-28 16:42
(Received via mailing list)
Hi,

In message "Re: [BUG] string range membership"
    on Tue, 29 Nov 2005 00:16:49 +0900, "Warren Brown"
<warrenbrown@aquire.com> writes:

|    OK, I think I see why they were changed to be the same, but I really
|don't understand the choice of functionality that was kept.  For
|everything except Ranges, #include? and #member? check for set
|membership.  In Ranges, #include? and #member? don't check for set
|membership, they check for interval coverage instead.  This seems worse
|than the original situation where at least #member? meant the same thing
|everywhere.

I don't remember exactly, but it's for the sake of performance.  I've
been thinking about this issue for the last few days, and it could be made
better by treating numbers specially, just like we did for min and max
in Range.

|    Anyway, could Range#include? and Range#member? be changed back to a
|membership check and a new method be added to Range for interval
|coverage, or would that break too much backwards compatibility?  Several
|names come to mind for the new method: #between? (my personal favorite),
|#betwixt? (kind of silly, but could be fun), #cover?, #surround?,
|#bound?, #inside?, #within?, #in_range?, #in_interval?, #in?
|
|    If the current behavior of the Range methods can't be changed, names
|for membership checks (not including #member? - yuck!) could be:
|#among?, #amid?, #amidst?, #component?, #constituent?, #part?, #has?,
|#in?
|
|    What do you think?

Thank you for the candidates.  I'd like to hear opinions from others
(especially from English speakers).

							matz.
mental (Guest)
on 2005-11-28 16:54
(Received via mailing list)
Quoting Yukihiro Matsumoto <matz@ruby-lang.org>:

> In message "Re: [BUG] string range membership"
>     on Tue, 29 Nov 2005 00:16:49 +0900, "Warren Brown"
<warrenbrown@aquire.com> writes:
>
> |Several names come to mind for the new method: #between?
> |(my personal favorite), #betwixt? (kind of silly, but could
> |be fun), #cover?, #surround?, #bound?, #inside?, #within?,
> |#in_range?, #in_interval?, #in?
>
> Thank you for the candidates.  I'd like to hear opinions from
> others (especially from English speakers).

#within? seems best to me.

-mental
vjoel (Guest)
on 2005-11-28 17:06
(Received via mailing list)
mental@rydia.net wrote:
>>|be fun), #cover?, #surround?, #bound?, #inside?, #within?,
>>|#in_range?, #in_interval?, #in?
>>
>>Thank you for the candidates.  I'd like to hear opinions from
>>others (especially from English speakers).
>
>
> #within? seems best to me.
>
> -mental

I like #bound?, as in

(lower..upper).bound? x
dblack (Guest)
on 2005-11-28 17:10
(Received via mailing list)
Hi --

On Tue, 29 Nov 2005, Yukihiro Matsumoto wrote:

> |than the original situation where at least #member? meant the same thing
> |everywhere.
>
> I don't remember exactly, but it's for the sake of performance.  I've
> been thinking about this issue for the last few days, and it could be made
> better by treating numbers specially, just like we did for min and max
> in Range.

I think that as long as ranges have all of this array/set behavior --
as long as range and range.to_a share so much functionality -- ranges
will always feel like two different objects.  The whole idea of
"membership" in a range has always seemed a little strange to me.  I
guess I think of ranges as very different from arrays and sets.

> |#in?
> |
> |    What do you think?
>
> Thank you for the candidates.  I'd like to hear opinions from others
> (especially from English speakers).

(0..5).to_a.include?(n) :-)

But seriously....  If it's a method of Range, then it has to be from
the range perspective, not the perspective of the argument.
#encompass? comes to mind.  There was an interesting discussion on IRC
about how to check for complete inclusion of one range in another.
#encompass?  could, ummm, encompass that:

   (0..5).encompass?(4)    # true
   (0..5).encompass?(5.1)  # false
   (0..5).encompass?(1..2) # true
   (1..2).encompass?(0..5) # false

etc.
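
A rough sketch of how such a method might look, purely as an illustration: #encompass? is the hypothetical name from this post, not a core method, and the code assumes both ranges have real, mutually comparable endpoints.

```ruby
# Sketch of the proposed Range#encompass?; hypothetical, not part of core Ruby.
# Assumes every endpoint involved is non-nil and comparable with <= and <.
class Range
  def encompass?(other)
    if other.is_a?(Range)
      # The other range must start at or after our start ...
      return false unless self.begin <= other.begin
      # ... and end at or before our end. If we exclude our end but the
      # other range includes its end, the comparison has to be strict.
      if exclude_end? && !other.exclude_end?
        other.end < self.end
      else
        other.end <= self.end
      end
    else
      # A single value: plain endpoint comparison, honoring exclude_end?.
      self.begin <= other && (exclude_end? ? other < self.end : other <= self.end)
    end
  end
end

puts((0..5).encompass?(4))     # true
puts((0..5).encompass?(5.1))   # false
puts((0..5).encompass?(1..2))  # true
puts((1..2).encompass?(0..5))  # false
```

Everything here is decided from the endpoints alone, so no iteration over the range is needed.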


David
mental (Guest)
on 2005-11-28 17:26
(Received via mailing list)
Quoting Joel VanderWerf <vjoel@path.berkeley.edu>:

> I like #bound?, as in
>
> (lower..upper).bound? x

Hmm, I don't know.  That seems like it would suggest the existence
of a Range#bind ... #bounds? possibly?

-mental
vjoel (Guest)
on 2005-11-28 17:38
(Received via mailing list)
mental@rydia.net wrote:
>
> -mental

The "bound" form of the verb is more consistent with other predicates in
ruby: "include?" vs. "includes?", "exist?" vs. "exists?". (There are a
few ri hits for that last one, but they are marked as obsolete.)
martindemello (Guest)
on 2005-11-28 18:19
(Received via mailing list)
mental@rydia.net wrote:
> >
> > Thank you for the candidates.  I'd like to hear opinions from
> > others (especially from English speakers).
>
> #within? seems best to me.

That reads better as obj.within?(range) than as range.within?(obj). I
like #contain? personally, though in programming terms a "container" is
more a set than a range. Actually, I'm all for #include? to mean
bounding inclusion, and something a lot more expensive-sounding than
member? for #to_a set inclusion. Something like
("aaa"..."zzz").generates?("bbb") would at least indicate that it was
doing an O(n) stepthrough of the range.

martin
groups (Guest)
on 2005-12-01 01:05
(Received via mailing list)
English speaker.

Relatively new Ruby user.

My thoughts: forget it! Stop! Ahhhh! Kitchen sink!

Let's see if I've got this straight. Somebody complained because

	('1'..'10').member?('2')
	=> false


Good! The fact that Ruby will get incredibly clever with strings and
fabricate arbitrary sequences with them is a charming trick, but they
are arbitrary, and it is a trick.

The fact that '1', '2', ... '9','10'  is obvious doesn't make it any
less arbitrary.

	'1'..'100'

Is that supposed to be 1, 2, 3, ... 99, 100 or 1, 10, 11, 100? Ruby
arbitrarily decided to interpret those strings as base 10 integers.

	'a.1'..'c.3'

Quite honestly, I have absolutely no idea how Ruby would count that.
Will I get 'a.1', 'a.2', 'a.3', 'b.1' ... or is it going to go all the
way to 'a.9' and then start over with 'b.1'?

Ruby does NOT need more almost-but-not-quite-the-same methods! People
sophisticated enough to require access to the subtle differences are
sophisticated enough to fix the problem themselves, by modifying the
necessary code, or by finding somebody else's recommended modification
and using that.

Please. The tremendous ease with which Ruby can be extended is all the
more reason to keep the core set tight, small, clean, and thus more
comprehensible and more accessible to beginners. I can't even count how
many different ways there are to open a file! I'd so much rather have
one way, with one thorough explanation, and notes on its shortcomings,
than seven, or whatever, each with just a sketchy description.

In closing:
	('1'..'10').to_a.member?('2')
	=> true

Is that really such a big deal?
hal9000 (Guest)
on 2005-12-01 02:29
(Received via mailing list)
Dave Howell wrote:
>
> Good! The fact that Ruby will get incredibly clever with strings and
> fabricate arbitrary sequences with them is a charming trick, but they
> are arbitrary, and it is a trick.
>

That sums it up well. I couldn't have said it better. Well, maybe if I
really tried...


Hal
ara.t.howard (Guest)
on 2005-12-01 04:02
(Received via mailing list)
On Thu, 1 Dec 2005, Hal Fulton wrote:

> Dave Howell wrote:
>>
>> Good! The fact that Ruby will get incredibly clever with strings and
>> fabricate arbitrary sequences with them is a charming trick, but they are
>> arbitrary, and it is a trick.
>>
>
> That sums it up well. I couldn't have said it better. Well, maybe if I
> really tried...

otoh - __everything__ computers do can be summarized as extremely clever
tricks with strings.  maybe better said as infinitely long tapes, but
strings nonetheless.

regards.

-a
transfire (Guest)
on 2005-12-01 05:31
(Received via mailing list)
> In closing:
>        ('1'..'10').to_a.member?('2')
>        => true
>
> Is that really such a big deal?

This is but one common instance. The issue extends beyond this. Range
is constrained by exceptional behaviors such that it can't be used more
creatively in an assured functioning manner. For instance, you'd expect
#member? to do what the documentation says it does. If it is not going to
do that, the documentation ought to be changed. Why hasn't it been? Because
Range has a great deal of exceptional behavior for the sake of
efficiency. Documenting the actual behavior would be too complicated.

Given the current implementation of Range and how it is used (or more
precisely, how it can't be used), and since there is clearly no intent
to do otherwise, it just doesn't make sense to push it as some sort of
generalized component. Might as well give Range all the knowledge it
needs to do its basic job and forget this "mixin nature" altogether
(i.e. #succ and #<=> of the sentinels). Finish coding Range to know
what a numeric sequence is and what a string sequence is and forget
about it. (If you've ever looked at the code, you know it's halfway
there anyway.) At least it would be even more efficient then.

T.
bob_showalter (Guest)
on 2005-12-01 14:35
(Received via mailing list)
Dave Howell wrote:
>
>
> Good!

No, not good.

Range mixes Enumerable, but Range#member? does not behave like
Enumerable#member?, hence the confusion.

 >The fact that Ruby will get incredibly clever with strings and
> fabricate arbitrary sequences with them is a charming trick, but they
> are arbitrary, and it is a trick.
>
> The fact that '1', '2', ... '9','10'  is obvious doesn't make it any
> less arbitrary.

Yes, it is arbitrary, but nevertheless, the range '1'..'10' will produce
the value '2' (unless String#succ has been overridden), so '2' is by any
ordinary definition of the word "member", a member of this range.

It sounds more like your beef is with String#succ

 > I'd so much rather have
> one way, with one thorough explanation, and notes on its shortcomings,
> than seven, or whatever, each with just a sketchy description.

I would too, namely make Enumerable#member? work the same way for Ranges
that it does for any other Enumerable. (That seems to be where matz is
leaning).

> ...
> In closing:
>     ('1'..'10').to_a.member?('2')
>     => true
>
> Is that really such a big deal?
>

No, that's fine. More efficient would be

   !('1'..'10').find({|x| x == '2'}).nil?
dblack (Guest)
on 2005-12-01 14:43
(Received via mailing list)
Hi --

On Thu, 1 Dec 2005, Bob Showalter wrote:

>>     => false
>> fabricate arbitrary sequences with them is a charming trick, but they are
>> arbitrary, and it is a trick.
>>
>> The fact that '1', '2', ... '9','10'  is obvious doesn't make it any less
>> arbitrary.
>
> Yes, it is arbitrary, but nevertheless, the range '1'..'10' will produce the
> value '2' (unless String#succ has been overridden), so '2' is by any
> ordinary definition of the word "member", a member of this range.

I think it's more a question of the definition of the word "range".
I've come to believe that ranges should be strictly interval-like in
their behavior.  Basically, a range is a kind of filter: (0...10) is
not ten numbers, but rather an expression of "the fact of
0-through-10-ness", or something like that.  At least, that's how I'd
like to see ranges work.  I think they're trying to be too many
things at once.

This is also why I think the idea of a mutable range is a
contradiction in terms.  You can't change what the fact of starting at
0 and ending at 10 means.


David

>>     ('1'..'10').to_a.member?('2')
>>     => true
>>
>> Is that really such a big deal?
>>
>
> No, that's fine. More efficient would be
>
>  !('1'..'10').find({|x| x == '2'}).nil?

I don't think find takes a hash argument :-)  Also, why not lose the !
and the .nil? ?


David
bob_showalter (Guest)
on 2005-12-01 14:59
(Received via mailing list)
David A. Black wrote:
>> No, that's fine. More efficient would be
>>
>>  !('1'..'10').find({|x| x == '2'}).nil?
>
>
> I don't think find takes a hash argument :-)  Also, why not lose the !
> and the .nil? ?

a) The fingers get ahead of the brain :)
b) To get a true/false. But not needed, of course.
dblack (Guest)
on 2005-12-01 15:24
(Received via mailing list)
Hi --

On Thu, 1 Dec 2005, Bob Showalter wrote:

> b) To get a true/false. But not needed, of course.
You can do:

   enum.any? {|e| ... }

to get a true/false result.


David
bob_showalter (Guest)
on 2005-12-01 16:49
(Received via mailing list)
David A. Black wrote:
> You can do:
>
>   enum.any? {|e| ... }
>
> to get a true/false result.

Excellent! I assume that would stop iterating as soon as a true was
found?
dblack (Guest)
on 2005-12-01 16:54
(Received via mailing list)
Hi --

On Fri, 2 Dec 2005, Bob Showalter wrote:

> David A. Black wrote:
>> You can do:
>>
>>   enum.any? {|e| ... }
>>
>> to get a true/false result.
>
> Excellent! I assume that would stop iterating as soon as a true was found?

Yes:

irb(main):004:0> [1,2,3,4].any? {|e| puts e; e > 1 }
1
2
=> true
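
The same short-circuiting can be seen on the string range from earlier in the thread. This is just a small check that records which elements the block actually visits; the variable names are arbitrary.

```ruby
# Record every element the block sees to show where iteration stops.
visited = []
found = ('1'..'10').any? { |s| visited << s; s == '2' }

puts found            # true
puts visited.inspect  # ["1", "2"]

# find stops at the same point but returns the element itself (or nil).
hit = ('1'..'10').find { |s| s == '2' }
puts hit              # 2
```

Unlike the to_a version, neither call builds the whole array before searching.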


David
adam.shelly (Guest)
on 2005-12-02 07:58
(Received via mailing list)
On 12/1/05, Trans <transfire@gmail.com> wrote:
> stew around in their own preconceptions?
>
I just saw this quote in pickaxe, which helped clarify my thinking:

"Ranges can be constructed using objects of any type, as long as the
objects can be compared using their <=> operator and they support the
succ method to return the next object in sequence. "

But (a <=> a.succ) == -1 does not hold for every a, especially if a is a String.
This causes r.find{|a| !r.member?(a)} to return non-nil for some
ranges, which is unexpected, or possibly just plain wrong.
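
A small sketch of the mismatch, written with explicit endpoint comparisons so that it does not depend on how any particular Ruby version implements Range#member? for strings:

```ruby
# 'z'.succ is 'aa', which sorts *before* 'z' under String#<=>.
puts('a' <=> 'a'.succ)   # -1
puts('z' <=> 'z'.succ)   # 1

# Elements the range generates that fall outside the interval
# bounded by its own endpoints.
r = 'a'..'bb'
outside = r.to_a.reject { |s| r.begin <= s && s <= r.end }

puts outside.first    # c
puts outside.last     # z
puts outside.length   # 24
```

So for 'a'..'bb', every single letter from 'c' to 'z' is generated by the range yet fails an interval test against its endpoints.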

So I think I like your suggestion, which I would boil down to:
Change the requirement as follows: "Ranges can be constructed from
objects of any class which supports #succ and #cmp, where
a.cmp(a.succ)==-1 for all a."

For numeric classes #cmp is an alias for #<=>.  For strings #cmp is a
custom function, matching the string succession generator.
And for your own classes you can write your own #cmp, which does not
have to match #<=>.  For instance:
President.new("Kennedy").cmp President.new("Nixon") #=> -1  (Nixon came later)
President.new("Kennedy") <=> President.new("Nixon") #=> 1  (but Kennedy is greater)


The issue I see is that I don't know if it is possible to write a
valid #cmp for all cases.
What is the result of 'a'.cmp('0') ?    You can't ever get a '0' with
'a'.succ.  So is '0' before or after 'a'?

I suppose that you could require that a.cmp b returns nil when
a=a.succ will never produce b.  Then Range#member? becomes a test for
set membership, just like Enumerable#member?:
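class Range
  # Sketch only: relies on the proposed #cmp, which core Ruby does not define.
  def member?(v)
    # Parenthesize the assignments so f and l hold the comparison results;
    # a nil from #cmp (v is unreachable) makes the whole test false.
    !!((f = first.cmp(v)) && f <= 0 && (l = last.cmp(v)) && l >= 0)
  end
end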

So yes, I like this suggestion.
And I just realized it may not be orthogonal to the other:
 The suggested change leaves no way to test for Interval inclusion in
those cases where the interval and the sequence are different (like
'a'..'bb').   So perhaps there still should be a method of testing
Range inclusion using <=>. (my suggestion is #spans?)
('a'..'bb').member? 'z' #=> true
('a'..'bb').spans? 'z' #=> false
('a'..'bb').member? 'aardvark' #=> false
('a'..'bb').spans? 'aardvark' #=> true
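
As a rough, runnable approximation of the idea, the sketch below implements a succession-order comparison for one narrow case only: non-empty lowercase ASCII strings, where String#succ happens to order values as "shorter first, then alphabetical". The module SuccOrder and all of its method names are hypothetical, and the proposal above would need a far more general #cmp than this.

```ruby
# Hypothetical sketch; none of these names exist in core Ruby.
module SuccOrder
  LOWER = /\A[a-z]+\z/

  # Returns -1, 0 or 1 in succession order, or nil when either string
  # is not purely lowercase letters (a case this sketch does not model).
  def self.cmp(a, b)
    return nil unless LOWER.match?(a) && LOWER.match?(b)
    [a.length, a] <=> [b.length, b]
  end

  # Set membership: v lies between the endpoints in succession order.
  def self.member?(range, v)
    f = cmp(range.begin, v)
    l = cmp(range.end, v)
    !f.nil? && !l.nil? && f <= 0 && l >= 0
  end

  # Interval coverage using the ordinary String#<=> ordering.
  def self.spans?(range, v)
    range.begin <= v && v <= range.end
  end
end

r = 'a'..'bb'
puts SuccOrder.member?(r, 'z')         # true
puts SuccOrder.spans?(r, 'z')          # false
puts SuccOrder.member?(r, 'aardvark')  # false
puts SuccOrder.spans?(r, 'aardvark')   # true
puts SuccOrder.member?(r, '0')         # false: cmp returns nil
```

Within that narrow case the results line up with the four examples above, and neither check has to enumerate the range.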

-Adam
transfire (Guest)
on 2005-12-03 14:06
(Received via mailing list)
Adam Shelly wrote:
> So I think I like your suggestion, which I would boil down to:
> Change the requirement as follows: "Ranges can be constructed from
> objects of any class which supports #succ and #cmp, where
> a.cmp(a.succ)==-1 for all a."

Very nicely put. I wish I were as gifted at explaining things. Thanks
Adam.

> For numeric classes #cmp is an alias for #<=>.  For strings #cmp is a
> custom function, matching the string succession generator.
> And for your own classes you can write your own #cmp, which does not
> have to match #<=>.  For instance:
> President.new("Kennedy").cmp President.new("Nixon") #=> -1  (Nixon came later)
> President.new("Kennedy") <=> President.new("Nixon") #=> 1  (but Kennedy is greater)

:-)

>     (f = first.cmp(v)) && f <= 0 && (l = last.cmp(v)) && l >= 0
>   end
> end

Yes, that's exactly it. The nice thing about having #cmp separate from
#<=> is that you can have optimizations for determining this, and it can be
used by the #member? method.

> So yes, I like this suggestion.
> And I just realized it may not be orthogonal to the other:
>  The suggested change leaves no way to test for Interval inclusion in
> those cases where the interval and the sequence are different (like
> 'a'..'bb').   So perhaps there still should be a method of testing
> Range inclusion using <=>. (my suggestion is #spans?)
> ('a'..'bb').member? 'z' #=> true
> ('a'..'bb').spans? 'z' #=> false
> ('a'..'bb').member? 'aardvark' #=> false
> ('a'..'bb').spans? 'aardvark' #=> true

Good point. Boundary checks with #<=>, regardless of membership, would
still be useful.

Thanks Adam.

T.