Re: string range membership


#1

Matz,

#include? was used for range checks; #member? was for set
membership. But since they have the same functionality
in Enumerable, some claimed that having different
behaviors in Range was confusing. I agreed.

All we need is to make up good names for each
functionality.

OK, I think I see why they were changed to be the same, but I really
don’t understand the choice of functionality that was kept. For
everything except Ranges, #include? and #member? check for set
membership. In Ranges, #include? and #member? don’t check for set
membership, they check for interval coverage instead. This seems worse
than the original situation where at least #member? meant the same thing
everywhere.
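
To make the two meanings concrete, here is a minimal Ruby sketch. The helper names set_member? and interval_cover? are made up for this illustration and are not part of Range; the sketch also assumes the end points can be compared with <=> without getting nil back.

```ruby
# Hypothetical helpers, named only for this illustration.

# Set membership: walk the sequence the range generates and look for the value.
def set_member?(range, value)
  range.to_a.include?(value)
end

# Interval coverage: compare the value against the end points with <=>.
def interval_cover?(range, value)
  lower_ok  = (range.begin <=> value) <= 0
  upper_cmp = value <=> range.end
  upper_ok  = range.exclude_end? ? upper_cmp < 0 : upper_cmp <= 0
  lower_ok && upper_ok
end

r = ('1'..'10')
p set_member?(r, '2')      # => true  ('2' is generated on the way from '1' to '10')
p interval_cover?(r, '2')  # => false ('2' sorts after '10' as a string)
```

The first check costs a walk over the whole sequence; the second needs only two comparisons, which is the trade-off the rest of the thread keeps coming back to.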

One other side note on the current names: "include" and "member" are
really opposite ideas. A range includes a value, but a value is a
member of a range. Having them mean exactly the same thing might also
be confusing.

Anyway, could Range#include? and Range#member? be changed back to a
membership check and a new method be added to Range for interval
coverage, or would that break too much backwards compatibility? Several
names come to mind for the new method: #between? (my personal favorite),
#betwixt? (kind of silly, but could be fun), #cover?, #surround?,
#bound?, #inside?, #within?, #in_range?, #in_interval?, #in?

If the current behavior of the Range methods can't be changed, names
for membership checks (not including #member? - yuck!) could be:
#among?, #amid?, #amidst?, #component?, #constituent?, #part?, #has?,
#in?

What do you think?

- Warren B.

#2

Quoting Yukihiro M. removed_email_address@domain.invalid:

In message “Re: [BUG] string range membership”
on Tue, 29 Nov 2005 00:16:49 +0900, “Warren B.”
removed_email_address@domain.invalid writes:

|Several names come to mind for the new method: #between?
|(my personal favorite), #betwixt? (kind of silly, but could
|be fun), #cover?, #surround?, #bound?, #inside?, #within?,
|#in_range?, #in_interval?, #in?

Thank you for the candidates. I’d like to hear opinions from
others (especially from English speakers).

#within? seems best to me.

-mental


#3

Hi,

In message “Re: [BUG] string range membership”
on Tue, 29 Nov 2005 00:16:49 +0900, “Warren B.”
removed_email_address@domain.invalid writes:

| OK, I think I see why they were changed to be the same, but I really
|don’t understand the choice of functionality that was kept. For
|everything except Ranges, #include? and #member? check for set
|membership. In Ranges, #include? and #member? don’t check for set
|membership, they check for interval coverage instead. This seems worse
|than the original situation where at least #member? meant the same thing
|everywhere.

I don’t remember exactly but it’s for the sake of performance. I’ve
been thinking about this issue for the last few days, and it could be
made better by treating numbers specially, just like we did for min and
max in Range.

| Anyway, could Range#include? and Range#member? be changed back to a
|membership check and a new method be added to Range for interval
|coverage, or would that break too much backwards compatibility? Several
|names come to mind for the new method: #between? (my personal favorite),
|#betwixt? (kind of silly, but could be fun), #cover?, #surround?,
|#bound?, #inside?, #within?, #in_range?, #in_interval?, #in?
|
| If the current behavior of the Range methods can’t be changed, names
|for membership checks (not including #member? - yuck!) could be:
|#among?, #amid?, #amidst?, #component?, #constituent?, #part?, #has?,
|#in?
|
| What do you think?

Thank you for the candidates. I’d like to hear opinions from others
(especially from English speakers).

						matz.

#4

removed_email_address@domain.invalid wrote:

|be fun), #cover?, #surround?, #bound?, #inside?, #within?,
|#in_range?, #in_interval?, #in?

Thank you for the candidates. I’d like to hear opinions from
others (especially from English speakers).

#within? seems best to me.

-mental

I like #bound?, as in

(lower..upper).bound? x


#5

Hi –

On Tue, 29 Nov 2005, Yukihiro M. wrote:

|than the original situation where at least #member? meant the same thing
|everywhere.

I don’t remember exactly but it’s for the sake of performance. I’ve
been thinking about this issue for the last few days, and it could be
made better by treating numbers specially, just like we did for min and
max in Range.

I think that as long as ranges have all of this array/set behavior –
as long as range and range.to_a share so much functionality – ranges
will always feel like two different objects. The whole idea of
“membership” in a range has always seemed a little strange to me. I
guess I think of ranges as very different from arrays and sets.

|#in?
|
| What do you think?

Thank you for the candidates. I’d like to hear opinions from others
(especially from English speakers).

(0..5).to_a.include?(n) :-)

But seriously… If it’s a method of Range, then it has to be from
the range perspective, not the perspective of the argument.
#encompass? comes to mind. There was an interesting discussion on IRC
about how to check for complete inclusion of one range in another.
#encompass? could, ummm, encompass that:

(0..5).encompass?(4) # true
(0..5).encompass?(5.1) # false
(0..5).encompass?(1..2) # true
(1..2).encompass?(0..5) # false

etc.
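
The #encompass? idea above could be sketched roughly as follows. This is only an illustration: encompass? is written as a free-standing method rather than added to Range, no such method exists in core Ruby, and the sketch assumes both ranges include their end points.

```ruby
# Hypothetical free-standing version of the proposed #encompass?.
# Assumes inclusive ranges whose end points are comparable with <=>.
def encompass?(range, other)
  if other.is_a?(Range)
    # A range is encompassed when both of its end points lie inside.
    (range.begin <=> other.begin) <= 0 && (other.end <=> range.end) <= 0
  else
    # A single value is encompassed when it lies between the end points.
    (range.begin <=> other) <= 0 && (other <=> range.end) <= 0
  end
end

p encompass?(0..5, 4)      # => true
p encompass?(0..5, 5.1)    # => false
p encompass?(0..5, 1..2)   # => true
p encompass?(1..2, 0..5)   # => false
```

Nothing here iterates, so the check stays cheap no matter how wide the range is.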

David


#6

removed_email_address@domain.invalid wrote:

-mental

The “bound” form of the verb is more consistent with other predicates in
ruby: “include?” vs. “includes?”, “exist?” vs. “exists?”. (There are a
few ri hits for that last one, but they are marked as obsolete.)


#7

removed_email_address@domain.invalid wrote:

Thank you for the candidates. I’d like to hear opinions from
others (especially from English speakers).

#within? seems best to me.

That reads better as obj.within?(range) than as range.within?(obj). I
like #contain? personally, though in programming terms a “container” is
more a set than a range. Actually, I’m all for #include? to mean
bounding inclusion, and something a lot more expensive-sounding than
member? for #to_a set inclusion. Something like
"aaa".."zzz".generates?("bbb") would at least indicate that it was
doing an O(n) stepthrough of the range.

martin


#8

Quoting Joel VanderWerf removed_email_address@domain.invalid:

I like #bound?, as in

(lower..upper).bound? x

Hmm, I don’t know. That seems like it would suggest the existence
of a Range#bind … #bounds? possibly?

-mental


#9

English speaker.

Relative new Ruby user.

My thoughts: forget it! Stop! Ahhhh! Kitchen sink!

Let’s see if I’ve got this straight. Somebody complained because

('1'..'10').member?('2')
=> false

Good! The fact that Ruby will get incredibly clever with strings and
fabricate arbitrary sequences with them is a charming trick, but they
are arbitrary, and it is a trick.

The fact that ‘1’, ‘2’, … ‘9’,‘10’ is obvious doesn’t make it any
less arbitrary.

'1'..'100'

Is that supposed to be 1, 2, 3, … 99, 100 or 1, 10, 11, 100? Ruby
arbitrarily decided to interpret those strings as base 10 integers.

'a.1'..'c.3'

Quite honestly, I have absolutely no idea how Ruby would count that.
Will I get ‘a.1’, ‘a.2’, ‘a.3’, ‘b.1’ … or is it going to go all the
way to ‘a.9’ and then start over with ‘b.1’?
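
One way to settle questions like these without guessing is to ask Ruby for the first few elements of the range. The helper name preview below is made up for this sketch. No output is claimed for the second call, since what it prints depends on how String#succ carries past 'a.9'.

```ruby
# Hypothetical helper: return the first n elements a range would generate.
def preview(range, n)
  range.first(n)
end

p preview('1'..'100', 3)      # => ["1", "2", "3"]
p preview('a.1'..'c.3', 12)   # inspect the result to see how the carry behaves
```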

Ruby does NOT need more almost-but-not-quite-the-same methods! People
sophisticated enough to require access to the subtle differences are
sophisticated enough to fix the problem themselves, by modifying the
necessary code, or by finding somebody else’s recommended modification
and using that.

Please. The tremendous ease with which Ruby can be extended is all the
more reason to keep the core set tight, small, clean, and thus more
comprehensible and more accessible to beginners. I can’t even count how
many different ways there are to open a file! I’d so much rather have
one way, with one thorough explanation, and notes on its shortcomings,
than seven, or whatever, each with just a sketchy description.

In closing:
('1'..'10').to_a.member?('2')
=> true

Is that really such a big deal?


#10

Dave H. wrote:

Good! The fact that Ruby will get incredibly clever with strings and
fabricate arbitrary sequences with them is a charming trick, but they
are arbitrary, and it is a trick.

That sums it up well. I couldn’t have said it better. Well, maybe if I
really tried…

Hal


#11

In closing:
('1'..'10').to_a.member?('2')
=> true

Is that really such a big deal?

This is but one common instance. The issue extends beyond this. Range
is constrained by exceptional behaviors such that it can’t be used more
creatively in an assured functioning manner. For instance, you’d expect
#member? to do what the documentation says it does. If it is not going to
do that, the documentation ought to be changed. Why hasn’t it? Because
Range has a great deal of exceptional behavior for the sake of
efficiency. Documenting the actual behavior would be too complicated.

Given the current implementation of Range and how it is used (or more
precisely, how it can’t be used), and since there is clearly no intent
to do otherwise, it just doesn’t make sense to push it as some sort of
generalized component. Might as well give Range all the knowledge it
needs to do its basic job and forget this “mixin nature” altogether
(i.e. #succ and #<=> of the sentinels). Finish coding Range to know
what a numeric sequence is and what a string sequence is and forget
about it (if you’ve ever looked at the code you know it’s halfway
there anyway). At least it would be even more efficient then.

T.


#12

On Thu, 1 Dec 2005, Hal F. wrote:

Dave H. wrote:

Good! The fact that Ruby will get incredibly clever with strings and
fabricate arbitrary sequences with them is a charming trick, but they are
arbitrary, and it is a trick.

That sums it up well. I couldn’t have said it better. Well, maybe if I
really tried…

otoh - everything computers do can be summarized as extremely clever
tricks with strings. maybe better said as infinitely long tapes, but
strings nonetheless.

regards.

-a


#13

Dave H. wrote:

Good!

No, not good.

Range mixes Enumerable, but Range#member? does not behave like
Enumerable#member?, hence the confusion.

The fact that Ruby will get incredibly clever with strings and
fabricate arbitrary sequences with them is a charming trick, but they
are arbitrary, and it is a trick.

The fact that ‘1’, ‘2’, … ‘9’,‘10’ is obvious doesn’t make it any
less arbitrary.

Yes, it is arbitrary, but nevertheless, the range '1'..'10' will produce
the value '2' (unless String#succ has been overridden), so '2' is, by any
ordinary definition of the word member, a member of this range.

It sounds more like your beef is with String#succ

I’d so much rather have
one way, with one thorough explanation, and notes on its shortcomings,
than seven, or whatever, each with just a sketchy description.

I would too, namely make Enumerable#member? work the same way for Ranges
that it does for any other Enumerable. (That seems to be where matz is
leaning).


In closing:
('1'..'10').to_a.member?('2')
=> true

Is that really such a big deal?

No, that’s fine. More efficient would be

!('1'..'10').find({|x| x == '2'}).nil?


#14

Hi –

On Thu, 1 Dec 2005, Bob S. wrote:

=> false

fabricate arbitrary sequences with them is a charming trick, but they are
arbitrary, and it is a trick.

The fact that ‘1’, ‘2’, … ‘9’,‘10’ is obvious doesn’t make it any less
arbitrary.

Yes, it is arbitrary, but nevertheless, the range '1'..'10' will produce the
value '2' (unless String#succ has been overridden), so '2' is, by any
ordinary definition of the word member, a member of this range.

I think it’s more a question of the definition of the word “range”.
I’ve come to believe that ranges should be strictly interval-like in
their behavior. Basically, a range is a kind of filter: (0..10) is
not ten numbers, but rather an expression of “the fact of
0-through-10-ness”, or something like that. At least, that’s how I’d
like to see ranges work. I think they’re trying to be too many
things at once.

This is also why I think the idea of a mutable range is a
contradiction in terms. You can’t change what the fact of starting at
0 and ending at 10 means.

David

('1'..'10').to_a.member?('2')
=> true

Is that really such a big deal?

No, that’s fine. More efficient would be

!('1'..'10').find({|x| x == '2'}).nil?

I don’t think find takes a hash argument :-) Also, why not lose the !
and the .nil? ?
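
For reference, a sketch of the block form being hinted at, with find given a block instead of a parenthesized argument. The variable names r and found are only for this illustration.

```ruby
r = ('1'..'10')

# find takes a block and returns the first matching element, or nil.
found = r.find { |x| x == '2' }

p found         # => "2"
p !found.nil?   # => true
```

Since find hands back the element itself, the result is already truthy on a match. The `!` and `.nil?` only matter when a strict true/false is wanted, or when the element being searched for could itself be nil or false.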

David


#15

Hi –

On Thu, 1 Dec 2005, Bob S. wrote:

b) To get a true/false. But not needed, of course.
You can do:

enum.any? {|e| … }

to get a true/false result.

David


#16

David A. Black wrote:

You can do:

enum.any? {|e| … }

to get a true/false result.

Excellent! I assume that would stop iterating as soon as a true was
found?


#17

Hi –

On Fri, 2 Dec 2005, Bob S. wrote:

David A. Black wrote:

You can do:

enum.any? {|e| … }

to get a true/false result.

Excellent! I assume that would stop iterating as soon as a true was found?

Yes:

irb(main):004:0> [1,2,3,4].any? {|e| puts e; e > 1 }
1
2
=> true

David


#18

David A. Black wrote:

No, that’s fine. More efficient would be

!('1'..'10').find({|x| x == '2'}).nil?

I don’t think find takes a hash argument :-) Also, why not lose the !
and the .nil? ?

a) The fingers get ahead of the brain :-)
b) To get a true/false. But not needed, of course.


#19

On 12/1/05, Trans removed_email_address@domain.invalid wrote:

stew around in their own preconceptions?

I just saw this quote in pickaxe, which helped clarify my thinking:

"Ranges can be constructed using objects of any type, as long as the
objects can be compared using their <=> operator and they support the
succ method to return the next object in sequence. "

But (a <=> a.succ) == -1 does not hold for all a, especially if a is a String.
This causes r.find{|a| !r.member?(a)} to return non-nil for some
ranges, which is unexpected, or possibly just plain wrong.
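
A small sketch of that mismatch. The helper covered_by_endpoints? is a made-up stand-in for the <=>-based Range#member? behaviour discussed in this thread; it assumes an inclusive range.

```ruby
# A successor that compares as smaller than its predecessor:
p 'z'.succ           # => "aa"
p 'z' <=> 'z'.succ   # => 1

# Hypothetical stand-in for an end-point check done with <=>.
def covered_by_endpoints?(range, v)
  (range.begin <=> v) <= 0 && (v <=> range.end) <= 0
end

r = ('a'..'bb')

# The range generates 'a', 'b', 'c', ... but 'c' already sorts after 'bb'.
stray = r.find { |a| !covered_by_endpoints?(r, a) }
p stray              # => "c"
```

So the range yields an element that the end-point check then rejects, which is the surprise described above.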

So I think I like your suggestion, which I would boil down to:
Change the requirement as follows: “Ranges can be constructed from
objects of any class which supports #succ and #cmp, where
a.cmp(a.succ)==-1 for all a.”

For numeric classes #cmp is an alias for #<=>. For strings #cmp is a
custom function, matching the string succession generator.
And for your own classes you can write your own #cmp, which does not
have to match #<=>. For instance:
President.new("Kennedy").cmp President.new("Nixon")  #=> -1 (Nixon came later)
President.new("Kennedy") <=> President.new("Nixon")  #=> 1 (but Kennedy is greater)

The issue I see is that I don’t know if it is possible to write a
valid #cmp for all cases.
What is the result of 'a'.cmp('0')? You can’t ever get a '0' with
'a'.succ. So is '0' before or after 'a'?

I suppose that you could require that a.cmp b returns nil when
a=a.succ will never produce b. Then Range#member? becomes a test for
set membership, just like enum#member?:
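class Range
  def member?(v)
    # #cmp is the proposed succession-order comparison; nil means "unreachable".
    # The assignments are parenthesized so && does not swallow them.
    (f = first.cmp(v)) && f <= 0 && (l = last.cmp(v)) && l >= 0
  end
end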

So yes, I like this suggestion.
And I just realized it may not be orthogonal to the other:
The suggested change leaves no way to test for Interval inclusion in
those cases where the interval and the sequence are different (like
'a'..'bb'). So perhaps there still should be a method of testing
Range inclusion using <=>. (my suggestion is #spans?)
('a'..'bb').member? 'z' #=> true
('a'..'bb').spans? 'z' #=> false
('a'..'bb').member? 'aardvark' #=> false
('a'..'bb').spans? 'aardvark' #=> true
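
The four results above can be reproduced with a rough sketch. Everything here is an assumption made for illustration: succ_cmp stands in for the proposed #cmp and only handles lowercase ASCII strings (shorter strings first, then <=>), and sequence_member? and spans? are free-standing methods rather than additions to Range. Inclusive ranges are assumed.

```ruby
# Hypothetical succession-order comparison for lowercase ASCII strings only.
# Returns nil when either string could never appear in such a succession.
def succ_cmp(a, b)
  return nil unless a =~ /\A[a-z]+\z/ && b =~ /\A[a-z]+\z/
  # Shorter strings come first; equal lengths fall back to <=>.
  (a.length <=> b.length).nonzero? || (a <=> b)
end

# Membership in the generated sequence, decided with succ_cmp.
def sequence_member?(range, v)
  lo = succ_cmp(range.begin, v)
  hi = succ_cmp(v, range.end)
  !lo.nil? && !hi.nil? && lo <= 0 && hi <= 0
end

# Interval inclusion, decided with the ordinary <=>.
def spans?(range, v)
  (range.begin <=> v) <= 0 && (v <=> range.end) <= 0
end

r = ('a'..'bb')
p sequence_member?(r, 'z')         # => true
p spans?(r, 'z')                   # => false
p sequence_member?(r, 'aardvark')  # => false
p spans?(r, 'aardvark')            # => true
p succ_cmp('a', '0')               # => nil ('0' is never reached from 'a')
```

Returning nil for unreachable values is what lets the membership test say "no" without walking the sequence.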

-Adam


#20

Adam S. wrote:

So I think I like your suggestion, which I would boil down to:
Change the requirement as follows: “Ranges can be constructed from
objects of any class which supports #succ and #cmp, where
a.cmp(a.succ)==-1 for all a.”

Very nicely put. I wish I were as gifted at explaining things. Thanks
Adam.

For numeric classes #cmp is an alias for #<=>. For strings #cmp is a
custom function, matching the string succession generator.
And for your own classes you can write your own #cmp, which does not
have to match #<=>. For instance:
President.new("Kennedy").cmp President.new("Nixon")  #=> -1 (Nixon came later)
President.new("Kennedy") <=> President.new("Nixon")  #=> 1 (but Kennedy is greater)

:-)

    (f = first.cmp(v)) && f <= 0 && (l = last.cmp(v)) && l >= 0
  end
end

Yes that’s exactly it. The nice thing about having #cmp separate from
#<=> is you can have optimizations to determine this, and it can be
used by the #member? method.

So yes, I like this suggestion.
And I just realized it may not be orthogonal to the other:
The suggested change leaves no way to test for Interval inclusion in
those cases where the interval and the sequence are different (like
'a'..'bb'). So perhaps there still should be a method of testing
Range inclusion using <=>. (my suggestion is #spans?)
('a'..'bb').member? 'z' #=> true
('a'..'bb').spans? 'z' #=> false
('a'..'bb').member? 'aardvark' #=> false
('a'..'bb').spans? 'aardvark' #=> true

Good point. Boundary checks with #<=>, regardless of membership, would
still be useful.

Thanks Adam.

T.