Semantics of 1.9.2 Set

luislavena · March 1, 2011, 5:46pm

I’m looking at the doc for the 1.9.2 Set class
(http://www.ruby-doc.org/stdlib/libdoc/set/rdoc/classes/Set.html),
which says:

Set implements a collection of unordered values with no duplicates.
This is a hybrid of Array’s intuitive inter-operation facilities and
Hash’s fast lookup.

The equality of each couple of elements is determined according to
Object#eql? and Object#hash, since Set uses Hash as storage.
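To make the quoted rule concrete, here is a minimal sketch of how eql? and hash decide what counts as a duplicate. The Point class is hypothetical and only there for illustration; it does not come from the Set docs.

```ruby
require 'set'

# 1 == 1.0 is true, but 1.eql?(1.0) is false, so Set keeps both values.
numbers = Set.new([1, 1.0, 1])
puts numbers.size

# Hypothetical value class: two points are duplicates when x and y match.
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  # Set asks eql? (together with hash) whether two elements are the same.
  def eql?(other)
    other.is_a?(Point) && x == other.x && y == other.y
  end
  alias_method :==, :eql?

  # Objects that are eql? must return the same hash value.
  def hash
    [x, y].hash
  end
end

points = Set.new([Point.new(1, 2), Point.new(1, 2)])
puts points.size
```

Without the eql? and hash overrides, the two Point objects would be treated as distinct elements, because the defaults compare object identity.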

Since the doc explicitly defines Hash as the underlying store, and
since Hash in 1.9.2 observes insertion order (RDoc Documentation
classes/Hash.html) during enumeration, I wonder if the doc for Set
should be changed?
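The behaviour in question can be seen with a short sketch, assuming MRI 1.9.2 or later. It only shows what the current implementation happens to do, not anything the Set docs promise.

```ruby
require 'set'

# Hash in 1.9 enumerates keys in the order they were inserted.
h = {}
h[:c] = 1
h[:a] = 2
h[:b] = 3
p h.keys

# Set, being backed by a Hash, currently enumerates the same way.
s = Set.new
s << :c << :a << :b
p s.to_a
```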

john_old · March 4, 2011, 7:23pm

On Tue, Mar 1, 2011 at 4:45 PM, John [email protected] wrote:

Since the doc explicitly defines Hash as the underlying store, and
since Hash in 1.9.2 observes insertion order (RDoc Documentation
classes/Hash.html) during enumeration, I wonder if the doc for Set
should be changed?

I was hoping someone who had a detailed knowledge of Set and Hash
would give you an answer. Since I don’t qualify on the detailed
knowledge grounds, I’ll try my knowledge of logic, and hope that fits.

Set implements a collection of unordered values with no duplicates,
which means that you shouldn’t assume the elements will be in a
particular order (alphabetical, time of insertion into the set, etc.) if
you iterate over the set using each: that’s important.

But if the current implementation happens to use another class (Hash)
which does have an order (time of insertion), then that’s an
implementation detail, not something Set promises. Code might use the
fact that using Set with its current use of Hash means that the
iteration order is time of first insertion into the set (ignoring
complications of inserting, then deleting, then reinserting an element).
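The complication mentioned in passing looks like this in practice. This is a sketch against the current Hash-backed Set (MRI 1.9.2 or later), so treat the orders shown by the code as observed behaviour, not a guarantee.

```ruby
require 'set'

s = Set[:a, :b, :c]

# Adding an element that is already present does not move it.
s << :a
after_readd = s.to_a

# Deleting and then reinserting does: the element goes to the end.
s.delete(:a)
s << :a
after_reinsert = s.to_a

p after_readd
p after_reinsert
```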

But given the actual documentation of Set, it would be unwise to do
this: a later implementation of Set might use something other than an
insertion order preserving Hash.
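One way to stay on the safe side is to write code that never looks at the iteration order at all. A minimal sketch, with made-up variable names:

```ruby
require 'set'

seen_first  = Set[3, 1, 2]
seen_second = Set[2, 3, 1]

# Set equality is about membership, so insertion order does not matter.
same_members = (seen_first == seen_second)

# When a definite order is needed, ask for it explicitly instead of
# relying on whatever order each happens to yield.
ordered = seen_first.sort

p same_members
p ordered
```

Code written this way keeps working even if Set is later backed by something that does not preserve insertion order.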

In short, I think the documentation for Set should remain as it is. Or
possibly say explicitly that whilst the current implementation of Set
uses Hash for storage, it should not be assumed that this will always
be the case.

john_old · March 4, 2011, 7:43pm

On Fri, Mar 4, 2011 at 6:22 PM, Colin B. [email protected] wrote:

this: a later implementation of Set might use something other than an
insertion order preserving Hash.

In short, I think the documentation for Set should remain as it is. Or
possibly say explicitly that whilst the current implementation of Set
uses Hash for storage, it should not be assumed that this will always
be the case.

I would agree that the documentation should not mention any order, as
that’s not what defines a Set; it’s just a leftover from an
implementation with Hashes. I don’t think the documentation should
stress that the current implementation is a Hash, though. If it’s
included, it should definitely be pointed out that the implementation
with a Hash is not a normative restriction.

I was hoping someone who had a detailed knowledge of Set and Hash
would give you an answer. Since I don’t qualify on the detailed
knowledge grounds, I’ll try my knowledge of logic, and hope that fits.

I think you responded quite well, for what it’s worth.