Enumerator for Hash

rpheath · May 23, 2007, 5:01am

I have a complex object that I want to hash in different ways, similar
to the way Enumerator allows for iterating using something besides
#each. For one Hash I can use the #hash method for indexing, but for
another hash I want to use some other method. I cannot find an easy
way to do this.

Any suggestions?

rpheath · May 23, 2007, 9:45am

On 23.05.2007 04:59, Ryan H. wrote:

I have a complex object that I want to hash in different ways,

You write “hash” which usually means “calculate hash value for” but the
following text seems to indicate that you want to iterate.

similar
to the way Enumerator allows for iterating using something besides
#each. For one Hash I can use the #hash method for indexing,

Now you mention “indexing” and #hash method in one sentence.

but for
another hash I want to use some other method. I cannot find an easy
way to do this.

Frankly, I am at a loss here. What do you want? (I have some possible
questions in mind but I’d rather hear it from you.)

Any suggestions?

Make clear what you actually want. You could also show some code.

Kind regards

robert

rpheath · May 23, 2007, 6:00pm

On May 23, 3:41 am, Robert K. wrote:

Make clear what you actually want. You could also show some code.

Sorry for the confusion. I was trying to be considerate and brief,
but apparently I was only obscure.

I want a Hash-like object that instead of calling each object’s #hash
method to get its hash value, calls another to-be-specified method. I
probably want to specify an alternative to #eql? as well. Again, this
is parallel to the way the standard library Enumerator class creates
an Enumerable-like object with an alternative #each method.

One use case is uniqueness.

# existing array with a bunch of complex objects
my_array = […]

# find unique objects using normal #hash and #eql? methods;
# yes, I know Array#uniq does this – it’s an example
normal_uniq_hash = Hash.new
my_array.each {|obj| normal_uniq_hash[obj] = obj}
# now normal_uniq_hash can be iterated to get the unique objects from
# my_array

# find “unique” objects using different #hash and #eql? methods
alt_uniq_hash = AltHash.new(:alt_hash, :alt_eql?) # postulating a new AltHash
my_array.each {|obj| alt_uniq_hash[obj] = obj}
# now alt_uniq_hash can be iterated to get the “unique” objects from
# my_array

Is this more clear?

Thanks for your help!

rpheath · May 23, 2007, 6:20pm

Wow! 5 identical postings in just two minutes. You must have been
hitting “send” really hard…

On 23.05.2007 18:00, Ryan H. wrote:

is parallel to the way the standard library Enumerator class creates
an Enumerable-like object with an alternative #each method.

You could wrap your keys in a custom class. For added convenience you
could also wrap Hash with another class which wraps keys on insertion
and unwraps on reading.
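A minimal sketch of that wrapping idea, assuming the alternative methods are passed in as symbols. AltKey, AltHash and Item are hypothetical names used only for illustration (AltHash borrows the name postulated earlier in the thread); nothing here is an existing library class.

```ruby
# Hypothetical key wrapper: forwards #hash and #eql? to caller-chosen methods
# on the wrapped object, so Hash lookups use those instead of the defaults.
class AltKey
  attr_reader :obj

  def initialize(obj, hash_meth, eql_meth)
    @obj = obj
    @hash_meth = hash_meth
    @eql_meth = eql_meth
  end

  def hash
    @obj.send(@hash_meth)
  end

  def eql?(other)
    other.is_a?(AltKey) && @obj.send(@eql_meth, other.obj)
  end
end

# Hypothetical Hash wrapper: wraps keys on insertion and unwraps on reading.
class AltHash
  def initialize(hash_meth, eql_meth)
    @hash_meth = hash_meth
    @eql_meth = eql_meth
    @store = {}
  end

  def []=(obj, value)
    @store[wrap(obj)] = value
  end

  def [](obj)
    @store[wrap(obj)]
  end

  def keys
    @store.keys.map { |key| key.obj }
  end

  def values
    @store.values
  end

  private

  def wrap(obj)
    AltKey.new(obj, @hash_meth, @eql_meth)
  end
end

# Illustrative object whose alternative methods look only at one member.
Item = Struct.new(:name, :tags) do
  def alt_hash
    tags.hash
  end

  def alt_eql?(other)
    tags.eql?(other.tags)
  end
end

items = [Item.new("a", [1, 2]), Item.new("b", [1, 2]), Item.new("c", [3])]
alt_uniq_hash = AltHash.new(:alt_hash, :alt_eql?)
items.each { |obj| alt_uniq_hash[obj] ||= obj }  # ||= keeps the first object per key
unique_names = alt_uniq_hash.values.map { |item| item.name }
```

Here "a" and "b" share the same tags, so only the first of them is kept. The cost is one small wrapper object per insertion or lookup, which may or may not matter for a given workload.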

# my_array

# find “unique” objects using different #hash and #eql? methods
alt_uniq_hash = AltHash.new(:alt_hash, :alt_eql?) # postulating a new AltHash
my_array.each {|obj| alt_uniq_hash[obj] = obj}
# now alt_uniq_hash can be iterated to get the “unique” objects from
# my_array

In that case I’d probably rather do this:

my_array.inject({}) {|hs, obj| hs[obj.key] ||= obj; hs }.values

For example - using a string’s length as uniqueness criterion:

irb(main):001:0> my_array = %w{foo bar dodo dida longword}
=> ["foo", "bar", "dodo", "dida", "longword"]
irb(main):002:0> my_array.inject({}) {|hs, obj| hs[obj.size] ||= obj; hs }.values
=> ["longword", "foo", "dodo"]
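The same idiom can be packaged as a small helper that takes the key computation as a block. This is only a sketch; uniq_by_key is a hypothetical name, not a core method.

```ruby
# Hypothetical helper: keeps the first object seen for each computed key.
# The block plays the role of obj.key in the one-liner above.
def uniq_by_key(objects)
  objects.each_with_object({}) { |obj, seen| seen[yield(obj)] ||= obj }.values
end

words = %w[foo bar dodo dida longword]
by_length = uniq_by_key(words) { |word| word.size }
```

On Ruby 1.9 and later, where Hash preserves insertion order, by_length comes back in first-seen order. Newer Ruby versions also accept a block on Array#uniq, so words.uniq { |word| word.size } should give the same result without a helper.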

Is this more clear?

Yes.

Thanks for your help!

You’re welcome. And please be easy on the trigger.

Kind regards

robert

rpheath · May 23, 2007, 8:15pm

On May 23, 12:18 pm, Robert K. wrote:

Wow! 5 identical postings in just two minutes. You must have been
hitting “send” really hard…

Sorry. I’m using Google groups and it keeps telling me “error, unable
to send your post.” I should have checked back. You may be getting
email to the same effect, but it told me that operation had an error
as well. I am very sorry for the spew.

rpheath · May 23, 2007, 8:20pm

On May 23, 12:18 pm, Robert K. wrote:

…

You could wrap your keys in a custom class. For added convenience you
could also wrap Hash with another class which wraps keys on insertion
and unwraps on reading.
…

In that case I’d probably rather do this:
…

And thank you for the excellent ideas. Manually generating the key
seems rather obvious now that you mention it. I had wondered if there
was something more useful I could do with the values, but I had not
considered manipulating the keys. Thanks again!

Ryan

rpheath · May 24, 2007, 5:55pm

On May 23, 12:18 pm, Robert K. wrote:

You could wrap your keys in a custom class. For added convenience you
could also wrap Hash with another class which wraps keys on insertion
and unwraps on reading.
…

In that case I’d probably rather do this:

my_array.inject({}) {|hs, obj| hs[obj.key] ||= obj; hs }.values

Now I remember why I wanted my earlier approach. My object has
several Array member variables. By default, my #hash and #eql?
methods use all of them. In one case I need to ignore 2 of the 4
arrays, e.g. for a uniqueness hash. I can use your second method
thus.

my_array.inject({}) {|hs, obj| hs[obj.arr1 + obj.arr2] ||= obj; hs }.values

But won’t this create a bevy of new Array objects with the “+” method
(two for each item in the original array)? I could instead pre-hash
the values something like the following.

my_array.inject({}) {|hs, obj| hs[obj.arr1.hash ^ obj.arr2.hash] ||= obj; hs }.values

This avoids the new Array spew, but I will lose unique objects if my
fabricated hash keys clash (i.e. same hash key for unique arrays). If
I can specify different methods to use in place of #hash and #eql?, I
can get the same behavior without the performance penalty. (My code
is already slow.)

I think I’ll use your wrapping Hash idea. Thanks!

rpheath · May 24, 2007, 11:27pm

2007/5/24, Ryan H.:

Now I remember why I wanted my earlier approach. My object has
several Array member variables. By default, my #hash and #eql?
methods use all of them. In one case I need to ignore 2 of the 4
arrays, e.g. for a uniqueness hash. I can use your second method
thus.

my_array.inject({}) {|hs, obj| hs[obj.arr1 + obj.arr2] ||= obj; hs }.values

Appending is not safe, because different pairs of arrays can concatenate to the same key:

[1,2] + [3,4,5] → [1,2,3,4,5]
[1,2,3] + [4,5] → [1,2,3,4,5]

You want to be creating new arrays with two elements:

my_array.inject({}) {|hs, obj| hs[[obj.arr1, obj.arr2]] ||= obj; hs }.values
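To make the difference concrete, here is a small sketch comparing the two kinds of key. Pair is a hypothetical stand-in for the real object, with only the two relevant arrays.

```ruby
# Hypothetical stand-in for the complex object; only arr1 and arr2 matter here.
Pair = Struct.new(:arr1, :arr2)

a = Pair.new([1, 2], [3, 4, 5])
b = Pair.new([1, 2, 3], [4, 5])

# Concatenation flattens the boundary between the two arrays away.
concat_keys = [a, b].map { |obj| obj.arr1 + obj.arr2 }

# A two-element array of arrays keeps the boundary, so the keys stay distinct.
nested_keys = [a, b].map { |obj| [obj.arr1, obj.arr2] }

concat_distinct = concat_keys.uniq.size
nested_distinct = nested_keys.uniq.size
```

With concatenated keys the two objects collapse into one entry; with nested keys both survive. The nested form allocates only one small outer array per object and does not copy the elements.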

But won’t this create a bevy of new Array objects with the “+” method
(two for each item in the original array)?

Yes, but I’d rather first try the suggestion above out before starting
to think about optimizations.

I could instead pre-hash
the values something like the following.

my_array.inject({}) {|hs, obj| hs[obj.arr1.hash ^ obj.arr2.hash] ||= obj; hs }.values

That’s a bad idea because of the nature of hash values, as you
correctly identified:

This avoids the new Array spew, but I will lose unique objects if my
fabricated hash keys clash (i.e. same hash key for unique arrays). If
I can specify different methods to use in place of #hash and #eql?, I
can get the same behavior without the performance penalty. (My code
is already slow.)
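One easy way to see such a clash, as a sketch: whenever both arrays of an object are equal, the XOR of their hash values is 0, whatever the arrays contain. Rec is a hypothetical stand-in for the real object.

```ruby
# Hypothetical stand-in for the complex object.
Rec = Struct.new(:arr1, :arr2)

x = Rec.new([1], [1])
y = Rec.new([2], [2])

# Equal arrays have equal hash values, and n ^ n is always 0.
x_key = x.arr1.hash ^ x.arr2.hash
y_key = y.arr1.hash ^ y.arr2.hash

# XOR is also commutative, so swapping arr1 and arr2 gives the same key.
p = Rec.new([1], [2])
q = Rec.new([2], [1])
swapped_clash = (p.arr1.hash ^ p.arr2.hash) == (q.arr1.hash ^ q.arr2.hash)

# With integer keys there is no eql? check on the arrays to tell x and y apart.
survivors = [x, y].inject({}) { |hs, obj| hs[obj.arr1.hash ^ obj.arr2.hash] ||= obj; hs }.values
```

Here y is silently dropped even though it differs from x. A real Hash copes with colliding hash values by falling back to #eql? on the keys, which is exactly the step that using the raw integer as the key throws away.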

I think I’ll use your wrapping Hash idea. Thanks!

I’d start with the other approach which is considerably easier.

robert