Comparing hashes based on their keys

Hi list,
let’s say we want to write a method that compares two hashes for
equality based on their keys.

Each key is a symbol, and the corrisponding value can either be a string
or another hash with this same properties.

Two hashes, let’s call them h1 and h2, are to be considered equal if:

  • they have the same keys

  • h1[:x] and h2[:x] are both hashes, and they are equal according to the
    very same rules you are reading :wink:

  • they are both nil

In short, we only care about the values when they are hashes.
If they are strings, we don’t care whether they are equal or not.

To make an example, when fed these two hashes the method should return
true:

h1 = {
:one => “one”,
:two => “two”,
:three => {
:alpha => “alpha”,
:bravo => “bravo”,
:charlie => “charlie”
}
}

h2 = {
:one => “whatever”,
:two => “hey”,
:three => {
:alpha => “zulu”,
:bravo => “oscar”,
:charlie => “brown”
}
}

When fed these other two arrays, the method should return false:

h3 = h1

h4 = {
:one => “one”,
:two => “two”,
:three => {
:alpha => “alpha”,
:bravo => “bravo”,
}
}

The difference is that Ole :charlie is missing in h2.
The values don’t change to make it clear that we don’t care about them.

I came up with the following implementation that seems to work (at least
according to the specs I wrote), but what I would like to ask is if it
could be any simpler/more idiomatic/more elegant.

Corner cases that one might spot are also good to know about.

def compare(hash1, hash2)
args = [hash1, hash2]

return true if args.all? {|h| h.nil?}
return false if args.one? {|h| h.nil?}

hash1.each_key do |k|
values = [hash1[k], hash2[k]]

 if values.all? {|h| h.is_a?(Hash)}
   return compare(*values)
 else
   return false if values.one? {|value| value.nil? }
 end

end

true
end

Should the code turn out to be unreadable, I posted it here[0] too.

[0] = Comparing hashes based on their keys · GitHub

Thanks in advance.

On Fri, Jul 15, 2011 at 5:38 PM, Stefano M. [email protected]
wrote:

  • h1[:x] and h2[:x] are both hashes, and they are equal according to the
    very same rules you are reading :wink:

  • they are both nil

It seems your spec is not very precise because your bullet list mixes
AND and OR. I assume you meant

Two objects o1 and o2 are considered equal for this relation if and only
if

( o1 is nil AND o2 is nil ) OR
( o1 is a Hash AND o2 is a Hash AND
o1.keys == o2.keys AND
for each key k ( o1[k] is not a Hash OR
o2[k] is not a Hash OR
o1[k] equals o2[k] according to this relation
) )

:bravo => “bravo”,
:charlie => “brown”
:three => {
be any simpler/more idiomatic/more elegant.

Corner cases that one might spot are also good to know about.

def compare(hash1, hash2)

I would rename arguments because these need not be Hashes.

else
return false if values.one? {|value| value.nil? }
end
end

true
end

Hm, I’d probably remove #all?, #any? and the like and code conditions
directly - especially since you are always dealing with two items
because it’s likely faster. Your code creates a lot of temporary
Arrays along the way.

Btw, I believe your implementation misses the case where hash2’s key
set is a superset of hash1’s, i.e. contains additional keys.

Kind regards

robert

On Jul 15, 2011, at 12:04 PM, Robert K. wrote:

It seems your spec is not very precise because your bullet list mixes
o1[k] equals o2[k] according to this
relation ) )

I don’t see why you couldn’t just add your own method to Hash. (Tested
on ruby-1.9.2)

class Hash
def same_key_structure?(other)
return false unless other.is_a?(Hash)
return false unless self.keys == other.keys && self.keys.all?{||
.is_a?(Symbol)}
self.keys.all? {|
|self[
].is_a?(Hash) ?
self[].same_key_structure?(other[]) : !other[_].is_a?(Hash)}
end
end

If you really cared about the nil case, you could do:

class NilClass
def same_key_structure?(other)
other.nil?
end
end

But that’s probably a stretch and only you can decided if it’s right
for the situation.

-Rob

P.S. I’m assuming that you don’t consider supersets of keys to be
“equal” and that you really do care that all the keys are symbols.

:three => {
:alpha => “zulu”,
:one => “one”,

I would rename arguments because these need not be Hashes.

 return compare(*values)

because it’s likely faster. Your code creates a lot of temporary
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Rob B.
[email protected] http://AgileConsultingLLC.com/
[email protected] http://GaslightSoftware.com/

On Fri, Jul 15, 2011 at 10:38 AM, Stefano M.
[email protected]wrote:

h2 = {

args = [hash1, hash2]
return false if values.one? {|value| value.nil? }

Thanks in advance.


Stefano

Looks like you’ve probably got a bug. You explicitly return in your
code,
causing it to stop evaluating after the first hash:

def compare(hash1, hash2)
args = [hash1, hash2]

return true if args.all? {|h| h.nil?}
return false if args.one? {|h| h.nil?}

hash1.each_key do |k|
values = [hash1[k], hash2[k]]

if values.all? {|h| h.is_a?(Hash)}
return compare(*values)
else
return false if values.one? {|value| value.nil? }
end
end

true
end

h1 = {
:one => “one”,
:two => { :alpha => “alpha” },
:three => { :alpha => “alpha” },
}

h2 = h1.merge :three => { :bravo => ‘bravo’ }

h1 # => {:one=>“one”, :two=>{:alpha=>“alpha”},
:three=>{:alpha=>“alpha”}}
h2 # => {:one=>“one”, :two=>{:alpha=>“alpha”},
:three=>{:bravo=>“bravo”}}
compare h1, h2 # => true
RUBY_VERSION # => “1.9.2”

On Fri, Jul 15, 2011 at 10:38 AM, Stefano M.
[email protected]wrote:

Should the code turn out to be unreadable, I posted it here[0] too.

[0] = https://gist.github.com/**1084916 https://gist.github.com/1084916

What would be especially cool is if you could post your specs, then it
would
be a lot easier to try and refactor it.

On 7/15/11 7:21 PM, Josh C. wrote:

What would be especially cool is if you could post your specs, then it would
be a lot easier to try and refactor it.

Here are the specs:

hash comparison spec · GitHub

And here is a slightly modified version of the code that adds the method
to Hash, as suggested by Rob Biederhan.

Comparing hashes based on their keys · GitHub

@Rob Biederhan: I removed the check that ensures that all the keys are
symbols, because I know they will be.

On 7/15/11 7:18 PM, Rob B. wrote:

I don’t see why you couldn’t just add your own method to Hash. (Tested
on ruby-1.9.2)

[snip]

But that’s probably a stretch and only you can decided if it’s right
for the situation.

I posted in the same gist a revised version of your method (see comment
below).
I ended up not monkey-patching NilClass. I will just throw an
ArgumentError if the argument is nil.

P.S. I’m assuming that you don’t consider supersets of keys to be
“equal” and that you really do care that all the keys are symbols.

You’re right about the supersets: if hash x as a key k and hash y misses
that key, the two hashes are different.

About the symbols: I don’t really care, because I know they will be
symbols.

Thank you very much.

On 7/15/11 6:04 PM, Robert K. wrote:

AND and OR. I assume you meant

Two objects o1 and o2 are considered equal for this relation if and only if

( o1 is nil AND o2 is nil ) OR
( o1 is a Hash AND o2 is a Hash AND
o1.keys == o2.keys AND
for each key k ( o1[k] is not a Hash OR
o2[k] is not a Hash OR
o1[k] equals o2[k] according to this relation ) )

Exactly.
Sorry about the confusion.

def compare(hash1, hash2)

I would rename arguments because these need not be Hashes.

Fair enough.

else

Arrays along the way.
True.
On the other hand, the hashes will be fairly small, so performance
wasn’t the main goal.

That, and I the fact that I find:

[x, y].all? {|x| x.is_a?(Hash) }

to be a bit more nice-looking than

if x.is_a?(Hash) && y.is_a?(Hash)

But you make a very good point, thanks.

Btw, I believe your implementation misses the case where hash2’s key
set is a superset of hash1’s, i.e. contains additional keys.

True. I posted again the link to the gist with a revised version
according to Rob’s suggestions.

Thank you very much, Robert.
Ah, there are so many things to learn.

On 7/15/11 7:31 PM, Josh C. wrote:

Looks like you’ve probably got a bug. You explicitly return in your code,
causing it to stop evaluating after the first hash:

[snip]

Very true.
This should work better. It still returns explicitly, but only if
compare() itself returns false.

def compare(hash1, hash2)
args = [hash1, hash2]

return true if args.all? {|h| h.nil?}
return false if args.one? {|h| h.nil?}

hash1.each_key do |k|
values = [hash1[k], hash2[k]]

if values.all? {|h| h.is_a?(Hash)}
  return false unless compare(*values)
else
  return false if values.one? {|value| value.nil? }
end

end

true
end

Thank you so much.

On 7/15/11 7:18 PM, Rob B. wrote:

class Hash
def same_key_structure?(other)
return false unless other.is_a?(Hash)
return false unless self.keys == other.keys &&
self.keys.all?{||.is_a?(Symbol)}
self.keys.all? {||self[].is_a?(Hash) ?
self[].same_key_structure?(other[]) : !other[_].is_a?(Hash)}
end
end

Now that I think about it, that

self.keys == other.keys

might cause problems, because I didn’t specify (sorry, my bad) that I
don’t care about which order the keys appear in.

In other words, I consider these two hashes to be equal:

h1 = {:x => nil, :y => nil}
h2 = {:y => nil, :x => nil}

but h1.keys == h2.keys would return false in this context.

Array could probably benefit from something like RSpec’s ‘=~’ to test
arrays for equality regardless of the order of their elements, wouldn’t
it?
That is, unless I’m missing some existing method that does that.

On 7/15/11 9:47 PM, Rob B. wrote:

On Jul 15, 2011, at 3:01 PM, Stefano M. wrote:

but h1.keys == h2.keys would return false in this context.

then sort them:

return false unless self.keys.sort == other.keys.sort

Since they are Hashes, I’d certainly hope that the order of the keys
didn’t matter. :wink:

Hah! True.
I must be very tired! :wink:

Thanks again.

On Jul 15, 2011, at 3:01 PM, Stefano M. wrote:

end

h1 = {:x => nil, :y => nil}
h2 = {:y => nil, :x => nil}

but h1.keys == h2.keys would return false in this context.

then sort them:

return false unless self.keys.sort == other.keys.sort

Since they are Hashes, I’d certainly hope that the order of the keys
didn’t matter. :wink:

I should have realized that particularly since 1.9.2’s Hashes return
their keys in an order dependent on their insertion to the Hash. (And
I gave a talk to the Cincinnati Ruby Brigade just last month about
differences between 1.8.6 and 1.9.2 that included this fact.)

-Rob

Array could probably benefit from something like RSpec’s ‘=~’ to
test arrays for equality regardless of the order of their elements,
wouldn’t it?
That is, unless I’m missing some existing method that does that.


Stefano M.

Rob B.
[email protected] http://AgileConsultingLLC.com/
[email protected] http://GaslightSoftware.com/