Forum: Ruby newbie q: stripping duplicates

unknown (Guest)
on 2008-10-11 17:35
(Received via mailing list)
This is certainly well known, but not to me.

>> a  = [{"aa"=>"bb"},{"aa"=>"bb"}]
=> [{"aa"=>"bb"}, {"aa"=>"bb"}]

>> a.uniq
=> [{"aa"=>"bb"}, {"aa"=>"bb"}]

Why? and, what should I use instead of .uniq
to remove the duplicate?

Thank you
Piero
Stefano C. (Guest)
on 2008-10-11 17:42
(Received via mailing list)
On Saturday 11 October 2008, removed_email_address@domain.invalid wrote:
> Why? and, what should I use instead of .uniq
> to remove the duplicate?
>
> Thank you
> Piero

It works for me (with ruby-1.8.7-p72):

irb(main):005:0> a = [{"aa" => "bb"}, {"aa" => "bb"}]
=> [{"aa"=>"bb"}, {"aa"=>"bb"}]
irb(main):006:0> a.uniq
=> [{"aa"=>"bb"}]

Which version of ruby are you using?

Stefano
Craig D. (Guest)
on 2008-10-11 17:49
(Received via mailing list)
I get the same results as Piero on 1.8.6 p114, the most recent built-in
version on Mac OS X 10.5.5 (unless an update was included in the recent
security update that I haven't yet applied).

>> a  = [{"aa"=>"bb"},{"aa"=>"bb"}]
=> [{"aa"=>"bb"}, {"aa"=>"bb"}]
>> a.uniq
=> [{"aa"=>"bb"}, {"aa"=>"bb"}]
>> exit
slapshot:~ cdemyanovich$ ruby -v
ruby 1.8.6 (2008-03-03 patchlevel 114) [universal-darwin9.0]

Craig
unknown (Guest)
on 2008-10-11 17:50
(Received via mailing list)
> Stefano
$ ruby --version
ruby 1.8.6 (2008-03-03 patchlevel 114) [universal-darwin9.0]

I guess I am out of luck?
If so, is there any solution for 1.8.6?
Piero
Marcin M. (Guest)
on 2008-10-11 18:00
(Received via mailing list)
removed_email_address@domain.invalid wrote:
>
1.8.7 and 1.9.x use deep hashing for hashes; to achieve that in 1.8.6
you need to monkey patch Hash: http://pastie.org/pastes/272194

lopex
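
For anyone who would rather not patch Hash: the irb sessions in this thread suggest that == already compares hashes by content on 1.8.6, so duplicates can be dropped with a helper that relies only on == (through Array#include?). A minimal sketch, not taken from the pastie; uniq_by_equality is a made-up name, and the approach is quadratic in the array length, so it only makes sense for small arrays:

```ruby
# Hypothetical helper: keeps the first occurrence of each element,
# comparing with == (via Array#include?) instead of hash/eql?.
def uniq_by_equality(items)
  items.inject([]) do |kept, item|
    kept.include?(item) ? kept : kept << item
  end
end

a = [{"aa"=>"bb"}, {"aa"=>"bb"}]
puts uniq_by_equality(a).size  # 1
```

The original array is left untouched; the helper returns a new one.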
unknown (Guest)
on 2008-10-11 18:10
(Received via mailing list)
On Oct 11, 3:59 pm, Marcin Mielżyński <removed_email_address@domain.invalid> wrote:
>
> you need to monkey patch Hash:http://pastie.org/pastes/272194
>

Monkeypatched. What a shame.

Thanks!
Piero
Thomas B. (Guest)
on 2008-10-11 18:34
Marcin Mielżyński wrote:
> 1.8.7 and 1.9.x use deep hashing for hashes, to achieve that in 1.8.6
> you need to monkey patch Hash: http://pastie.org/pastes/272194

So I believe this is sort of a bug in the old version? Because now I can
even get to this absurdity:

irb(main):007:0> z={a[0]=>:x,a[1]=>:y}
=> {{"aa"=>"bb"}=>:x, {"aa"=>"bb"}=>:y} # absurd number 1
irb(main):008:0> z[{"aa"=>"bb"}]
=> nil # absurd number two

with the three {"aa"=>"bb"} objects still reported to be ==.

TPR.
Sebastian H. (Guest)
on 2008-10-11 19:16
(Received via mailing list)
Thomas B. wrote:
> with the three {"aa"=>"bb"} object still reported to be ==.

Yes, but not eql?. In 1.8.6 there were no Hash#hash and Hash#eql? methods,
so it used Object#hash and Object#eql?, which consider two objects equal
only when they're actually the same object.

HTH,
Sebastian
Marcin M. (Guest)
on 2008-10-11 19:21
(Received via mailing list)
Thomas B. pisze:
> => nil # absurd number two
>
> with the three {"aa"=>"bb"} object still reported to be ==.
>

It's not absurd, just a consequence of Hash not having its own hash
method (just the default Object#hash). I agree it's a bit surprising, but
deep hashing is slower by a fair amount.

The comparison function will not even be called here, since the lookup
fails earlier, at the hash bucket stage:

a = {"aa"=>"bb"}
b = {"aa"=>"bb"}
a.hash != b.hash

Btw, Hash doesn't use "==" for key comparison, it uses "eql?".

lopex
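
The same behaviour can be reproduced on any Ruby version with a class that defines == but keeps the default Object#hash and Object#eql?, which is the situation described above for Hash on 1.8.6. A rough sketch; the Pair class is invented purely for illustration:

```ruby
# Hypothetical class: content-based ==, but default hash and eql?.
class Pair
  attr_reader :key, :value

  def initialize(key, value)
    @key, @value = key, value
  end

  def ==(other)
    other.is_a?(Pair) && key == other.key && value == other.value
  end
end

x = Pair.new("aa", "bb")
y = Pair.new("aa", "bb")

puts x == y                   # true
puts x.eql?(y)                # false
puts [x, y].uniq.size         # 2
puts({ x => :x }[y].inspect)  # nil

# Adding eql? and a content-based hash makes uniq and key lookup agree with ==.
class Pair
  alias eql? ==

  def hash
    key.hash ^ value.hash
  end
end

puts [x, y].uniq.size         # 1
```

The XOR in the hash method is just the simplest combination that keeps equal objects on equal hash values; it is not meant as a recommendation for a good hash function.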
Shot (Piotr S.) (Guest)
on 2008-10-20 03:09
(Received via mailing list)
Marcin Mielżyński:

> 1.8.7 and 1.9.x use deep hashing for hashes, to achieve that in
> 1.8.6 you need to monkey patch Hash: http://pastie.org/pastes/272194

Right, and I use ‘alias eql? ==’ and hand-crafted hash methods
in most of my classes that need to be sane Set elements.

I understand the idea behind the hash method (let’s have a quick way
to check whether two objects are different – and look closely with eql?
only if their hashes are the same), but I wonder whether there are any
rules-of-thumb for finding the sweet spot between making it fast and
making it return different results for different objects often.

For example, is 137 in the above pastie snippet a ‘magic number’ that
it’s good to multiply by? I understand how the bitwise XORs make every
key and value impact the hash, but why not also multiply by 137 between
the key+value iterations?

-- Shot
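
On multiplying between the key+value iterations, one constraint is certain: two hashes that are == must produce the same hash value, whatever order their pairs are visited in. The sketch below compares three ways of combining pair hashes. All three helpers are hypothetical and are not the pastie's code; 137 is borrowed from the post above only as an example constant:

```ruby
# XOR only: each pair contributes key.hash ^ value.hash.
# Symmetric in key and value, so {x=>y} and {y=>x} always collide.
def xor_only_hash(h)
  h.inject(0) { |acc, (k, v)| acc ^ k.hash ^ v.hash }
end

# Scale the key's hash inside each pair, then XOR the pairs together.
# Swapping key and value now usually changes the result, while the
# outer XOR keeps the result independent of iteration order.
def scaled_hash(h)
  h.inject(0) { |acc, (k, v)| acc ^ ((k.hash * 137) ^ v.hash) }
end

# Multiply the running value between pairs: the result now depends on
# iteration order, so == hashes built in a different order will
# usually get different values.
def chained_hash(h)
  h.inject(0) { |acc, (k, v)| (acc * 137) ^ k.hash ^ v.hash }
end

puts xor_only_hash({"a"=>"b"}) == xor_only_hash({"b"=>"a"})         # true
puts scaled_hash({1=>2, 3=>4}) == scaled_hash({3=>4, 1=>2})         # true
```

So a multiplier inside each pair costs nothing in correctness, but one applied across pairs would break the rule that equal hashes hash equally unless the pairs were first put into some canonical order.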