Forum: Ruby count : array to array of hashes

Jean-sébastien Jney (jney)
on 2007-08-01 18:05
(Received via mailing list)
Hi,
  I'm trying to count the distinct ids in an array (array_of_ids =
[1,2,3,1,2]) and want the result to be an array of hashes like this:
result = [{:id=>1,:count=>2},{:id=>2,:count=>2},{:id=>3,:count=>1}]

I tried this, but it always raises an error:
result = array_of_ids.inject(Array.new){ |a,x| elt = a.find{|h| h[:id] == x} || {:id => x, :count => 0}; elt[:count] += 1; elt}

I took my information from this post:
http://groups.google.com/group/comp.lang.ruby/brow...

If someone could help.

regards
Robert Dober (Guest)
on 2007-08-01 18:34
(Received via mailing list)
On 8/1/07, Jean-Sébastien <jeansebastien.ney@gmail.com> wrote:
> http://groups.google.com/group/comp.lang.ruby/brow...
>
> If someone could help.
>
> regards
>
>
>
irb(main):005:0> a=[1,2,3,42,2,2,3]
=> [1, 2, 3, 42, 2, 2, 3]
irb(main):006:0> a.inject([]){|r,ele| r[ele][:count]+=1 rescue r[ele]={:id => ele, :count=>1}; r}.compact
=> [{:count=>1, :id=>1}, {:count=>3, :id=>2}, {:count=>2, :id=>3}, {:count=>1, :id=>42}]
irb(main):007:0>

HTH
Robert
Stefano Crocco (crocco)
on 2007-08-01 18:56
(Received via mailing list)
On Wednesday 1 August 2007, Jean-Sébastien wrote:
> http://groups.google.com/group/comp.lang.ruby/brow...
>6db8d7f5/04a7b10da8195cec?lnk=gst&q=array++count&rnum=8#04a7b10da8195cec
>
> If someone could help.
>
> regards

There are two problems in your code:
1- In the inject block you return elt, which is (or would be, if the
code worked) the hash for the id being processed. You should return a,
the array that contains the hashes. Fixing only this gives code that
runs without errors but returns an empty array.
2- You never insert the hashes you create inside the inject block into
the array a: you only store them in the local variable elt, which is
discarded after each iteration. The inject block should be:

  result = array_of_ids.inject(Array.new){ |a,x|
    elt = ( a.find{|h| h[:id] == x} || (a[a.size] = {:id => x, :count => 0}) )
    elt[:count] += 1
    a
  }

As you can see, after the || operator a new hash is created and inserted
at the end of a (at index a.size). Since an assignment always returns
the value being assigned (this is why I didn't use <<, which returns the
array rather than the inserted element), elt is then set to the new
hash. Of course, all this happens only if find returns nil.
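
The difference between index assignment and << described above can be
checked with a small sketch. The names counts, entry and returned are
only illustrative and do not come from the original code:

```ruby
# Illustrative names only; none of these variables appear in the post.
counts = []

# Index assignment is an expression whose value is the assigned object,
# so entry refers to the very hash that was stored in the array.
entry = (counts[counts.size] = { :id => 7, :count => 0 })
entry[:count] += 1

# Array#<< returns the array itself, not the element that was appended.
returned = (counts << { :id => 8, :count => 0 })

puts counts[0].equal?(entry)   # true: same object, so the update is visible
puts returned.equal?(counts)   # true: << handed back the array
```

Because entry and counts[0] are the same object, incrementing
entry[:count] updates the hash stored in the array, which is what the
corrected inject block relies on.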

If you can rely on the ids being positive integers, and don't need the
hashes in the resulting array to be in the same order as the ids appear
in the input array, here's another approach you can consider:

  result = array_of_ids.inject([]) do |res, i|
    res[i-1] ||= {:id => i, :count => 0}
    res[i-1][:count] += 1
    res
  end

This code stores the data for id i at position i-1 in the array (the -1
avoids a nil element at the beginning). This should make it faster,
since you don't need to iterate over the whole array to check whether
the data for an id is already there: either it is at position id - 1 or
it isn't there. A quick benchmark, using an array of 100_000 ids with
values randomly chosen between 1 and 11, gives:

                              user     system      total        real
original approach         5.730000   1.490000   7.220000 (  7.310224)
alternative approach      1.250000   0.150000   1.400000 (  1.416871)

Changing the range of the ids from 11 to 101 gives:

                              user     system      total        real
alternative approach      1.270000   0.190000   1.460000 (  1.472353)
original approach        37.730000  11.360000  49.090000 ( 51.056527)

Increasing it to 1001 gives:

                              user     system      total        real
alternative approach      1.500000   0.220000   1.720000 (  1.733568)

The original approach takes much longer (I didn't have the patience to
wait for it to complete).
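
The benchmark script itself was not posted, so the following is only a
rough sketch of how such a comparison might be set up with Ruby's
standard Benchmark module. The method names count_by_scan,
count_by_index and random_ids are made up for illustration, and the way
the random ids are drawn (uniformly, with both ends of the range
included) is an assumption:

```ruby
require 'benchmark'

# Linear-scan version: find the hash for x, or append a new one.
def count_by_scan(ids)
  ids.inject([]) do |acc, x|
    elt = acc.find { |h| h[:id] == x } || (acc[acc.size] = { :id => x, :count => 0 })
    elt[:count] += 1
    acc
  end
end

# Indexed version: the hash for id i lives at index i - 1.
def count_by_index(ids)
  ids.inject([]) do |res, i|
    res[i - 1] ||= { :id => i, :count => 0 }
    res[i - 1][:count] += 1
    res
  end
end

# Assumed setup: n ids drawn uniformly from 1..max_id (inclusive).
def random_ids(n, max_id)
  Array.new(n) { rand(max_id) + 1 }
end

ids = random_ids(100_000, 11)
Benchmark.bm(22) do |bm|
  bm.report('original approach')    { count_by_scan(ids) }
  bm.report('alternative approach') { count_by_index(ids) }
end
```

Timings will differ from the figures above depending on the machine and
Ruby version. Note also that the indexed version leaves nil entries for
any id that never occurs, so calling compact on its result may be needed
when the ids are not contiguous.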

I hope this helps

Stefano