Counting how many times the same element occurs in an array?

There’s probably a fairly simple way to do this.

Basically I’m reading data from an XML file and need to figure out how
many times identical data occurs in certain attributes. So far I’ve got
the data into two identical arrays, with the intention of nesting
iterators: checking whether each element of the first was equal to an
element of the second and incrementing a counter every time a match was
found. That obviously didn’t work out the way I initially thought.

This seems to be the gist of what I want, but it’s obviously printing a
count on every iteration whereas I only want the final tally.

xml_events.each{|x|
  puts "#{x} occurs #{xml_events.count(x)} times"
}

Any ideas?

You didn’t mention what a particular xml_event object looks like, but
you’ll probably want something like this:

xml_events.group_by(&:name).each do |name, events|
  puts "there were #{events.size} events of type #{name}"
end
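
For anyone who wants to try this without the XML file, here is a minimal
sketch. The Event struct and the sample names are made up for
illustration; the real xml_event objects may expose the attribute
differently.

```ruby
# Hypothetical stand-in for a parsed XML event; only `name` matters here.
Event = Struct.new(:name)

xml_events = %w[login logout login error login].map { |n| Event.new(n) }

# group_by builds a Hash of name => [events with that name];
# mapping each group to its size gives one count per distinct name.
counts = xml_events.group_by(&:name).map { |name, events| [name, events.size] }

counts.each do |name, size|
  puts "there were #{size} events of type #{name}"
end
```

Each distinct name is printed once, in the order it first appears in the
array.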

~ jf

John F.
Principal Consultant, BitsBuilder
LI: http://www.linkedin.com/in/johnxf
SO: User John Feminella - Stack Overflow

On 15.05.2011 12:01, Thomas Greenwood wrote:

> count on every iteration whereas I only want the final tally.
>
> xml_events.each{|x|
>   puts "#{x} occurs #{xml_events.count(x)} times"
> }
>
> Any ideas?

Two possible approaches:

irb(main):002:0> a = Array.new(10) { rand(4) }
=> [3, 2, 2, 1, 3, 3, 2, 3, 3, 3]

irb(main):003:0> a.inject(Hash.new(0)) {|sums,x| sums[x] += 1; sums}
=> {3=>6, 2=>3, 1=>1}

irb(main):004:0> a.group_by {|x| x}
=> {3=>[3, 3, 3, 3, 3, 3], 2=>[2, 2, 2], 1=>[1]}
irb(main):005:0> a.group_by {|x| x}.map {|k,v| [k, v.size]}
=> [[3, 6], [2, 3], [1, 1]]

Instead of #inject you can of course also use a more traditional
approach:

irb(main):012:0> counts = Hash.new 0
=> {}
irb(main):013:0> a.each {|x| counts[x] += 1}
=> [3, 2, 2, 1, 3, 3, 2, 3, 3, 3]
irb(main):014:0> counts
=> {3=>6, 2=>3, 1=>1}
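
The counting loop can also be wrapped in a small method so the tally is
printed only once, at the end. This is just a sketch: count_occurrences
is a name made up here, and each_with_object is swapped in for inject so
the block does not have to return the hash. On Ruby 2.7 or later,
Enumerable#tally gives the same hash directly.

```ruby
# Hypothetical helper: returns a Hash of element => number of occurrences.
def count_occurrences(items)
  items.each_with_object(Hash.new(0)) { |x, sums| sums[x] += 1 }
end

a = [3, 2, 2, 1, 3, 3, 2, 3, 3, 3]
counts = count_occurrences(a)

# One line per distinct value, printed after counting has finished.
counts.each { |value, n| puts "#{value} occurs #{n} times" }

# Ruby 2.7+ ships the same operation as Enumerable#tally.
puts a.tally == counts if a.respond_to?(:tally)
```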

Kind regards

robert

I’m sure your solutions are better than mine, but this is what I ended
up doing:

xml_events = Array.new
temp_array = Array.new

[…]
#extract xml data and assign it to the events array.
[…]

xml_events.each{|x|
  if temp_array.include?(x) == false
    temp_array << x
    puts "#{x} occurs #{xml_events.count(x)} times"
  end
}

A kludge but it does the job.

Thanks for your help.

On 15.05.2011 15:06, Thomas Greenwood wrote:

> if temp_array.include?(x) == false

This is dangerous: in a condition Ruby treats both false and nil as
false, so a comparison against false silently misses a method that
returns nil. It’s better not to compare with boolean constants but
rather to use boolean operators and logic. In your case you could use
either of

if !temp_array.include?(x)
unless temp_array.include?(x)
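
A short illustration of why the comparison is fragile. find_index_of is
a made-up helper that, like many Ruby methods, returns nil rather than
false when nothing is found; it is not part of the original code.

```ruby
# Hypothetical helper: returns the index of item, or nil if it is absent.
def find_index_of(list, item)
  list.index(item)
end

seen = ["a", "b"]
result = find_index_of(seen, "z")   # nil, because "z" is not in the list

compared = (result == false)        # nil is not equal to false
negated  = !result                  # nil is falsy, so negating it gives true

puts "result == false gives #{compared}"
puts "!result gives #{negated}"
```

Array#include? itself always returns true or false, so the original line
happens to work; the habit only bites once a method returns nil.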

> temp_array << x
> puts "#{x} occurs #{xml_events.count(x)} times"
> end
> }
>
> A kludge but it does the job.

Your code takes O(n*n) time, if I am not mistaken, while the approach
that stores the counters in a Hash takes only O(n). That might not
really make a difference in your case, but the fact that you are
scanning xml_events over and over again (the same goes for temp_array,
btw.) shows that it is "ugly" in a way.
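
To put the two side by side, here is a small sketch that runs both
approaches on the same data. The sample array and the variable names
slow and fast are assumptions for illustration only.

```ruby
events = %w[a b a c b a]

# Quadratic: include? and count each scan an array inside the loop.
seen = []
slow = []
events.each do |x|
  unless seen.include?(x)
    seen << x
    slow << [x, events.count(x)]
  end
end

# Linear: a single pass with one Hash update per element.
fast = Hash.new(0)
events.each { |x| fast[x] += 1 }

puts slow.inspect
puts fast.to_a.inspect
```

Both produce the same pairs; only the amount of scanning differs.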

> Thanks for your help.

You’re welcome.

Kind regards

robert

On Sun, May 15, 2011 at 9:01 PM, 7stud wrote:

>> A kludge but it does the job.
>
> After asking for advice on a computer programming forum, the chosen
> solution should never be a kludge. Rather, the solution should be
> elegant and inspiring, and you should learn something.

If it isn’t touted as the Perfect Solution to the problem, then it’s
much better that we have input from people instead of having no input
because of a very high bar. At least people submitting their ideas means
we get to see how people might approach the problem, even if the
approach is potentially mistaken. With almost-correct code written down,
there’s a record that others can learn from.

Agreed, I’m not a huge fan of the solution. John and Robert’s are much
more straightforward, reusable, and elegant.

You shouldn’t have to use a variable called temp_array … almost ever.

On Sunday, 15 May 2011 at 4:01 pm, 7stud wrote:

> Thomas Greenwood wrote in post #998795:
>
>> A kludge but it does the job.
>
> After asking for advice on a computer programming forum, the chosen
> solution should never be a kludge. Rather, the solution should be
> elegant and inspiring, and you should learn something.