Better way to accumulate totals

I am just learning Ruby and am very impressed. In one week I've built
three routines that will save my group 6 to 7 hours each week, and they
run in 2 minutes. Pretty cool. I would appreciate a suggestion on how to
do something. I have a script that correctly parses a directory full of
text files and extracts key data. The data lines look like this:

" category value"

I have 15 categories and the code below works, but it is crude. I create 15
global arrays and match each line's "category" against the text string in
the case statement. Would a hash or an object work better? I am just looking
for pointers on which way to direct my research. Thanks!

Below is part of a big loop that goes through each found line for each
document, and is in its own method.

case top_event[0].lstrip.rstrip
when "Redo size:"
  $redo_cnt[0] += 1
  $redo_cnt[1] += top_event[1].to_f
when "Logical reads:"
  $log_read_cnt[0] += 1
  $log_read_cnt[1] += top_event[1].to_f
# ... one "when" branch for each of the remaining categories ...
end

On Oct 30, 2007, at 2:41 PM, Robert K. wrote:

when "Logical reads:"

        $log_read_cnt[0] += 1
        $log_read_cnt[1] += top_event[1].to_f

count = Hash.new{|h,k| h.update k => Hash.new{|h,k| h.update k => 0}}

case top_event[0].strip
when /redo size:/i
  count[:redo][0] += 1
  count[:redo][1] += Float(top_event[1])
when /logical reads:/i
  count[:read][0] += 1
  count[:read][1] += Float(top_event[1])
end

require 'yaml'

y count

is one approach.

cheers.

a @ http://codeforpeople.com/

ara.t.howard wrote:

count = Hash.new{|h,k| h.update k => Hash.new{|h,k| h.update k => 0}}

This works…

count = Hash.new{|h,k| h[k] = Hash.new{|h1,k1| h1[k1] = 0}}
=> {}
count[:foo][1] += 1
=> 1
count[:foo][1] += 1
=> 2
count[:foo][1] += 1
=> 3

On Oct 30, 2007, at 3:40 PM, Joel VanderWerf wrote:

This works…

count = Hash.new{|h,k| h[k] = Hash.new{|h1,k1| h1[k1] = 0}}
=> {}
count[:foo][1] += 1
=> 1
count[:foo][1] += 1
=> 2
count[:foo][1] += 1
=> 3

oh yeah, of course :wink: i'm in the habit of using 'update' from
'inject' where it saves you having to return the hash itself…
shorter is better always.

cheers.

a @ http://codeforpeople.com/
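
For anyone who has not seen the 'update' inside 'inject' habit mentioned above, here is a minimal sketch. The sample array and the variable names are made up for illustration. Hash#update (an alias of merge!) returns the hash it was called on, which is what inject needs as the next accumulator value. Plain []= returns the assigned value, so the hash has to be returned explicitly.

```ruby
events = ["redo", "read", "redo"]  # hypothetical sample data

# update returns the hash, so the block's value is already the accumulator
with_update = events.inject({}) do |h, name|
  h.update(name => h.fetch(name, 0) + 1)
end

# []= returns the assigned number, so h must be returned on its own line
with_assign = events.inject({}) do |h, name|
  h[name] = h.fetch(name, 0) + 1
  h
end

p with_update
p with_assign
```

Both forms should build the same hash of counts. The update form is just one line shorter.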

ara.t.howard wrote:

=> 2
count[:foo][1] += 1
=> 3

oh yeah, of course :wink: i'm in the habit of using 'update' from
'inject' where it saves you having to return the hash itself… shorter
is better always.

…but only if returning the hash is the right thing to do. In this
case, we want the “leaf” value, not the hash:

irb(main):034:0> count = Hash.new{|h,k| h.update k => Hash.new{|hh,kk|
hh.update kk => 0}}
=> {}
irb(main):035:0> count[:foo][1] += 1
NoMethodError: undefined method `+' for {1=>{}, :foo=>{}}:Hash
from (irb):35
irb(main):036:0> count
=> {1=>{}, :foo=>{}}
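
The reason is that a Hash default block's return value is what the lookup hands back. A minimal one-level sketch of the same effect follows (the names broken, working and caught are only for illustration). With update the lookup returns the whole hash, so += tries to add 1 to a Hash. With []= it returns the stored 0.

```ruby
# Default block that returns the hash itself (what update gives back)
broken = Hash.new { |hash, key| hash.update(key => 0) }

caught = nil
begin
  broken[:foo] += 1          # the lookup returns the hash, so this is hash + 1
rescue NoMethodError => e
  caught = e
end
puts "broken raised #{caught.class}"

# Default block that returns the leaf value (what []= gives back)
working = Hash.new { |hash, key| hash[key] = 0 }
working[:foo] += 1
working[:foo] += 1
p working
```

Note that the failed attempt still stores the key in broken as a side effect, which is why stray keys show up in the irb session above.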

Robert K. wrote:

The data lines look like this:

" category value"

case top_event[0].lstrip.rstrip
when "Redo size:"

        $redo_cnt[0] += 1
        $redo_cnt[1] += top_event[1].to_f



when "Logical reads:"

        $log_read_cnt[0] += 1
        $log_read_cnt[1] += top_event[1].to_f

This might be easier to understand:

#Create a hash, so that when you use a
#key that doesn’t exist, it creates the key,
#assigns it the array [0,0], and returns the
#array:
categories = Hash.new() do |hash, key|
  hash[key] = [0,0]
end

#Loop over each line in a file:
File.foreach("data.txt") do |line|
  arr = line.split(":")
  cat = arr[0].strip
  val = Float(arr[1])  #causes an error if can't convert

  categories[cat][0] += 1
  categories[cat][1] += val
end

p categories
puts categories["Make pie"][0]
puts categories["Redo size"][1]

Using this data:

Redo size: 1.1
Logical reads: 2.1
Redo size: 1.1
Hello world: 3.1
Make pie: 4.1
Hello world: 3.1
Make pie: 4.1
Make pie: 4.1

this is the output:

{"Make pie"=>[3, 12.3], "Hello world"=>[2, 6.2], "Redo size"=>[2, 2.2],
"Logical reads"=>[1, 2.1]}
3
2.2
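
In case it is not obvious why the block form of Hash.new is used here rather than Hash.new([0,0]), this small sketch shows the difference (the variable names are only for illustration). Passing an object as the default gives every missing key the same array and never stores the key. The block builds and stores a fresh array per key.

```ruby
# Default-object form: one array shared by all missing keys, nothing is stored
shared = Hash.new([0, 0])
shared["Redo size"][0] += 1
shared["Make pie"][0] += 1
p shared               # the hash itself is still empty
p shared["Anything"]   # both increments landed in the one shared array

# Block form: a new [0, 0] is created and stored for each new key
per_key = Hash.new { |hash, key| hash[key] = [0, 0] }
per_key["Redo size"][0] += 1
per_key["Make pie"][0] += 1
p per_key
```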

2007/10/30, Robert K. wrote:

I am just learning Ruby and am very impressed. In one week I've built three routines that will save my group 6 to 7 hours each week, and they run in 2 minutes. Pretty cool. I would appreciate a suggestion on how to do something. I have a script that correctly parses a directory full of text files and extracts key data. The data lines look like this:

" category value"

I have 15 categories and the code below works, but it is crude. I create 15 global arrays and match each line's "category" against the text string in the case statement. Would a hash or an object work better? I am just looking for pointers on which way to direct my research. Thanks!

Here is a different approach. I start by creating a statistics object:

Statistics = Struct.new :count, :sum

stats = Hash.new {|h,k| h[k] = Statistics.new(0, 0)}

File.foreach "foo.dat" do |line|
  if /^\s+\b(.+?)\b\s+(\d+\.\d+)/ =~ line
    s = stats[$1]
    s.count += 1
    s.sum += Float($2)
  end
end

Note, you might have to tweak the regexp.
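
As one possible tweak, here is a sketch that assumes the category ends with a colon, as in the "Redo size:" and "Logical reads:" strings from the original case statement. The sample lines, the pattern and the variable names are all assumptions for illustration, and the real files may need a different pattern. It reads from an array instead of a file so it can be run as is.

```ruby
# Local variable instead of a constant so the snippet can be pasted anywhere
stat_class = Struct.new(:count, :sum)
stats = Hash.new { |h, k| h[k] = stat_class.new(0, 0.0) }

# Hypothetical sample lines in an assumed " category: value" layout
sample = [
  " Redo size:      1.1",
  " Logical reads:  2.1",
  " Redo size:      1.1",
  "this line has no value"
]

# Category = everything up to the colon, value = integer or decimal number
pattern = /\A\s*(.+?):\s+(\d+(?:\.\d+)?)\s*\z/

sample.each do |line|
  m = pattern.match(line)
  next unless m               # skip lines that are not data lines
  s = stats[m[1]]
  s.count += 1
  s.sum += Float(m[2])
end

stats.each { |name, s| puts "#{name}: count=#{s.count} sum=#{s.sum}" }
```

Lines that do not match are skipped silently here. Whether that is right, or whether they should be reported, depends on the data.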

Kind regards

robert