Mean method

I do a lot of math in my projects, so I thought I would port a few
methods I wrote in other languages over to Ruby.

Here’s a mean method which allows you to return either:

a) arithmetic mean
b) geometric mean
c) harmonic mean

http://pastie.org/553287

Enjoy.

Hi –

On Tue, 21 Jul 2009, Älphä Blüë wrote:

http://pastie.org/553287

Enjoy.

Thanks for sharing the code. I’ve got a few suggestions for
refactoring it, which I hope you won’t mind my sharing with you.

First of all, if you want to call #sum on an array, you’ll need a #sum
method :-) I’ve thrown in #product too.

class Array
  def sum
    inject {|a,b| a + b }
  end

  def product
    inject {|a,b| a * b }
  end
end

And here’s a somewhat more concise version of your mean method. Note
that I don’t check the classes of the arguments. I’d rather try to
make them work, and if they don’t then an exception will be raised at
some point anyway. I do, however, add a check to make sure the mean
type is a known one.

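def mean(array, i_type = 1)
  case i_type
  # arithmetic: to_f avoids integer division on integer input
  when 1 then array.sum.to_f / array.size
  # geometric
  when 2 then array.product ** (1.0 / array.size)
  # harmonic: start at 0 so the first element is inverted too
  when 3 then array.size / array.inject(0) {|a,b| a + 1.0 / b }
  else
    raise ArgumentError, "Unknown mean type '#{i_type}'"
  end
end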

It might be user-friendly to wrap the various types in methods of
their own:

def arithmetic_mean(array)
  mean(array, 1)
end

etc.

David

On Jul 21, 2009, at 7:41 AM, Älphä Blüë wrote:

http://pastie.org/553287

The pastie is missing your extension of Array to add the #sum method,
so the example doesn’t work. Also, unless #sum returns a float then it
is possible for your arithmetic mean to return a result truncated to
the nearest integer due to the way Ruby’s division works with integers.
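
To illustrate the truncation point, here is a minimal sketch. The helper name arithmetic_mean and the sample data are made up for this example and are not taken from the pastie; the sketch computes the total with inject so it does not depend on any particular #sum.

```ruby
# Hypothetical helper (not from the pastie): converts the total to a
# Float before dividing, so integer input is not truncated.
def arithmetic_mean(values)
  total = values.inject(0) { |acc, x| acc + x }
  total.to_f / values.size
end

ints = [1, 2]

# Integer / Integer drops the fractional part.
truncated = ints.inject(0) { |acc, x| acc + x } / ints.size
exact     = arithmetic_mean(ints)

puts truncated  # 1
puts exact      # 1.5
```

Calling to_f on either operand is enough; the other one is converted automatically.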

Lastly, I have a style suggestion. In your parameter sanity checks, I
think this reads easier:

raise "mean(#{an_array}) error: #{an_array} is not an array." unless an_array.is_a? Array

raise "mean(#{an_array},#{i_type}) error: #{i_type} mean type is not an integer." unless i_type.is_a? Integer

raise "mean(#{an_array},#{i_type}) error: invalid mean type (#{i_type}) set. Only 1..3 valid." unless (1..3).include? i_type

Thanks for sharing your code.

cr

thanks everyone - looking over the suggestions and rewriting it to be
more polished.

Älphä Blüë wrote:

http://pastie.org/553287

Are you posting here because you want comments/suggestions? If so,
here’s my 2c.

(1) Passing enumerated values 1,2,3 is not very Ruby-like. If you want
to go this way, then I’d suggest symbols: :arithmetic, :geometric,
:harmonic

(2) However, IMO it’s also a code smell if you write something like

case A
when X
… do one thing
when Y
… do another thing
… etc
end

There’s usually a better way to ask directly for what you want.

Suggestions:

(a) add some methods directly to class Array, in the same way as ‘sum’

class Array
  def arithmetic_mean
    self.sum / self.size
  end

  def geometric_mean
    … etc
  end
end

puts [1,2,3,4,5].arithmetic_mean

This is perfectly fine within your own apps, but modifying core classes
may not be acceptable in a shared library (depending on who your target
audience is)

(b) wrap an array with an object which does the calculation

class Mean
  def initialize(arr)
    @arr = arr
  end

  def arithmetic
    @arr.sum / @arr.size
  end

  … etc
end

puts Mean.new([1,2,3,4,5]).arithmetic

Note that creating an object like this is a cheap operation in Ruby.
You’re not copying the array elements, just a reference to the array
itself.

(3) It is considered extremely bad form to write

if !an_array.is_a? Array: raise “mean(#{an_array}) error: #{an_array}
is not an array.” end

You should simply leave this out. This lets the user pass in any object
which responds to the particular methods you call on it (each, sum,
size). This is the “duck typing” principle - if someone wants to create
an object which behaves like an Array, even though it is not actually an
Array, why should you forbid them from using it?

If they pass in an incompatible object, Ruby will generate an exception
for you anyway. Admittedly it might not be quite as obvious what’s
happening, but it will typically be something along the lines of

NoMethodError: object does not respond to ‘sum’

which tells you exactly what the problem is.

(4) You can rewrite the following using inject:

  an_array.each do |a|
    n_sum += (1.0 / a)
  end

(5) If I were you, I would rewrite this not to use sum and size, but
just to iterate over the collection using ‘each’ (or something which in
turn calls ‘each’, like ‘inject’), calculating the total and count as
you go. This would allow your code to work on any object which is
Enumerable, such as

(1..5).arithmetic_mean
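
Points (4) and (5) could be sketched roughly as follows. This is only one possible reading of the suggestion, not the code from the pastie: it reopens Enumerable (which, as noted under (a), is best kept to your own apps) and tracks the count and the running total in a single inject pass.

```ruby
# Sketch only: adds two means to every Enumerable by iterating once.
module Enumerable
  def arithmetic_mean
    # The accumulator is a [count, total] pair; 0.0 keeps the total a Float.
    count, total = inject([0, 0.0]) { |(c, t), e| [c + 1, t + e] }
    total / count
  end

  def harmonic_mean
    # Same idea, but the running total is the sum of reciprocals.
    count, reciprocals = inject([0, 0.0]) { |(c, t), e| [c + 1, t + 1.0 / e] }
    count / reciprocals
  end
end

puts (1..5).arithmetic_mean       # 3.0
puts [1, 2, 3, 4].arithmetic_mean # 2.5
puts [1, 2, 4].harmonic_mean
```

Because only inject is used, the same methods work on Arrays, Ranges and anything else that defines each and includes Enumerable. An empty collection yields NaN here rather than an exception.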

Regards,

Brian.

Okay round 2,…

Let me know what you think.

http://pastie.org/553287

Sorry, Brian check again.

I hadn’t pasted the right code in. The count_of_num was changed to
arr_size

Älphä Blüë wrote:

Let me know what you think.

http://pastie.org/553287

Mean.new([1,2,3,4,5]).arithmetic
NameError: undefined local variable or method `count_of_num' for #<Mean:0xb7bc6904 @arr=[1, 2, 3, 4, 5]>
  from (irb):25:in `arithmetic'
  from (irb):36
  from :0

Here’s an approach which avoids to_a and size entirely, and iterates
over the collection just once:

class Mean
  def initialize(arr)
    @arr = arr
  end

  def arithmetic
    count, sum = @arr.inject([0, 0.0]) { |(c,s),e| [c+1, s+e] }
    sum/count
  end

  def geometric
    count, product = @arr.inject([0, 1.0]) { |(c,s),e| [c+1, s*e] }
    product ** (1.0/count)
  end
end

However you can argue that it’s both excessively terse and not as
efficient, and I wouldn’t argue. If the main application of this is to
handle Arrays, then going for the full generality of Enumerable isn’t
necessary.

It’s a learning exercise for us both :-)

Cheers,

Brian.

I adjusted the code a bit to handle the size factor with @size in
initialize (code changed if you refresh pastie).

My main use is really for arrays but I like making things modular and
useful. I’m almost finished with the standard deviation class which
includes mean…

I perform a lot of advanced mathematics and statistical analysis so I
need some heavier methods here and there…

As I’m still new to ruby this is a great lesson for me as well,
especially with regards to code polishing and adapting to rubyism code
writing.

Älphä Blüë wrote:

From my understanding of inject, using above…

a = first instance of array
b = subsequent instances of array

No, not exactly. You are using the new 1.9 usage (possibly also 1.8.7):

foo.inject { |a,b| … }

a is the first element of the array in the first iteration only. For
subsequent iterations, a is the value of the previous block evaluation.

It may be clearer if you stick to the old usage:

foo.inject(init) { |a,b| … }

In the first invocation, a is init and b is the first element of the
array. For the next invocation, a is the previous block value and b is
the next element of the array. And so on.
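
A small trace may make the difference between the two forms concrete. The variable names and the initial value 10 are arbitrary choices for this sketch; each block call records the pair of arguments it received.

```ruby
values = [1, 2, 3]

# With an initial value: the block runs once per element,
# and the first accumulator is the initial value.
steps_with_init = []
total = values.inject(10) do |acc, elem|
  steps_with_init << [acc, elem]
  acc + elem
end
p steps_with_init  # [[10, 1], [11, 2], [13, 3]]
p total            # 16

# Without an initial value: the first element becomes the first
# accumulator, so the block runs one time fewer.
steps_without_init = []
sum = values.inject do |acc, elem|
  steps_without_init << [acc, elem]
  acc + elem
end
p steps_without_init  # [[1, 2], [3, 3]]
p sum                 # 6
```

Note that in the second form the block is never applied to the first element on its own, which matters whenever the block transforms each element (squaring it, inverting it) before adding.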

Looking at your initial code:

@arr.each do |a|
  n += ((a - mean)**2)
  std = Math.sqrt(n / @size)
end

This is strange. You’re calculating a value in every iteration and
assigning it to std, but then throwing it away apart from the last one.
Shouldn’t this go outside the loop?

n = 0
@arr.each do |a|
  n += ((a - mean)**2)
end
std = Math.sqrt(n / @size)

If that’s correct, then the solution becomes obvious:

n = @arr.inject(0) { |accum,elem| accum + (elem-mean)**2 }
std = Math.sqrt(n / @size)

On Jul 21, 2009, at 10:30 AM, Älphä Blüë wrote:

As I’m still new to ruby this is a great lesson for me as well,
especially with regards to code polishing and adapting to rubyism code
writing.

I’ve written some code to compute skew and kurtosis (the 3rd and 4th
moments) if you want a copy. I’ll send you my whole class if you’d
like. It’s a bit hard to follow because I heavily optimized it to
cache results and/or pass around intermediate results.

cr

Okay on Standard Deviation…

One segment I’m trying to change is:

@arr.each do |a|
  n += ((a - mean)**2)
  std = Math.sqrt(n / @size)
end

into something like this…?

Math.sqrt( (@arr.inject {|a,b| ((a - @mean)**2) + ((b - @mean)**2) }) /
@size)

… but I don’t think I have the inject working properly…

From my understanding of inject, using above…

a = first instance of array
b = subsequent instances of array

so if @arr = [1,2,3,4,5]

@arr.inject {|a,b| ((a - @mean)**2) + ((b - @mean)**2)}

1-@mean**2 …reference for a
+
2-@mean**2 …reference for b
+
3-@mean**2 …reference for b
+
4-@mean**2 …reference for b
+
5-@mean**2 …reference for b

array finished…

Correct?

Hi –

On Wed, 22 Jul 2009, Brian C. wrote:

a is the first element of the array in the first iteration only. For
subsequent iterations, a is the value of the previous block evaluation.

It may be clearer if you stick to the old usage:

foo.inject(init) { |a,b| … }

You can do the no-arg version in 1.8.6 too:

$ ruby186 -e 'p [1,2,3].inject {|a,b| a + b }'
6

David

At 2009-07-21 08:41AM, “Älphä Blüë” wrote:

http://pastie.org/553287
You might enjoy perusing code examples at
Category:Mathematical operations - Rosetta Code
to compare with what you currently have.

As a test you would do:

a = [1,2,3,4,5]

Stddev.new(a).newcalc

You can also do Stddev.new(1..5).newcalc

You can rename the newcalc to anything you want…

This is the new Standard Deviation class I created. It’s very
simplified and works really well.

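class Stddev
  def initialize(arr)
    @arr = arr
    @size = @arr.to_a.size
  end

  def sum
    @arr.inject(0) {|a,b| a + b }
  end

  # Sum of squares; starting at 0 means the first element is squared too
  def sumofx
    @arr.inject(0) {|a,b| a + b**2 }
  end

  def newcalc
    # to_f avoids integer division when the data are integers
    Math.sqrt((sumofx - ((sum * sum).to_f / @size)) / (@size - 1))
  end
end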

This is written a bit better and more readable:

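class Stddev
  def initialize(arr)
    @arr = arr
    @size = @arr.to_a.size
  end

  def sumofx
    @arr.inject(0) {|a,b| a + b }
  end

  # Starting at 0 means the first element is squared too
  def sumofxsquared
    @arr.inject(0) {|a,b| a + b**2 }
  end

  def calculate
    # to_f avoids integer division when the data are integers
    Math.sqrt((sumofxsquared - ((sumofx * sumofx).to_f / @size)) / (@size - 1))
  end
end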

a = [1,2,3,4,5]

Stddev.new(a).calculate
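
As a cross-check of the formula used in calculate, here is a sketch that computes the sample standard deviation (dividing by n - 1) two ways: with the sum-of-squares shortcut and directly from the squared deviations. Both method names are invented for this comparison and are not part of the class above.

```ruby
# Shortcut form: sqrt((sum(x^2) - sum(x)^2 / n) / (n - 1))
def sample_stddev_shortcut(values)
  n      = values.size
  sum_x  = values.inject(0.0) { |acc, x| acc + x }
  sum_x2 = values.inject(0.0) { |acc, x| acc + x**2 }
  Math.sqrt((sum_x2 - (sum_x * sum_x) / n) / (n - 1))
end

# Two-pass form: compute the mean first, then sum squared deviations.
def sample_stddev_two_pass(values)
  n    = values.size
  mean = values.inject(0.0) { |acc, x| acc + x } / n
  squared_deviations = values.inject(0.0) { |acc, x| acc + (x - mean)**2 }
  Math.sqrt(squared_deviations / (n - 1))
end

a = [1, 2, 3, 4, 5]
puts sample_stddev_shortcut(a)
puts sample_stddev_two_pass(a)
```

For this input both print the square root of 2.5, roughly 1.58. The shortcut subtracts two potentially large numbers, so the two-pass form tends to be the safer choice when the values are large relative to their spread.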

Älphä Blüë wrote:

I adjusted the code a bit to handle the size factor with @size in
initialize (code changed if you refresh pastie).

Unfortunately this produces wrong results when the array changes in
between. Consider this:

array = [1,2,3,4]
mean = Mean.new(array)
puts mean.arithmetic # => 2.5
array << 5
puts mean.arithmetic # => 3.75
puts Mean.new(array).arithmetic # => 3.0

As the size is only determined once in Mean#initialize, subsequent
changes to the array (and therefore its size) go unnoticed. This
behaviour should at least be documented.
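
One way around this, sketched here under the assumption that the class only needs to iterate over the collection, is to count the elements on every call instead of caching the size. This is an illustration, not the code from the pastie.

```ruby
class Mean
  def initialize(arr)
    @arr = arr  # keeps a reference; no size is cached
  end

  def arithmetic
    # Count and total are recomputed on each call, so later changes
    # to the underlying array are picked up.
    count, sum = @arr.inject([0, 0.0]) { |(c, s), e| [c + 1, s + e] }
    sum / count
  end
end

array = [1, 2, 3, 4]
mean = Mean.new(array)
before = mean.arithmetic
array << 5
after = mean.arithmetic

puts before  # 2.5
puts after   # 3.0
```

If the opposite behaviour is wanted (a snapshot that ignores later changes), copying the data in initialize, for example with @arr = arr.to_a.dup, would keep the cached size and the elements consistent with each other.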

-Matthias