How to calculate variance on the elements of an Array

Implement this algorithm: http://mathworld.wolfram.com/Variance.html

- Iterate over each element and calculate the mean of the elements of your Array.
- Iterate again over each element, take the difference between each element and the mean, square that value, and add it to a running total.
- Divide that running total by the number of elements (or by one less than that, if you want the sample variance); the result is your variance.

ary = (1..100).to_a
mean = ary.inject(0.0) { |s, x| s + x } / Float(ary.length)
# Sum of squared deviations, divided by the element count (population variance)
variance = ary.inject(0.0) { |s, x| s + (x - mean)**2 } / ary.length
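The two passes above can also be wrapped in a small method. This is only a minimal sketch: the method name `two_pass_variance` and the `sample:` keyword are hypothetical names chosen for illustration, not part of any library. It divides by N for the population variance, or by N - 1 for the sample variance.

```ruby
# Two-pass variance: the first pass computes the mean,
# the second pass sums the squared deviations from it.
# `two_pass_variance` and `sample:` are illustrative names, not a library API.
def two_pass_variance(ary, sample: false)
  n = ary.length
  mean = ary.inject(0.0) { |s, x| s + x } / n
  sum_sq = ary.inject(0.0) { |s, x| s + (x - mean)**2 }
  sample ? sum_sq / (n - 1) : sum_sq / n
end

ary = (1..100).to_a
puts two_pass_variance(ary)                # population variance
puts two_pass_variance(ary, sample: true)  # sample variance
```

Which denominator is right depends on whether the Array holds the whole population or a sample drawn from it.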

Another option is to install the NArray library and use that:

nary = NArray.new(NArray::FLOAT, 100)
nary[0...100] = (1..100).to_a
nary.variance

Shameless plug, and probably overkill if you just want the variance: RSRuby allows you to use any R function, including the built-in variance function ('var'). It's available as a gem.

irb(main):002:0> require 'rsruby'
=> true
irb(main):004:0> RSRuby.instance.var([1,2,3])
=> 1.0
irb(main):006:0> RSRuby.instance.var(RSRuby.instance.rnorm(10))
=> 0.812117410016217
irb(main):008:0> RSRuby.instance.var(RSRuby.instance.rnorm(100))
=> 0.960224242747171
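Note that R's `var` computes the sample variance (it divides by n - 1), which is why `[1,2,3]` gives 1.0 rather than 0.666.... A plain-Ruby cross-check might look like the sketch below; `sample_variance` is a hypothetical helper name, not something RSRuby provides.

```ruby
# Sample variance with an n - 1 denominator, matching what R's var computes.
# `sample_variance` is an illustrative name, not part of RSRuby.
def sample_variance(values)
  n = values.length
  mean = values.inject(0.0) { |s, x| s + x } / n
  values.inject(0.0) { |s, x| s + (x - mean)**2 } / (n - 1)
end

puts sample_variance([1, 2, 3])  # prints 1.0
```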

mean = ary.inject(0.0) { |s, x| s + x } / Float(ary.length)
variance = ary.inject(0.0) { |s, x| s + (x - mean)**2 } / ary.length

If you are going to need this in more places, you can also write it in a more generic way:

module Variance
  def sum(&blk)
    map(&blk).inject { |sum, element| sum + element }
  end

  def mean
    (sum.to_f / size.to_f)
  end

  def variance
    m = mean
    sum { |i| ( i - m )**2 } / size
  end

  def std_dev
    Math.sqrt(variance)
  end
end

Array.send :include, Variance

puts [1, 2].sum      # 3
puts [1, 2].mean     # 1.5
puts [1, 2].variance # 0.25
puts [1, 2].std_dev  # 0.5

Longer, but nicer (maybe).

def variance
  m = mean
  sum { |i| ( i - m )**2 } / size
end

If you wanted it to be a little more efficient you can do it like so:

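def variance(arr)
  n = 0       # number of elements seen so far
  mean = 0.0  # running mean
  s = 0.0     # running sum of squared deviations from the mean
  arr.each do |x|
    n += 1
    delta = x - mean
    mean += delta / n
    s += delta * (x - mean)
  end
  s / (n - 1)  # sample variance (n - 1 denominator)
end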

It calculates the variance in one pass by updating the mean and the sum of squared deviations iteratively [1].
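To check that a one-pass update of this kind agrees with the straightforward two-pass calculation, something like the following sketch can be used. The method names `one_pass_variance` and `two_pass_sample_variance` are hypothetical, and both are assumed to use the n - 1 (sample) denominator so that the results are comparable.

```ruby
# One pass: update the running mean and the running sum of squared
# deviations as each element arrives (the update described in [1]).
def one_pass_variance(values)
  n = 0
  mean = 0.0
  s = 0.0
  values.each do |x|
    n += 1
    delta = x - mean
    mean += delta / n
    s += delta * (x - mean)
  end
  s / (n - 1)  # sample variance; use s / n for the population variance
end

# Two passes: compute the mean first, then sum the squared deviations.
def two_pass_sample_variance(values)
  n = values.length
  mean = values.inject(0.0) { |acc, x| acc + x } / n
  values.inject(0.0) { |acc, x| acc + (x - mean)**2 } / (n - 1)
end

data = (1..100).to_a
a = one_pass_variance(data)
b = two_pass_sample_variance(data)
puts a
puts b
puts((a - b).abs < 1e-9)  # true when both approaches agree
```

The one-pass form only needs to see each element once, so it also works for data that is streamed rather than held in an Array.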

-rr-

[1] http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance