Curve fitting to data

People,

Does anyone know of a Ruby app that will fit a curve to data eg fitting
a curve to:

-10 0
-9 19
-8 36
-7 51
-6 64
-5 75
-4 84
-3 91
-2 96
-1 99
0 100
1 99
2 96
3 91
4 84
5 75
6 64
7 51
8 36
9 19
10 0

Should give the formula for a parabola.

Thanks,

Phil.


Philip R.

Pricom Pty Limited (ACN 003 252 275 ABN 91 003 252 275)
GPO Box 3411
Sydney NSW 2001
Australia
Fax: +61:(0)2-8221-9599
E-mail: [email protected]

On 16/12/2007, Phil R. [email protected] wrote:

Does anyone know of a Ruby app that will fit a curve to data [...]

This looks like a prime candidate for GNUPlot – which ruby has bindings
for.

– Thomas A.

Thomas,

On Sun, 2007-12-16 at 23:55 +0000, Thomas A. wrote:

This looks like a prime candidate for GNUPlot – which ruby has bindings for.

I think GNUPlot requires the knowledge of the type of fn you are trying
to fit - I want the software to TELL me what sort of fn it is eg for the
data:

-10 0
-9 19
-8 36
-7 51
-6 64
-5 75
-4 84
-3 91
-2 96
-1 99
0 100
1 99
2 96
3 91
4 84
5 75
6 64
7 51
8 36
9 19
10 0

http://www.zunzun.com tells me that the formula for this data is:

y = a( atan(x) ) + b( x2 ) + c( sinh(x) ) + offset

I would like to be able to do this myself with my own (preferably Ruby)
code.

Thanks,

Phil.

Whoa, that looks interesting… can’t say I know anything to help you,
but I will “bookmark” this thread to see if someone interesting emerges
here :)

Alex F. wrote:

But nothing I know of exactly in Ruby. You might take a look at the
source code for the website you linked to:
CVS Info for project pythonequations

Maybe a fun one for a Ruby Q. sometime?

alex

Well … let’s see:

  1. There’s no such thing as a “universal curve fitting algorithm”, in
    the sense that you give it a set of points and it spits out a formula.
    The main reason is that given any finite set of points, there are an
    infinite number of possible curves that can be drawn through them
    exactly. Some of these are fairly well behaved outside the range of the
    input points, and some of them exhibit bizarre behavior.

  2. There’s no such thing as a curve-fitting problem in a total vacuum,
    without any context whatsoever. In other words, a request to fit a curve
    to a set of points is meaningless without knowledge of how you will use
    that fit.

  3. Many of the “naive” techniques, like polynomial regression, behave
    very badly. By “behave very badly”, I mean “much worse than plotting the
    points in Excel and making a wild guess.”
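Point 1 can be illustrated with the data from this thread. The following is only a sketch, not something posted by anyone in the thread: it takes the parabola y = 100 - x^2 and adds a polynomial that is zero at every sampled x. The method names `parabola`, `wiggle` and `impostor` are made up for the illustration.

```ruby
# Sample x values from the thread: -10..10.
xs = (-10..10).to_a

# The "obvious" model for the data.
def parabola(x)
  100 - x**2
end

# A polynomial that is zero at every sample x and nonzero everywhere else.
def wiggle(x, xs)
  xs.reduce(1) { |product, xi| product * (x - xi) }
end

# A different curve through exactly the same 21 points. Any nonzero k gives
# another such curve, so there are infinitely many of them.
def impostor(x, xs, k = 1)
  parabola(x) + k * wiggle(x, xs)
end

xs.each do |x|
  raise "curves differ at x = #{x}" unless impostor(x, xs) == parabola(x)
end

# Just outside the sampled range the two curves are nowhere near each other.
puts "parabola(11) = #{parabola(11)}"
puts "impostor(11) = #{impostor(11, xs)}"
```

Both curves match all 21 points exactly, so the data alone cannot say which one is "the" formula.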

Given all that, I don’t see much hope of finding or writing a Ruby
program to do this. Conversely, if the original poster has a model that
the points should have come from, based on some theory, writing Ruby
code (or R or Python code) to fit the data to the model is easy. Just
about every reasonable fitting algorithm is already coded in R, so if
you just want an answer without all that pesky learning stuff, it’s
probably easier to do it in R.

Phil R. wrote:

I think GNUPlot requires the knowledge of the type of fn you are trying
to fit - I want the software to TELL me what sort of fn it is eg for the
data:

You can access the “R” statistical package via Ruby, which seems to have
curve fitting capabilities. But this would involve learning R, which
I’ve found to have a steeper curve than I cared to climb.

More generally, I think for most scientific purposes, it’s a good idea
to have an idea of the type of curve (power, quadratic, cubic etc.) that
might underlie the observed data. Most applications enforce this. If you
know the general form of the equation that’s being fitted (e.g. ax^2 + bx
+ c for a quadratic), it would be possible to get estimates for a, b and
c by using e.g. ordinary least squares to find a solution that minimises
the sum of squared differences between observed and predicted values. How
you get to that in acceptable time is an algorithmic question…
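To make "minimises the difference between observed and predicted values" concrete, here is a rough sketch of the quantity that ordinary least squares minimises for the quadratic model, evaluated for a few hand-picked candidates. The method name `sum_squared_residuals` and the candidate values are assumptions chosen for illustration only.

```ruby
# Data from the thread: x = -10..10, y = 100 - x^2.
points = (-10..10).map { |x| [x, 100 - x**2] }

# Sum of squared residuals for the quadratic model y = a*x^2 + b*x + c.
def sum_squared_residuals(points, a, b, c)
  points.sum { |x, y| (y - (a * x**2 + b * x + c))**2 }
end

# A few hypothetical candidate coefficient triples [a, b, c].
candidates = {
  "a=-1, b=0, c=100"  => [-1, 0, 100],
  "a=-1, b=1, c=100"  => [-1, 1, 100],
  "a=-0.9, b=0, c=95" => [-0.9, 0, 95],
}

candidates.each do |label, (a, b, c)|
  puts "#{label}: #{sum_squared_residuals(points, a, b, c)}"
end
```

A least-squares fit searches for the triple that makes this number as small as possible; for this data the first candidate brings it to zero.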

But nothing I know of exactly in Ruby. You might take a look at the
source code for the website you linked to:
CVS Info for project pythonequations

Maybe a fun one for a Ruby Q. sometime?

alex

On Dec 16, 7:07 pm, Phil R. [email protected] wrote:

[...]

Isn’t the result zunzun spit out “wrong”? The data is most simply
described as an inverted parabola:

y = a x^2 + c

This points exactly to the problem others have mentioned. Given a
large enough space of functions, you can fit practically anything, but
what does it mean? Generally speaking, no one has any business
fitting 21 data points with 4 parameters.
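For the two-parameter form y = a x^2 + c, a least-squares fit reduces to simple linear regression of y on u = x^2. A minimal sketch under that assumption follows; the method name `fit_a_x2_plus_c` is made up for illustration, and Rationals are used only to keep the arithmetic exact for this integer data.

```ruby
# Data from the thread: x = -10..10, y = 100 - x^2.
points = (-10..10).map { |x| [x, 100 - x**2] }

# Least-squares fit of y = a*x^2 + c: treat u = x^2 as the predictor and
# apply the closed-form simple linear regression formulas.
def fit_a_x2_plus_c(points)
  n  = points.size
  us = points.map { |x, _| Rational(x**2) }
  ys = points.map { |_, y| Rational(y) }
  u_mean = us.sum / n
  y_mean = ys.sum / n
  s_uy = us.zip(ys).sum { |u, y| (u - u_mean) * (y - y_mean) }
  s_uu = us.sum { |u| (u - u_mean)**2 }
  a = s_uy / s_uu          # slope with respect to u = x^2
  c = y_mean - a * u_mean  # intercept
  [a, c]
end

a, c = fit_a_x2_plus_c(points)
puts "a = #{a.to_f}, c = #{c.to_f}"
```

With two parameters the fit already reproduces every point, which is the sense in which the inverted parabola is the simplest description.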

JM

Alex,

On Mon, 2007-12-17 at 11:40 +0900, Alex F. wrote:

source code for the website you linked to:
CVS Info for project pythonequations

Maybe a fun one for a Ruby Q. sometime?

I installed grace and that does pretty much what you said, which is not
quite what I want but interesting . .

Yeh, I did have a look at the PythonEquations stuff but it looks too
tough for me to translate/make use of - I would certainly be happy if
someone wanted to make it a Ruby Q.!

Regards,

Phil.

-------- Original Message --------

Date: Mon, 17 Dec 2007 15:31:03 +0900
From: Phil R. [email protected]
To: [email protected]
Subject: Re: Curve fitting to data

[...]

Dear Phil,

it is of course preferable to have some idea about the relationship
underlying the data, such as

y = ax^2 + bx + c,

and then fit that model. Because the model is linear in the parameters
a, b, c, this can be done by solving the linear system

Matrix([x_0^2,x_0,1],…,[x_n^2,x_n,1])*([a,b,c]^transpose)=[y_0,…,y_n]^transpose

(numbering the data points as (x_0,y_0),…,(x_n,y_n), with […] indicating
rows of the matrix or row vectors). With more data points than
parameters, the system is overdetermined and is solved in the
least-squares sense. You can do that with any software that solves linear
or matrix equations, e.g., rsruby or rb-gsl.
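As a rough sketch of the linear system above, the following uses Ruby's bundled `matrix` library instead of rsruby or rb-gsl (that substitution is an assumption made for the example, not what the post recommends). With 21 equations and 3 unknowns it solves the normal equations (X^T X) p = X^T y, and Rationals keep the arithmetic exact.

```ruby
require 'matrix'

# Data from the thread: x = -10..10, y = 100 - x^2.
points = (-10..10).map { |x| [x, 100 - x**2] }

# Design matrix with one row [x^2, x, 1] per data point.
design = Matrix.rows(
  points.map { |x, _| [Rational(x**2), Rational(x), Rational(1)] }
)
targets = Vector.elements(points.map { |_, y| Rational(y) })

# Least-squares solution via the normal equations (X^T X) p = X^T y.
xt = design.transpose
a, b, c = ((xt * design).inverse * (xt * targets)).to_a

puts "a = #{a.to_f}, b = #{b.to_f}, c = #{c.to_f}"
```

For this data the solve recovers a = -1, b = 0, c = 100. For noisy or badly scaled data a QR or SVD based solver is usually preferred over the normal equations.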

It is of course also true that one can basically draw arbitrary
curves to connect data points, if you don’t know that a model
like the above is “true”.

Now, one additional line of thought is pursued in the discipline
of “approximation theory” (see, e.g., Wikipedia or, for deeper insight,
M. J. D. Powell, Approximation Theory and Methods).

Here, one starts with a set of points, such as yours, and asks:

Given a distance measure between the data and the curve (“norm”) and a
set of admissible model curves (e.g., all continuous curves on an
interval), which curves will minimize that norm?

There are indeed some results available, such as Chebyshev or
Remez (Remes) approximation procedures.
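The choice of norm matters. As a small illustration (an assumption-laden sketch, not an implementation of the Chebyshev or Remez procedures), the code below measures two flat lines against the thread's data with the maximum norm and with the Euclidean norm. The names `max_norm`, `two_norm`, `midrange` and `mean_line` are made up for the example.

```ruby
# Data from the thread: x = -10..10, y = 100 - x^2.
points = (-10..10).map { |x| [x, 100 - x**2] }

# Maximum (Chebyshev) norm: the largest absolute deviation.
def max_norm(points, curve)
  points.map { |x, y| (y - curve.call(x)).abs }.max
end

# Euclidean (least-squares) norm: root of the sum of squared deviations.
def two_norm(points, curve)
  Math.sqrt(points.sum { |x, y| (y - curve.call(x))**2 })
end

mean_y = points.sum { |_, y| y } / points.size.to_f

# Two deliberately poor models: flat lines at the midrange and at the mean.
midrange  = ->(_x) { 50 }
mean_line = ->(_x) { mean_y }

puts "midrange: max = #{max_norm(points, midrange)}, two = #{two_norm(points, midrange)}"
puts "mean:     max = #{max_norm(points, mean_line)}, two = #{two_norm(points, mean_line)}"
```

Among flat lines the maximum norm prefers the midrange and the Euclidean norm prefers the mean, so "best fit" is only defined once both the norm and the set of admissible curves are fixed.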

This kind of procedure can be recommended when the functional
relationship in your data is rather complicated or not enormously
interesting, or you distrust simple models; you know something about the
general wiggliness of the underlying curve (see the Jackson theorems in
Powell’s book); and you need an estimate of what you would have measured
at some point you didn’t actually measure, one that should not be too
far off …

Best regards,

Axel

JM,

On Mon, 2007-12-17 at 15:10 +0900, [email protected] wrote:

This points exactly to the problem others have mentioned. Given a
large enough space of functions, you can fit practically anything, but
what does it mean? Generally speaking, no one has any business
fitting 21 data points with 4 parameters.

Sorry, my cut and paste had a typo - I left out the “^”, the formula
should have been:

     y = a( atan(x) ) + b( x^2 ) + c( sinh(x) ) + offset

which is closer to the parabola (and the zunzun display on the screen
seemed perfect) but yes, you are correct.
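Because the corrected formula is linear in a, b, c and offset, it can be fitted with the same least-squares machinery as a plain quadratic. The sketch below is only an illustration using Ruby's bundled `matrix` library and floating-point normal equations; it is not a claim about how zunzun.com computes its fits.

```ruby
require 'matrix'

# Data from the thread: x = -10..10, y = 100 - x^2.
points = (-10..10).map { |x| [x, 100 - x**2] }

# One row per point, [atan(x), x^2, sinh(x), 1], for the model
# y = a*atan(x) + b*x^2 + c*sinh(x) + offset.
design = Matrix.rows(
  points.map { |x, _| [Math.atan(x), (x**2).to_f, Math.sinh(x), 1.0] }
)
targets = Vector.elements(points.map { |_, y| y.to_f })

# Least-squares solution via the normal equations (X^T X) p = X^T y.
xt = design.transpose
a, b, c, offset = ((xt * design).inverse * (xt * targets)).to_a

puts format("a = %.6f, b = %.6f, c = %.6f, offset = %.6f", a, b, c, offset)
```

For this data the atan and sinh coefficients come out as essentially zero, leaving b close to -1 and offset close to 100, i.e. the parabola with two redundant terms attached.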

Thanks,

Phil.

A quadratic indeed fits this data very well. The
zunzun.com function finder was run with the option
to try every possible function, rather than being limited
to simple curves only. The search can be limited to simple
curves on the site in two ways:

  1. Use the function finder “Smoothness Control” to
    only allow functions with a few coefficients

  2. Use the function finder “Equation Family Inclusion”
    and disallow the polyfunctionals - these generate
    many thousands of basically random functions to test.

James Phillips
http://zunzun.com
2548 Vera Cruz Drive
Birmingham, AL 35235 USA
zunzun at zunzun.com