Weird issue with converting floats to integer

Frederick C. wrote in post #970500:

On Dec 24, 2:39pm, Frederick C. [email protected]
wrote:

On Dec 24, 2:21pm, Marnen Laibow-Koser [email protected] wrote:

Which still doesn’t allow you to store all numbers with arbitrary
precision (in fact putting my mathematical hat on, most numbers can’t
be stored like this). You’ll have to deal with numerical error
eventually

Talking slight bollocks. I meant that you can’t store with infinite
precision, ie you’ll always be liable for some error.

Of course. You know a better way? Should we just store everything as
Rational?

Fred

Best,

Marnen Laibow-Koser
http://www.marnen.org
[email protected]

Sent from my iPhone

On Dec 24, 2:47pm, Marnen Laibow-Koser [email protected] wrote:

Talking slight bollocks. I meant that you can’t store with infinite
precision, ie you’ll always be liable for some error.

Of course. You know a better way? Should we just store everything as
Rational?

Well, that wouldn’t solve things anyway. I just wanted to make sure no
one was under the illusion that there’s some sort of silver bullet for
this.

Fred

On Dec 24, 2:39pm, Frederick C. [email protected]
wrote:

On Dec 24, 2:21pm, Marnen Laibow-Koser [email protected] wrote:

Which still doesn’t allow you to store all numbers with arbitrary
precision (in fact putting my mathematical hat on, most numbers can’t
be stored like this). You’ll have to deal with numerical error
eventually

Talking slight bollocks. I meant that you can’t store with infinite
precision, ie you’ll always be liable for some error.

Fred

On 24 December 2010 14:47, Marnen Laibow-Koser [email protected]
wrote:

Talking slight bollocks. I meant that you can’t store with infinite
precision, ie you’ll always be liable for some error.

Of course. You know a better way? Should we just store everything as
Rational?

For things that are decimal numbers (such as money) use Fixed or
BigDecimal and for things that are not decimal then use Float and
determine the errors if it is important.

Colin
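
To make the distinction concrete, a minimal irb sketch (assuming only the
stdlib bigdecimal is available):

require 'bigdecimal'

(0.1 + 0.2) == 0.3
=> false        # binary Float cannot represent 0.1, 0.2 or 0.3 exactly

BigDecimal("0.1") + BigDecimal("0.2") == BigDecimal("0.3")
=> true         # decimal quantities such as money stay exact under +, -, *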

On 24 December 2010 16:36, Marnen Laibow-Koser [email protected]
wrote:

BigDecimal and for things that are not decimal then use Float and
determine the errors if it is important.

That’s basically what I’ve been saying, except for “decimal” substitute
“rational”. However, there’s not always an a priori way to determine if
a given field will need to store irrational numbers.

I am not quite sure that that is the same as “never, ever use
Floats for arithmetic” :)

Also as we have seen for rational numbers like 1/3 BigDecimal is no
better than float, except that it is possible to specify a smaller or
larger precision than that which float supplies, and it is a decimal
precision rather than a binary one.

Colin
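
For example (a rough irb sketch; the number of digits BigDecimal keeps for
a bare division, and the exact to_s format, vary by Ruby version):

require 'bigdecimal'

(BigDecimal("1") / BigDecimal("3")).to_s
=> "0.333333333333333333E0"        # truncated, much as a Float would be

BigDecimal("1").div(BigDecimal("3"), 50).to_s
=> "0.33333333333333333333333333333333333333333333333333E0"   # 50 digits on request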

Colin L. wrote in post #970523:

On 24 December 2010 16:36, Marnen Laibow-Koser [email protected]
wrote:

BigDecimal and for things that are not decimal then use Float and
determine the errors if it is important.

That’s basically what I’ve been saying, except for “decimal” substitute
“rational”. However, there’s not always an a priori way to determine if
a given field will need to store irrational numbers.

I am not quite sure that that is the same as “never, ever use
Floats for arithmetic” :)

Oh, now I see the difference. Even if I store something as a Float, I’d
probably use BigDecimal for arithmetic. I simply don’t believe that
Float arithmetic has any place at all in a wise programmer’s repertoire.
Consider it the 21st-century goto, if you like. :)

Also as we have seen for rational numbers like 1/3 BigDecimal is no
better than float, except that it is possible to specify a smaller or
larger precision than that which float supplies,

That makes it better than Float right there. Also, you don’t accumulate
any more error.

and it is a decimal
precision rather than a binary one.

…which is not generally better or worse.

Colin

Best,

Marnen Laibow-Koser
http://www.marnen.org
[email protected]

Sent from my iPhone
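
The accumulation point is easy to check in irb (stdlib bigdecimal assumed;
the exact Float shortfall depends on the platform’s double rounding):

require 'bigdecimal'

f  = 0.0
bd = BigDecimal("0")
1000.times { f += 0.1; bd += BigDecimal("0.1") }

f == 100.0
=> false        # Float error accumulates over repeated additions
bd == BigDecimal("100")
=> true         # BigDecimal addition stays exact no matter how often it runs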

On 24 December 2010 17:29, Marnen Laibow-Koser [email protected]
wrote:

it is possible to specify a smaller or
larger precision than that which float supplies,

That makes it better than Float right there. Also, you don’t accumulate
any more error.

The more I think about it the more I think that the ruby
implementation of BigDecimal is flawed. BigDecimal(‘1’) /
BigDecimal(‘3’) should be stored as a rational (ie as two values, 1
and 3 to be divided). Otherwise it seems to me that it is not ‘proper’
BigDecimal. I have been unable to find documentation that fully
defines what the result of 1/3 should be.

Colin
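
A quick way to see the difference being described here (stdlib bigdecimal
assumed; on Ruby 1.8 the Rational part also needs require 'rational'):

require 'bigdecimal'

third = BigDecimal("1") / BigDecimal("3")
third * 3 == BigDecimal("1")
=> false        # the quotient was truncated, so multiplying back does not give 1

Rational(1, 3) * 3 == 1
=> true         # a rational keeps numerator and denominator exactly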

Colin L. wrote in post #970510:

On 24 December 2010 14:47, Marnen Laibow-Koser [email protected]
wrote:

Talking slight bollocks. I meant that you can’t store with infinite
precision, ie you’ll always be liable for some error.

Of course. You know a better way? Should we just store everything as
Rational?

For things that are decimal numbers (such as money) use Fixed or
BigDecimal and for things that are not decimal then use Float and
determine the errors if it is important.

That’s basically what I’ve been saying, except for “decimal” substitute
“rational”. However, there’s not always an a priori way to determine if
a given field will need to store irrational numbers.

Colin

Best,

Marnen Laibow-Koser
http://www.marnen.org
[email protected]

Sent from my iPhone

Colin L. wrote in post #970536:

The more I think about it the more I think that the ruby
implementation of BigDecimal is flawed. BigDecimal(‘1’) /
BigDecimal(‘3’) should be stored as a rational (ie as two values, 1
and 3 to be divided). Otherwise it seems to me that it is not ‘proper’
BigDecimal. I have been unable to find documentation that fully
defines what the result of 1/3 should be.

I think differently here (but I may be mistaken …).

BigDecimal, for me, as the name implies, guarantees an “exact”
representation of all numbers that are of the form:

[±] (d(n)*10^n + d(n-1)*10^(n-1) + … + d(0)*10^0 + d(-1)*10^(-1) +
… d(-m)*10^(-m))

with d(n) the digit n places to the left of the decimal point and
d(-m) the digit m places to the right of it.
The ^ (caret) represents the “to the power of” function (e.g.
10^3 == 1000).

E.g. in the decimal number “3004.5006”
d(3) = 3
d(2) = 0
d(1) = 0
d(0) = 4
d(-1) = 5
d(-2) = 0
d(-3) = 0
d(-4) = 6

a = BigDecimal.new("3004.5006")
=> #<BigDecimal:b72706c0,'0.30045006E4',8(12)>

These numbers form a (mathematical) ring: they are closed under the
operators +, -, *, but NOT under the division operator.

The nice thing about BigDecimal is that it will automatically
expand its precision if needed to make sure the representation remains
correct when using the operators +, -, *. There is always a finite (and
exact) precision that is sufficient to represent the result of

a (+|-|*) b

Let’s try it:

a = BigDecimal.new("1000000000000000000", 12)
=> #<BigDecimal:b725ae60,'0.1E19',4(24)>

(precision automatically larger than the 12 I requested)

b = BigDecimal.new("0.00000000000000003", 12)
=> #<BigDecimal:b72320b4,'0.3E-16',4(24)>

(precision automatically larger than the 12 I requested)

sum = a+b
=> #<BigDecimal:b722254c,'0.1000000000 0000000000 0000000000 000003E19',40(56)>

nice, precision is enlarged to represent the exact result :)

difference = a-b
=> #<BigDecimal:b721ee9c,'0.9999999999 9999999999 9999999999 99997E18',40(56)>

nice, precision is enlarged to represent the exact result :)

product = a*b
=> #<BigDecimal:b721c494,'0.3E2',4(16)>

nice, precision is reduced because more is not required :)

But for the division …

division = a/b
=> #<BigDecimal:b721aa7c,‘0.3333333333 3333333333 3333333333 3333333333
3333333E35’,48(56)>


BigDecimal gives it a shot and takes a “large” precision … but in a
“Decimal” representation (a sum of d(i)*10^i terms) there is no way to
represent this exactly, so no finite precision will allow an exact
representation. In other words, the division operator ‘/’ does not stay
within the set: it can produce results that are outside the set of
decimals even if the two operands are inside it.

The Decimal numbers are clearly a subset of the Rational numbers.
It would be quite feasible to make a Rational class which is a field
with the operators ‘+’, ‘-’, ‘*’, ‘/’. Of course, the Rational class
exists and does this neatly:

ra = Rational(1000000000000000000000000000,1)
=> Rational(1000000000000000000000000000, 1)

rb = Rational(2,1000000000000000000000000000)
=> Rational(1, 500000000000000000000000000)

ra.to_s
=> “1000000000000000000000000000”

rb.to_s
=> “1/500000000000000000000000000”

(ra+rb).to_s
=>
“500000000000000000000000000000000000000000000000000001/500000000000000000000000000”

(ra-rb).to_s
=>
“499999999999999999999999999999999999999999999999999999/500000000000000000000000000”

(ra*rb).to_s
=> “2”

(ra/rb).to_s
=> “500000000000000000000000000000000000000000000000000000”

(rb/ra).to_s
=> “1/500000000000000000000000000000000000000000000000000000”

ra.inspect
=> “Rational(1000000000000000000000000000, 1)”

rb.inspect
=> “Rational(1, 500000000000000000000000000)”

But then again … sin, sqrt, etc. will take the result outside even the
set of rationals, into the set of “real” numbers.

My conclusion:

  • on the *, +, - operators, it is nice to see BigDecimal
    expand the precision of the results to stick to an
    “exact” representation and stay within the set.
  • if the BigDecimal class sticks to its definition of “Decimal”
    (and not Rational), then the behavior of BigDecimal on the
    division operator can be nothing else than arbitrary.

Possible solutions, off the top of my head:
A. do nothing (the result of division has a “large” precision,
   but is not “exact”)
B. raise an exception if the representation becomes inexact
C. set an instance variable in the BigDecimal object that marks
   it as “not exact” (like the “tainted” flag)
D. let BigDecimal return a Rational if the result cannot be
   exactly represented as a BigDecimal
E. use the Rational class if you want exact divisions

They all have their advantages:

  • A OK. Understanding the behaviour is enough. I understand
    now that +, -, * work as expected and / carries the risk
    of losing the exact representation.

  • B (could be a class-level or per-object flag)
    ==> in bookkeeping applications, exactness is required,
    so any calculation that destroys the exactness must
    be flagged. So if you calculate the “average” profit
    over your 7 divisions, you would get an exception,
    unless you did something special and understood the risks.

  • C this is really unobtrusive and allows one to check at the
    end whether exactness was lost along the way (a small sketch
    of this idea follows below).

  • E this exists; use it if you want an “exact” representation
    of 1/3.

I am not in favor of D because it silently changes the class
of the result, and that violates the principle of least surprise.
If you want Rational, then use it from the start.

HTH,

Peter
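
As a purely hypothetical sketch of option C (none of these names exist in
the bigdecimal library; the exactness test relies on BigDecimal
multiplication being exact):

require 'bigdecimal'

class TrackedDecimal
  attr_reader :value

  def initialize(value, exact = true)
    @value = value.is_a?(BigDecimal) ? value : BigDecimal(value.to_s)
    @exact = exact
  end

  def exact?
    @exact
  end

  def /(other)
    q = @value.div(other.value, 50)   # assumes 50 digits is generous enough
    # If q really equals value/other, multiplying back must reproduce value.
    TrackedDecimal.new(q, exact? && other.exact? && q * other.value == @value)
  end
end

one   = TrackedDecimal.new("1")
three = TrackedDecimal.new("3")
(one / three).exact?
=> false        # 1/3 has no finite decimal expansion
(TrackedDecimal.new("6") / three).exact?
=> true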

On 25 December 2010 00:37, Peter V. [email protected]
wrote:

unless you did something special and understand the risks

  • C this is really unintrusive and allows one to check at the
    end if the exactness did not get lost along the road

  • E this exists, use it if you want an “exact” representation
    of 1/3.

I am not in favor of D because it silently changes the class
of the result and this violates the principle of least surprise.
If you want Rational, use it from the start then.

For D there is no reason why the class of the result needs to change
to Rational. It could stay as BigDecimal but internally store the
data as a Rational. Looking at Java’s BigDecimal, it throws an exception
if the result of a division cannot be held exactly.

In summary, however, I think you are correct for Ruby. BigDecimal will
retain full accuracy except for division, where it is up to users to
make sure they understand what they are doing and that the resulting
errors are understood.

Colin
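
Java’s java.math.BigDecimal#divide does throw an ArithmeticException when
the exact quotient would have a non-terminating decimal expansion. Something
similar can be approximated in Ruby with a small helper (hypothetical code,
not part of BigDecimal; it assumes the digits argument is generous enough
for any terminating quotient you care about):

require 'bigdecimal'

def exact_div(a, b, digits = 50)
  q = a.div(b, digits)
  raise ArgumentError, "quotient has no exact decimal representation" unless q * b == a
  q
end

exact_div(BigDecimal("1"), BigDecimal("4")).to_s
=> "0.25E0"
exact_div(BigDecimal("1"), BigDecimal("3"))
ArgumentError: quotient has no exact decimal representation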

Colin L. wrote in post #970616:

On 25 December 2010 00:37, Peter V. [email protected]
wrote:

I am not in favor of D because it silently changes the class
of the result and this violates the principle of least surprise.
If you want Rational, use it from the start then.

For D there is no reason why the class of the result needs to change
to Rational. It could stay as BigDecimal but internally store the
data as a Rational.

Ah, I see. I had not thought it that way.

Thanks,

Peter

Peter V. wrote in post #970618:

Colin L. wrote in post #970616:

On 25 December 2010 00:37, Peter V. [email protected]
wrote:

I am not in favor of D because it silently changes the class
of the result and this violates the principle of least surprise.
If you want Rational, use it from the start then.

For D there is no reason why the class of the result needs to change
to Rational. It could stay as BigDecimal but internally store the
data as a Rational.

In this case there would be no advantage to storing as a Rational.
Sounds like you want an ExactNumber class that abstracts both.

For the record, I think BigDecimal / BigDecimal = BigDecimal is the
correct design.

Ah, I see. I had not thought it that way.

Thanks,

Peter

Best,

Marnen Laibow-Koser
http://www.marnen.org
[email protected]

Sent from my iPhone

On 25 December 2010 14:24, Marnen Laibow-Koser [email protected]
wrote:

to Rational. It could stay as BigDecimal but internally store the
data as a Rational.

In this case there would be no advantage to storing as a Rational.
Sounds like you want an ExactNumber class that abstracts both.

For the record, I think BigDecimal / BigDecimal = BigDecimal is the
correct design.

In an earlier post I think it was suggested that an advantage of
BigDecimal is that errors do not increase. With divide as it is, and
choosing 16-digit accuracy, BigDecimal is virtually identical to
Float (assuming Ruby Float is actually what would be known as double
in C). Also, errors will potentially increase every time a divide
operation is performed. It seems that the default is actually 8 digits,
which is only single-precision float accuracy.

I see that http://en.wikipedia.org/wiki/Arbitrary-precision_arithmetic
discusses exactly this problem and the use of Rationals to get around
it. It seems to suggest that some languages do that.

Colin
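
The side-by-side comparison is easy to make (the Float is a C double; how
many digits a given Ruby prints for it varies slightly by version):

require 'bigdecimal'

1.0 / 3
=> 0.3333333333333333                # Float: roughly 16 significant digits

BigDecimal("1").div(BigDecimal("3"), 16).to_s
=> "0.3333333333333333E0"            # 16-digit BigDecimal: the same information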

Colin L. wrote in post #970648:

On 25 December 2010 14:24, Marnen Laibow-Koser [email protected]
wrote:

to Rational. It could stay as BigDecimal but internally store the
data as a Rational.

In this case there would be no advantage to storing as a Rational.
Sounds like you want an ExactNumber class that abstracts both.

For the record, I think BigDecimal / BigDecimal = BigDecimal is the
correct design.

In an earlier post I think it was suggested that an advantage of
BigDecimal is that errors do not increase.

Basically.

With divide as it is, and
choosing 16 digit accuracy, then BigDecimal is virtually identical to
Float (assuming ruby Float is actually what would be known as double
in C).

Not at all! You can’t store 397 significant figures in a Float. You
can in a BigDecimal. Apparently you are wilfully ignoring this
advantage, since it has been brought up several times already.

Also errors will potentially increase every time a divide
operation is performed. It seems that the default is actually 8 digit
which is only single precision float accuracy.

So don’t use the default! Sheesh.

I see that http://en.wikipedia.org/wiki/Arbitrary-precision_arithmetic
discusses exactly this problem and the use of Rationals to get around
it. It seems to suggest that some languages do that.

And that would be easy to do in Ruby. I’m glad we have a choice,
though.

Colin

Best,

Marnen Laibow-Koser
http://www.marnen.org
[email protected]

Sent from my iPhone

On 25 December 2010 20:15, Marnen Laibow-Koser [email protected]
wrote:

correct design.

Not at all! You can’t store 397 significant figures in a Float. You
can in a BigDecimal. Apparently you are wilfully ignoring this
advantage, since it has been brought up several times already.

I am not at all saying there is no advantage to BigDecimal, merely
exploring the issues. I entirely agree that for + - * there can be
significant advantages. I am pointing out that as soon as you
get into more complex calculations involving division, irrational
numbers such as pi, trig functions, square roots and so on,
BigDecimal is not a panacea and will not guarantee that there are no
errors in the calculations.

Also errors will potentially increase every time a divide
operation is performed. It seems that the default is actually 8 digit
which is only single precision float accuracy.

So don’t use the default! Sheesh.

Right, again I was pointing out that one cannot blindly assume that
BigDecimal is a panacea; it is necessary to understand one’s problem
and use the features appropriately. I am not suggesting you would not
do that, but others may not realise the issues. It does seem
strange to me that the default should be such low precision. It means
that when doing complex arithmetic with BigDecimal at the default
precision one may get much greater errors than with Float.

I see that http://en.wikipedia.org/wiki/Arbitrary-precision_arithmetic
discusses exactly this problem and the use of Rationals to get around
it. It seems to suggest that some languages do that.

And that would be easy to do in Ruby. I’m glad we have a choice,
though.

You misunderstand me, I was talking about the possibility of
BigDecimal automatically using Rationals internally when division is
involved (if the division does not result in an exact result). So if
one did BD(1) / BD(3) then the result would be exactly 1/3 and would
be a BigDecimal. Internally it would be a Rational but the user would
not need to know that. The Wikipedia article appeared to suggest that
some languages automatically do that. Again I am not saying Ruby
should do that, merely exploring the issues again.

I am not trying to have an argument here, I am learning a lot myself
by researching the issues and value such discussions greatly.

Did you have a chance to see whether for you the result of BD 1 / BD 3,
where each was specified with 16 digits, appeared to result in an 8-digit
answer, and if so, why?

Colin
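
One way to check that in irb is to ask the result itself; BigDecimal#precs
reports the number of significant digits in use and the maximum allocated:

require 'bigdecimal'

a = BigDecimal.new("1", 16)
b = BigDecimal.new("3", 16)
(a / b).to_s        # how many 3s actually survive on this interpreter?
(a / b).precs       # => [significant digits in use, maximum significant digits]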

Just a footnote to all of this.

The difference between using a value for pi of 3.1415927 or 3.14159265,
between 7 and 8 decimal places of precision, when calculating the
circumference of the earth introduces a difference of less than a meter.
All very reasonable, but I still found it surprising.

I was taking a differential equations class from a very knowledgeable
math professor who told us about it. He was also surprised to discover
that such low precision in pi can still give such high precision in the
physical world.

Fred
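
The arithmetic backs this up. Taking the Earth’s mean diameter as roughly
12,742 km (an assumed round figure), the change in the computed
circumference is:

diameter_m = 12_742_000.0          # assumed mean diameter of the Earth, in meters
(3.14159265 - 3.1415927) * diameter_m
=> -0.637...                       # about 0.64 m; well under a meter either way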


Colin L. wrote in post #970667:

On 25 December 2010 20:15, Marnen Laibow-Koser [email protected]
wrote:

correct design.

Not at all! You can’t store 397 significant figures in a Float. You
can in a BigDecimal. Apparently you are wilfully ignoring this
advantage, since it has been brought up several times already.

I am not at all saying there is no advantage to BigDecimal, merely
exploring the issues. I entirely agree that for + - * there can be
significant advantages. I am pointing out here that as soon as you
get into more complex calculations involving division, or irrational
numbers such as pi, and trig functions, square roots and so on that
BigDecimal is not a panacea and will not guarantee that there are no
errors in the calculations.

OK, that I agree with. I thought you were going further.

Also errors will potentially increase every time a divide
operation is performed. It seems that the default is actually 8 digit
which is only single precision float accuracy.

So don’t use the default! Sheesh.

Right, again I was pointing out that one cannot blindly assume that
BigDecimal is a panacea, it is necessary to understand ones problem
and use the features appropriately. I am not suggesting you would not
do that, others may not realise the issues however. It does seem
strange to me that the default should be such low precision.

And to me.

It means
that doing complex arithmetic with BigDecimal using the default
precision one may get much greater errors than with Float.

Right.

I see that http://en.wikipedia.org/wiki/Arbitrary-precision_arithmetic
discusses exactly this problem and the use of Rationals to get around
it. It seems to suggest that some languages do that.

And that would be easy to do in Ruby. I’m glad we have a choice,
though.

You misunderstand me, I was talking about the possibility of
BigDecimal automatically using Rationals internally when division is
involved (if the division does not result in an exact result). So if
one did BD(1) / BD(3) then the result would be exactly 1/3 and would
be a BigDecimal. Internally it would be a Rational but the user would
not need to know that.

Is that really a good idea? This is where I think you’re really arguing
for a further abstraction as I already proposed.

The Wikipedia article appeared to suggest that
some languages automatically do that. Again I am not saying Ruby
should do that, merely exploring the issues again.

I’d want a symbolic math system like Mathematica to do that. I’m not
sure if a general-purpose programming language should.

I am not trying to have an argument here, I am learning a lot myself
by researching the issues and value such discussions greatly.

Me too! This is fascinating.

Did you have a chance to see whether for you the result of BD 1 / BD 3,
where each was specified with 16 digits, appeared to result in an 8-digit
answer, and if so, why?

Not yet. Even though I’m Jewish, Christmas has me really busy!

Colin

Best,

Marnen Laibow-Koser
http://www.marnen.org
[email protected]

Sent from my iPhone