Bizarre Floating point errors in Ruby? Serious bug?

unknown · November 27, 2007, 1:32am

Hi,

I’ve come across a strange bug in ruby (running 1.8.6 on Linux, but
confirmed also in 1.8.5 on an older Mac).

(-140.14 * 100).to_i
=> -14013

The desired result is, obviously, -14014. Strangely enough:

(-140.14 * 100)
=> -14014.0

And:

k = (-140.14 * 100)
=> -14014.0

k.to_i
=> -14013

HOWEVER…

-14014.0.to_i
=> -14014

Is this a strange behaviour or what?

Workaround is like this:

k.to_s.to_i
=> -14014

Can someone please confirm this strange behaviour?.. I know it is
also happening for some other numbers too:

(0...1000).each do |n|
  n = n.to_f + (n.to_f / 100)
  k = (n * 100)
  if k != k.to_s.to_f
    puts "K does not equal itself!? #{k} != #{k.to_s.to_f}"
  end

  if k.to_i != k.to_s.to_i
    puts "Error with k = #{k}!? #{k.to_i} != #{k.to_s.to_i}"
  end
end

I know a lot about floating point numbers, but this is really bizarre
behaviour.

Expected behaviour would be for no errors in the above test example. I
don’t expect floating point to be accurate (this is obvious), but I do
expect floating point to be consistent (whole number floating point is
guaranteed to be accurate with IEEE floating point standard right??).

Thanks to anyone who can test this out.

Kind Regards,
Samuel W.

unknown · November 27, 2007, 1:52am

Samuel,

Nothing is amiss. You’re just not interpreting the floating point
correctly.

On 11/26/07, [email protected]
[email protected] wrote:

Hi,

I’ve come across a strange bug in ruby (running 1.8.6 on Linux, but
confirmed also in 1.8.5 on an older Mac).

(-140.14 * 100).to_i
=> -14013

Not strange at all, look at this:

puts "%.16f" % (-140.14*100)
-14013.9999999999981810

So, when you truncate with the #to_i, you get -14013 – perfectly
consistent and logical.
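
To make the truncation point concrete, here is a minimal sketch comparing the integer conversions Ruby offers on the same value. The helper name `integer_conversions` is made up for illustration; the methods it calls are ordinary Float methods.

```ruby
# Hypothetical helper: collect the results of the standard Float -> Integer
# conversions so they can be compared side by side.
def integer_conversions(value)
  {
    to_i: value.to_i,         # drops the fractional part (rounds toward zero)
    truncate: value.truncate, # same behaviour as to_i
    round: value.round,       # nearest integer
    floor: value.floor,       # toward negative infinity
    ceil: value.ceil          # toward positive infinity
  }
end

x = -140.14 * 100
puts format('%.16f', x) # show more digits than the default printing does
integer_conversions(x).each { |name, result| puts "#{name}: #{result}" }
```

Because the stored value sits just above -14014, to_i and round disagree here; round is the one that gives the "expected" answer.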

Cameron

unknown · November 27, 2007, 1:56am

[email protected] wrote:

(-140.14 * 100)
=> -14014.0

This might explain it:

irb(main):007:0> sprintf("%0.12f" % (-140.14 * 100))
=> "-14013.999999999998"

=> -14014

Is this a strange behaviour or what?

Workaround is like this:

k.to_s.to_i
=> -14014
Or:

irb(main):014:0> (-140.14 * 100).round
=> -14014

expect floating point to be consistent (whole number floating point is
guaranteed to be accurate with IEEE floating point standard right??).
The product of two whole numbers isn’t, though. Apparently. It’s been
a while since I knew lots about floating point representation, but
that’s what this test case is telling me.

unknown · November 27, 2007, 2:16am

Ah, that makes sense. I kind of suspected it may be something like
this. However, it seems awfully strange that this is a good default
behaviour (from IRB, or Ruby in general)?

For example, if I do the same in python:

(-140.14*100)
-14013.999999999998

vs Ruby:

irb(main):005:0> (-140.14*100)
=> -14014.0

I think that Python is much better in this situation - -14014.0 is
obviously not the correct value of the floating point number.

Even PHP (!) has more “reliable” behaviour in this case (Possibly -
looks as if the return type is getting formatted as an int, don’t
quote me on PHP being reliable… ^_^):

<? echo (-140.14*100.0) ?>

-14014

To be honest, I am not sure what is desirable behaviour. But… I can
say that Python behaviour in this case is much clearer… to say “=>
-14014.0” is not clear at all, at the very least. The number with xyz.
0 indicates in mathematics that a number is accurate to one d.p., no?
If the number is actually xyz.abc - then we should be aware of that,
especially from something such as IRB where people debug algorithms,
or puts, often also used for algorithms…

This also causes special case behaviour… when you use .to_s we get a
different value completely - i.e. if we write k to a file, then read
it again, to_i will behave differently… is this a good semantic to
have in place?

Regards,
Samuel

unknown · November 27, 2007, 2:40am

On 11/26/07, [email protected]
[email protected] wrote:

irb(main):005:0> (-140.14*100)
=> -14014.0

This has nothing to do with the languages themselves. Both python and
ruby will agree. The difference is completely with how it’s output
via the prompt. Try some of the following:

irb(main):001:0> "%g" % (-140.14*100)
=> "-14014"
irb(main):002:0> "%f" % (-140.14*100)
=> "-14014.000000"
irb(main):003:0> "%.10f" % (-140.14*100)
=> "-14014.0000000000"
irb(main):004:0> "%.16f" % (-140.14*100)
=> "-14013.9999999999981810"

printed format is NOT the same as internal format.
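
One way to check which printed forms still carry the full internal value is to parse the text back and compare. This is only a sketch; `survives_formatting?` is a hypothetical helper name, and the list of format strings is just the ones tried above plus `%.17g`, which has enough significant digits to identify any double uniquely.

```ruby
# Hypothetical helper: format a Float as text, parse the text back, and
# report whether the parsed number is the very same Float.
def survives_formatting?(value, fmt)
  Float(format(fmt, value)) == value
end

x = -140.14 * 100

['%g', '%f', '%.10f', '%.16f', '%.17g'].each do |fmt|
  text = format(fmt, x)
  puts "#{fmt.ljust(6)} #{text.ljust(26)} same Float after parsing? #{survives_formatting?(x, fmt)}"
end
```

As far as I know, Ruby 1.9 and later changed Float#to_s to print the shortest string that parses back to the same Float, so on a newer interpreter the prompt shows the longer form by default.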

I think that Python is much better in this situation - -14014.0 is
obviously not the correct value of the floating point number.

Actually, they are the same according to double precision.

0 indicates in mathematics that a number is accurate to one d.p., no?
If the number is actually xyz.abc - then we should be aware of that,
especially from something such as IRB where people debug algorithms,
or puts, often also used for algorithms…

This also causes special case behaviour… when you use .to_s we get a
different value completely - i.e. if we write k to a file, then read
it again, to_i will behave differently… is this a good semantic to
have in place?

This is known, and well discussed and misunderstood in many, many,
many places (I’m talking well outside of ruby here). You can also
create this problem in C. An ascii formatted number is not going to
be the same as the machine representation. It’s sometimes a reason
people use a binary format.
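
For what it's worth, here is a small sketch of the binary route, assuming Array#pack with the 'E' directive (double precision, little-endian) is acceptable as the on-disk format. The helper name `binary_round_trip` is made up for the example.

```ruby
# Hypothetical helper: serialise a Float to its 8 raw bytes and read it back.
# 'E' is the pack directive for a little-endian double.
def binary_round_trip(value)
  bytes = [value].pack('E') # 8-byte String holding the machine representation
  bytes.unpack('E').first   # the same Float, bit for bit
end

x = -140.14 * 100
restored = binary_round_trip(x)

puts restored == x  # the value is unchanged by the round trip
puts restored.to_i  # so to_i behaves exactly as it did before writing
```

The trade-off is that the bytes are not human readable, which is why text with enough digits is often preferred anyway.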

In any case, your confusion is understandable - but there isn’t
anything inconsistent or problematic here. It’s just a rehashing of
the standard floating point representation.

Cameron

unknown · November 27, 2007, 3:20am

Cameron
I understand this is not a problem for just Ruby, but in this case the
Ruby behaviour is not as obvious as Python. I am going to guess that
it is IRB doing the formatting, so it may be something that can be
changed in IRB.

When you are debugging something, it is important that you see the
actual data. For example, I can forgive puts for writing “-14014.0”,
but I can’t forgive IRB for printing that as the value of a variable.
Even in GDB, we will not get this kind of result

— test.cpp —
#include <iostream>

int main (int argc, char ** argv) {
double v = (-140.14 * 100.0);

std::cout << v << std::endl;
}

(gdb) break test.cpp:7
Breakpoint 1 at 0x1da8: file test.cpp, line 7.
(gdb) run
Starting program: /private/tmp/test
Reading symbols for shared libraries +++. done

Breakpoint 1, main (argc=1, argv=0xbffff9dc) at test.cpp:7
7 std::cout << v << std::endl;
(gdb) p v
$1 = -14013.999999999998 <=== This line is like “=> v”
(gdb) step
-14014 <=== This line is like “=> puts v”
8 }
(gdb)

So does this make clear what my issue is with this kind of formatting?
For example, in this situation, the debugger correctly shows me the
value, this is what I would expect:

(-140.14 * 100)
=> -14013.999999999998 <=== Great!

We know the number isn’t -14014.0 (which is mathematically incorrect);
this is the kind of information we need to know in a debugger.

puts (-140.14 * 100)
=> -14014.0 <=== Acceptable

Having .0 on the output is not really a good default. The output
should be -14014 in this case, without any trailing .0 - as it stands,
this indicates that that number is accurate to 1dp. If numbers are
going to be formatted and rounded by default, best to do it correctly,
right?

Obviously, string representation is not accurate, I’m not stupid
enough to dispute that! However, I think it is important these things
are done consistently for the benefit of the programmer. Both GDB and
Python, and many other languages are consistently different from Ruby
in this respect.

I understand that you are trying to tell me that the number is the
same - I’m not arguing that, what I am arguing is that the way this is
revealed to the programmer is a problem.

Regards,
Samuel

unknown · November 27, 2007, 8:50am

=> -14013.999999999998

Regards,
Jordan

Thanks for this useful information. After further investigation,
Python also has the same problem with its pretty printing:

print (-140.14 * 100)
-14014.0

Well, actually this is the correct answer… i.e. the result we are
looking for if we were using real math. But, it is not the correct
way to round 14013.999999, which is the actual value that we get with
floating point math… so I guess it is hard to pick what is the ideal
behaviour… both have their pros and cons

Rounding 9.9 to 10.0 is like rounding 99 to 100, but we consider that
only 1 significant figure is important in 100, ie 1xx. So, we are left
with 10.0 with 1SF, but .0 conveys the idea of 1DP correctness, which
is definitely not correct.

I’d need to consult a mathematician, but I still don’t think the Ruby
behaviour is correct, mathematically. I’ll post here again once I have
more information about the mathematical correctness of this issue.

Thanks for your great comments,
Samuel

unknown · November 27, 2007, 3:58am

On Nov 26, 8:19 pm, [email protected] wrote:

I understand this is not a problem for just Ruby, but in this case the
Ruby behaviour is not as obvious as Python. I am going to guess that
it is IRB doing the formatting, so it may be something that can be
changed in IRB.

Float#to_s is doing the formatting. If you want it like python, do it
like this:

class Float
  def to_s
    "%.12f" % self
  end
end

(-140.14 * 100)

=> -14013.999999999998

Regards,
Jordan

unknown · November 27, 2007, 10:38am

[email protected] wrote:

=> -14013.999999999998

Well, actually this is the correct answer… i.e. the result we are
I’d need to consult a mathematician, but I still don’t think the Ruby
behaviour is correct, mathematically. I’ll post here again once I have
more information about the mathematical correctness of this issue.

#to_i doesn’t round, it truncates. If you want to round, use #round.

unknown · November 27, 2007, 9:59am

2007/11/27, [email protected]
[email protected]:

(-140.14 * 100)

I’d need to consult a mathematician, but I still don’t think the Ruby
behaviour is correct, mathematically. I’ll post here again once I have
more information about the mathematical correctness of this issue.

You can save yourself the effort. No computational machine that uses
float math can guarantee to be “mathematical correct” in all
situations. The reason is fairly simple: machines have limited
resources to represent numbers while in math there are a lot of real
numbers around that cannot be represented with finite resources with
pi and e only being the most famous ones. q.e.d.

In your case however BigDecimal is sufficient:

$ irb -r bigdecimal
irb(main):001:0> a=BigDecimal.new '-140.14'
=> #<BigDecimal:7ff96e2c,'-0.14014E3',8(12)>
irb(main):002:0> a*100
=> #<BigDecimal:7ff92034,'-0.14014E5',8(20)>
irb(main):003:0> (a*100).to_i
=> -14014
irb(main):004:0> (a*100).to_f
=> -14014.0
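
For anyone trying this on a newer Ruby: BigDecimal.new has since been removed from the bigdecimal library, and the Kernel method BigDecimal(...) is the replacement. A minimal sketch of the same idea follows; the helper name `times_hundred` is made up for illustration.

```ruby
require 'bigdecimal'

# Hypothetical helper: multiply a decimal literal (given as a String, so it
# never passes through a binary Float) by 100 using decimal arithmetic.
def times_hundred(decimal_string)
  BigDecimal(decimal_string) * 100
end

product = times_hundred('-140.14')
puts product.to_i # integer part of the exact decimal product
puts product.to_f # converted to a Float only at the very end
```

The important detail is passing the number as a string; BigDecimal built from an already-inexact Float cannot recover the digits that were lost.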

Kind regards

robert

unknown · November 28, 2007, 10:22pm

On Thu, 29 Nov 2007 06:03:21 +0900, “Shot (Piotr S.)”
[email protected] wrote:

Why isn’t it? I might have learned basic math years ago, but to
me -14013.999999 rounded to one decimal place is exactly -14014.0
(it’s also -14014.00 when rounded to two decimal places, as well as
-14014 when rounded to integers).

This comes up a lot, and I guess the thing that most people have a
hard time wrapping their heads around here is the fact that floating
point numbers don’t obey the laws of basic math – not exactly.

Floating point numbers are an approximation of real numbers. That’s
all. We’re stuck with them because it isn’t physically possible to
represent real numbers in hardware, and floating point is one of
the few ways to approximate them efficiently.

And it is a hardware limitation – you’ll see the same behavior in
languages like C or Java which also rely on hardware support for
computations with “decimal” numbers.

-mental

unknown · November 29, 2007, 4:49am

On Nov 28, 2007, at 4:20 PM, MenTaLguY wrote:

This comes up a lot, and I guess the thing that most people have a
hard time wrapping their heads around here is the fact that floating
point numbers don’t obey the laws of basic math – not exactly.

That and the fact that internally numbers are represented in base 2
and externally they are represented in base 10. Unfortunately there
is not a one-to-one mapping between the two representations leading
to inexact conversions and confusion. If evolution had given us four
fingers on each hand and we counted in base 8 then maybe we could
have avoided these problems.
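
A quick way to see which decimal literals survive the trip into base 2 is to compare the exact value of the Float with the exact value of the decimal text, using Rational for both. This is only a sketch, and `exactly_representable?` is a made-up helper name.

```ruby
# Hypothetical helper: true when the decimal text and the Float parsed from
# it denote exactly the same number. Float#to_r gives the exact binary value
# as a fraction; Rational(string) gives the exact decimal value.
def exactly_representable?(decimal_string)
  Float(decimal_string).to_r == Rational(decimal_string)
end

%w[0.5 140.25 0.1 140.14].each do |text|
  puts "#{text}: #{exactly_representable?(text)}"
end
```

Values whose fractional part is a sum of powers of two (such as 0.5 or 0.25) come through exactly; 140.14 does not, which is where the stray ...999998 comes from.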

Gary W.

unknown · November 28, 2007, 10:05pm

[email protected]:

print (-140.14 * 100)
-14014.0

Well, actually this is the correct answer… i.e. the result we are
looking for if we were using real math. But, it is not the correct
way to round 14013.999999, which is the actual value that we get with
floating point math…

Why isn’t it? I might have learned basic math years ago, but to
me -14013.999999 rounded to one decimal place is exactly -14014.0
(it’s also -14014.00 when rounded to two decimal places, as well as
-14014 when rounded to integers).

Rounding 9.9 to 10.0 is like rounding 99 to 100

Yeah, but you’re not rounding 9.9 to 10.0 here, you’re rounding
9.99 to 10.0 (which is like rounding 99.9 to 100, i.e., correct).

– Shot (with a random sig, no less)

unknown · November 29, 2007, 5:36am

On Nov 28, 9:49 pm, Gary W. [email protected] wrote:

fingers on each hand and we counted in base 8 then maybe we could
have avoided these problems.

Gary W.

I think the OP understands about bases (if I may presume a bit here!).
What I think was a surprise to him was that the string representation
not only rounds, but presents a trailing “.0”, as if it were accurate
to within one magnitude of 10. That is pretty much an arbitrary call,
as has been mentioned; and as the to_s method can be overridden in
ruby, it’s not a problem at all. It’s just a surprise if you’re
expecting some (arbitrary) precision for the string representation
(e.g., the same precision used in GDB).

Regards,
Jordan

unknown · November 30, 2007, 5:09pm

MenTaLguY:

On Thu, 29 Nov 2007 06:03:21 +0900, “Shot (Piotr S.)” [email protected] wrote:

[email protected]:

print (-140.14 * 100)
-14014.0

Well, actually this is the correct answer… i.e. the result we are
looking for if we were using real math. But, it is not the correct
way to round 14013.999999, which is the actual value that we get with
floating point math…

Why isn’t it? I might have learned basic math years ago, but to
me -14013.999999 rounded to one decimal place is exactly -14014.0
(it’s also -14014.00 when rounded to two decimal places, as well as
-14014 when rounded to integers).

This comes up a lot, and I guess the thing that most people have
a hard time wrapping their heads around here is the fact that floating
point numbers don’t obey the laws of basic math – not exactly.

Just to clarify – I wasn’t writing anything about the ramifications of
IEEE 754, I was just pointing out that -14014.0 is the proper rounded
version of -14013.999999, so there’s no bug in how Ruby rounds the
IEEE 754 floats.

For anyone finding this thread in their pursuit of ‘but why do Ruby
floats seem broken sometimes?’, a good starting point might be
the Wikipedia entry on IEEE 754-1985.

– Shot (‘that’s no bug, it’s a spac^H^H^H^H^Hn IEEE float!’)

unknown · November 30, 2007, 5:59pm

On 30/11/2007, Shot (Piotr S.) [email protected] wrote:

–
<Black_Dog> “^$[^ ()]+$$([0-9]+$,$[0-9]+$)”
<Black_Dog> gotta love regexps
<Bl1tz|work> it looks like some elaborate Japanese smiley
<Bl1tz|work> like “your parents just found out you’ve been
slacking in class and you also have the flu”

ROTFL

where do you get the sigs ?