Float Points error

THAKUR_PRASHANT_SIN · January 15, 2010, 5:57am

Hi,

I am facing some issues in dealing with floating point numbers.
I am fetching floating point numbers such as 22.09 from a MySQL DB
and storing them in an array in Ruby.

When I collect all the numbers I want to filter them in groups.

Suppose 22.09 was the original number and it's stored in the Ruby array
myArray at some index.

While retrieving from array I get some number like 22.09000015.

So when I do groupings I miss out my original number as 22.09 <
22.09000015

Whereas it should come in a group where numbers are 22.09 >= 22.09

I wish to have the last condition returning true. Is there any simple way to
do that, as my dataset contains around a million numbers?

If I use conversion to string and back with to_f I lose
approximately 10 secs for each 7000 rows.

Regards,

Prashant

THAKUR_PRASHANT_SIN · January 15, 2010, 6:32am

On 15.01.2010 05:56, THAKUR PRASHANT SINGH wrote:

While retrieving from array I get some number like 22.09000015.

So when I do groupings I miss out my original number as 22.09 <
22.09000015

Whereas it should come in a group where numbers are 22.09 >= 22.09

Floats aren’t exact, they are approximations: IEEE 754-2008
http://en.wikipedia.org/wiki/IEEE_754-2008

I wish to have the last condition returning true. Is there any simple way to
do that, as my dataset contains around a million numbers?

Short of implementing a rounding method to the accuracy you need (1.8.6
core only rounds to integers, going by a cursory check; the stdlib, 1.8.7
and 1.9.1 might behave differently): No.

Or you could implement a custom comparison operator that does what you
need.
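
A minimal sketch of both suggestions, assuming two decimal places are enough for the grouping. The helper names approx_gte? and to_hundredths and the tolerance value are made up for illustration; they are not part of Ruby or any library:

```ruby
# Hypothetical tolerance: well below the 0.01 step between real values,
# well above the ~1.5e-7 noise seen in 22.09000015.
TOLERANCE = 1e-4

# Custom comparison: treat a as >= b when it is within TOLERANCE of b.
def approx_gte?(a, b, tolerance = TOLERANCE)
  a >= b - tolerance
end

# Rounding: scale to an integer number of hundredths, which compares exactly.
def to_hundredths(x)
  (x * 100).round
end

values = [22.09000015, 22.09, 22.10000038, 5.5]

# Numbers that differ only by float noise end up in the same group.
groups = values.group_by { |v| to_hundredths(v) }

puts approx_gte?(22.09, 22.09000015)
puts groups.keys.inspect
```

Both variants stay numeric, so they should avoid the string round trip mentioned in the question; whether that is fast enough for a million rows would need measuring.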

If I use conversion to string and back with to_f I lose
approximately 10 secs for each 7000 rows.

You’ll lose time when computing with Floats, anyway, since outside of
vector-based CPUs (IIRC), the performance of performing maths on floats
is terrible (in relative terms).

THAKUR_PRASHANT_SIN · January 15, 2010, 9:01am

On 2010-01-15, Phillip G. [email protected] wrote:

You’ll lose time when computing with Floats, anyway, since outside of
vector-based CPUs (IIRC), the performance of performing maths on floats
is terrible (in relative terms).

This is nowhere near as true as it used to be. Some quickie tests on an x86
revealed that, for some common operations, floating point math was FASTER than
integer! It depends a lot on what you’re trying to do. Since modernish CPUs
may well be able to do both float and integer operations simultaneously, an
integer loop of floating point operations may be very fast…

… Of course, that doesn’t necessarily tell you much about Ruby floats,
where I’d expect Fixnum to beat Float by a huge margin.

-s

THAKUR_PRASHANT_SIN · January 15, 2010, 10:01pm

On Jan 15, 2010, at 12:32 AM, Phillip G. wrote:

On 15.01.2010 05:56, THAKUR PRASHANT SINGH wrote:

While retrieving from array I get some number like 22.09000015.

So when I do groupings I miss out my original number as 22.09 <
22.09000015

Whereas it should come in a group where numbers are 22.09 >= 22.09

Floats aren’t exact, they are approximations: IEEE 754-2008 http://en.wikipedia.org/wiki/IEEE_754-2008

I see it is time for the Floating Point 101 thread again.
Obligatory reference:
http://docs.sun.com/source/806-3568/ncg_goldberg.html

It is true that floats are approximations of real numbers but the issue
that most often trips people up is the conversion between the internal
binary representation and the external decimal representation. More
specifically, by default Ruby doesn’t show you the full precision
available in a float:

1/3.0
=> 0.333333333333333
"%0.60f" % (1/3.0)
=> "0.333333333333333314829616256247390992939472198486328125000000"

Lots of discussion in the archives (about once per month I think, maybe
more often).
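
As for the 22.09000015 in the original question: it fits the same pattern if the MySQL column is a single-precision FLOAT (an assumption; the thread never states the column type). A sketch that reproduces the value by squeezing 22.09 through 32 bits with Array#pack and widening it again:

```ruby
# Round-trip 22.09 through a 32-bit float, then widen back to Ruby's 64-bit Float.
single = [22.09].pack('f').unpack('f').first

puts format('%.8f', single)   # 22.09000015
puts single > 22.09           # true
```

If that is what is happening, the extra digits are not noise added by Ruby or the array, just the nearest 32-bit value to 22.09 shown at 64-bit precision.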

Gary W.

THAKUR_PRASHANT_SIN · January 15, 2010, 9:09am

On 15.01.2010 09:00, Seebs wrote:

On 2010-01-15, Phillip G.[email protected] wrote:

You’ll lose time when computing with Floats, anyway, since outside of
vector-based CPUs (IIRC), the performance of performing maths on floats
is terrible (in relative terms).

This is nowhere near as true as it used to be. Some quickie tests on an x86
revealed that, for some common operations, floating point math was FASTER than
integer! It depends a lot on what you’re trying to do. Since modernish CPUs
may well be able to do both float and integer operations simultaneously, an
integer loop of floating point operations may be very fast…

Possibly. If the multi-core nature of today’s processors lends itself to
crunching Floats well, the issue is becoming somewhat moot indeed.

To be honest, though, I’m personally not that worried about
computational performance in that area.

And if Ruby is the bottleneck, it’d be possible to drop down to C, and
write an extension to handle the Float operations (or drop Ruby
entirely, if speed of computation is more important than speed of
development, for example).

THAKUR_PRASHANT_SIN · January 15, 2010, 11:30pm

Gary W. wrote:

I see it is time for the Floating Point 101 thread again.
[…]
Lots of discussion in the archives (about once per month I think, maybe more often).

Actually, I believe that the last two(!) threads are still running.
So, we have now gotten to the point where the exact same question
gets asked faster than it can be answered.

Fascinating.

jwm

THAKUR_PRASHANT_SIN · January 16, 2010, 1:38pm

On Fri, Jan 15, 2010 at 8:08 AM, Phillip G. [email protected]
wrote:

And if Ruby is the bottleneck, it’d be possible to drop down to C, and
write an extension to handle the Float operations (or drop Ruby entirely, if
speed of computation is more important than speed of development, for
example).

I think it’s worth reminding people that it’s possible to use Java with JRuby.
On an admittedly low-spec laptop I’ve found that for heavy numeric calculations
(for example calculating CRCs of large files, and for large floating point calcs)
using JRuby to call Java gives similar performance to compiled FreePascal.
The MRI Zlib.crc32 is written in C (is that correct?) and, although that is
integer arithmetic, again I’ve found that a JRuby/Java equivalent
has a similar speed.
I’d be interested in any speed tests comparing heavy floating point calculations
with MRI + C extensions and JRuby with Java.

THAKUR_PRASHANT_SIN · January 17, 2010, 11:05pm

On Sat, Jan 16, 2010 at 6:37 AM, Colin B.
[email protected] wrote:

I think it’s worth reminding people that it’s possible to use Java with JRuby.
On an admittedly low-spec laptop I’ve found that for heavy numeric calculations
(for example calculating CRCs of large files, and for large floating point calcs)
using JRuby to call Java gives similar performance to compiled FreePascal.
The MRI Zlib.crc32 is written in C (is that correct?) and, although that is
integer arithmetic, again I’ve found that a JRuby/Java equivalent
has a similar speed.
I’d be interested in any speed tests comparing heavy floating point calculations
with MRI + C extensions and JRuby with Java.

You may as well remove MRI and JRuby from that and look at the many
benchmarks comparing Java and C numeric algorithm performance. In most
cases, you can get close to or equivalent performance from
appropriately-written Java numeric algorithm code (which may or may
not be idiomatic Java code). If you start doing things more OO, then
a better comparison would be equivalent OO-like code in C++, and Java
is still about equivalent.

The cost of calling from (J)Ruby out to Java is probably a bit higher
than calling from Ruby to C extensions, but if you keep the number of
call-outs under control, it’s not going to impact application
performance.

And it’s worth pointing out that writing portable Java code is a hell
of a lot easier than writing portable C (especially for backend code
that doesn’t have user-facing UI components); you don’t even need a
compiler on most target systems, and the same code will run anywhere
there’s a Java VM (which is basically anywhere).

If you’d rather not futz with compilers and you’re running on Java 6,
there are a few useful projects:

- My java-inline project (based on RubyInline), which allows embedding
  the Java code directly into your Ruby script:
  http://github.com/jruby/java-inline
- My “Duby” language, which attempts to use Ruby syntax to produce
  Java code and bytecode without writing in Java:
  GitHub - headius/duby: The Duby Programming Language
- My BiteScript project, which is a DSL for generating JVM bytecode
  directly: GitHub - headius/bitescript: The BiteScript API and language
- There’s also a scala-inline which I will (if I haven’t already)
  merge into java_inline, allowing you to embed Scala directly into a
  Ruby script.

All of which can run on any system with a JVM, with or without a
compiler present.

Charlie

THAKUR_PRASHANT_SIN · January 17, 2010, 11:46pm

FWIW, I’ve also just released java-inline 0.0.3, which adds Duby
support:

{code}
require 'rubygems'
require 'duby_inline'
require 'benchmark'

class FastMath
  def fib_ruby(n)
    if n < 2
      n
    else
      fib_ruby(n - 2) + fib_ruby(n - 1)
    end
  end

  inline :Duby do |builder|
    builder.duby "
      def fib_duby(n:int)
        if n < 2
          n
        else
          fib_duby(n - 2) + fib_duby(n - 1)
        end
      end
    "
  end
end
{/code}
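
The timing harness itself is not shown in the post. A sketch of how a table like the one below might be produced with the standard Benchmark library, using only the pure-Ruby method so it runs without JRuby or duby_inline; the smaller argument and the top-level method definition are choices made here for brevity, not taken from the original script:

```ruby
require 'benchmark'

# Same pure-Ruby method as in the post, defined at top level for brevity.
def fib_ruby(n)
  if n < 2
    n
  else
    fib_ruby(n - 2) + fib_ruby(n - 1)
  end
end

N = 20 # smaller than in the post so the sketch finishes quickly

# Benchmark.bm prints the user/system/total/real columns, one row per report.
Benchmark.bm(14) do |bm|
  5.times { bm.report("fib_ruby(#{N})") { fib_ruby(N) } }
end
```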

                   user     system      total        real
fib_ruby(35)   0.306000   0.000000   0.306000 (  0.306000)
fib_ruby(35)   0.302000   0.000000   0.302000 (  0.302000)
fib_ruby(35)   0.302000   0.000000   0.302000 (  0.302000)
fib_ruby(35)   0.302000   0.000000   0.302000 (  0.302000)
fib_ruby(35)   0.303000   0.000000   0.303000 (  0.303000)
fib_duby(35)   0.007000   0.000000   0.007000 (  0.007000)
fib_duby(35)   0.007000   0.000000   0.007000 (  0.007000)
fib_duby(35)   0.007000   0.000000   0.007000 (  0.007000)
fib_duby(35)   0.008000   0.000000   0.008000 (  0.008000)
fib_duby(35)   0.009000   0.000000   0.009000 (  0.008000)

And yes, I appreciate that this looks a little silly since the Duby
code is basically the same as the Ruby code. But hey, that’s how the
_inline thing works. I am interested in adding native support for Duby
(or for localized static typing) to JRuby in the future.

Charlie

On Sun, Jan 17, 2010 at 4:04 PM, Charles Oliver N.

THAKUR_PRASHANT_SIN · January 17, 2010, 11:54pm

Oops, those numbers were actually fib(30). Here’s fib(35) numbers:

fib_ruby(35)   3.192000   0.000000   3.192000 (  3.192000)
fib_ruby(35)   2.992000   0.000000   2.992000 (  2.992000)
fib_ruby(35)   3.009000   0.000000   3.009000 (  3.009000)
fib_ruby(35)   3.001000   0.000000   3.001000 (  3.001000)
fib_ruby(35)   2.988000   0.000000   2.988000 (  2.988000)
fib_duby(35)   0.082000   0.000000   0.082000 (  0.081000)
fib_duby(35)   0.077000   0.000000   0.077000 (  0.077000)
fib_duby(35)   0.078000   0.000000   0.078000 (  0.078000)
fib_duby(35)   0.078000   0.000000   0.078000 (  0.078000)
fib_duby(35)   0.078000   0.000000   0.078000 (  0.078000)

On Sun, Jan 17, 2010 at 4:45 PM, Charles Oliver N.

THAKUR_PRASHANT_SIN · January 19, 2010, 6:26pm

It is true that floats are approximations of real numbers but the issue
that most often trips people up is the conversion between the internal
binary representation and the external decimal representation. More
specifically, by default Ruby doesn’t show you the full precision
available in a float:

1/3.0
=> 0.333333333333333

"%0.60f" % (1/3.0)
=> "0.333333333333333314829616256247390992939472198486328125000000"

So currently ruby “doesn’t show you the real floating point number”
which leads to surprising behavior, for example

(2.0 - 1.1) == 0.9
=> false

This has been changed in the latest dev trunk, to be more explicit
(Float#to_s is totally explicit now).

2.0 - 1.1
=> 0.8999999999999999

0.9
=> 0.9

2.0 - 1.1 == 0.9
=> false

Unfortunately this means that writing out to the screen used to be

puts 2.0 - 1.1
0.9

but is now

puts 2.0 - 1.1
0.8999999999999999
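
Whichever way Float#to_s ends up behaving, explicit formatting gives the short form on demand. A small sketch, assuming 15 significant digits is the display precision wanted and that Float#round accepts a digit count (it does on 1.9 and later, not on 1.8.6):

```ruby
x = 2.0 - 1.1

# Ask for 15 significant digits explicitly instead of relying on to_s.
short = format('%.15g', x)
puts short                        # 0.9

# For comparisons, round both sides (or use a tolerance) instead of plain ==.
puts x == 0.9                     # false
puts x.round(2) == 0.9.round(2)   # true
```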

I have a proposal out to make Float#to_s be “the imprecise [old] way”
and Float#inspect to be the new, precise way (split the
functionality–it’s currently the same).
Any feedback/preferences on this before I do anything in that regard?
-r

THAKUR_PRASHANT_SIN · January 19, 2010, 9:39pm

Benoit D. wrote:

2010/1/19 Roger P. [email protected]

I have a proposal out to make Float#to_s be “the imprecise [old] way”
and Float#inspect to be the new, precise way (split the
functionality–it’s currently the same).

I surely agree !

You could add a +1 here:

http://redmine.ruby-lang.org/issues/show/2152

But, more directly, the fact that people would like it that way is more
motivation for me to come up with a concrete patch for it and start
pinging core

-r

THAKUR_PRASHANT_SIN · January 19, 2010, 9:21pm

2010/1/19 Roger P. [email protected]

I have a proposal out to make Float#to_s be “the imprecise [old] way”
and Float#inspect to be the new, precise way (split the
functionality–it’s currently the same).

I surely agree !
This is the opinion I gave on redmine and in 2 threads. I think most people
would agree, but there was apparently no discussion of this precise opinion.
Maybe we should open one? As a feature, an improvement?