Forum: Ruby Float Points error

Posted by THAKUR PRASHANT SINGH (Guest)
on 2010-01-15 05:57
(Received via mailing list)
Hi,

I am having trouble dealing with floating point numbers.
I am reading floating point numbers from a MySQL DB, for example 22.09
and similar numbers, and storing them in an array in Ruby.

Once I have collected all the numbers I want to filter them into groups.

Suppose 22.09 was the original number and it is stored in the Ruby array
myArray at some index.

When I retrieve it from the array I get a number like 22.09000015.

So when I do the grouping I miss my original number, because 22.09 <
22.09000015,

whereas it should land in the group where 22.09 >= 22.09.

I want that last condition to return true. Is there any simple way to
do that, given that my dataset contains around a million numbers?

If I convert to a string and back with to_f I lose approximately
10 secs for each 7000 rows.



Regards,

Prashant
Posted by Phillip Gawlowski (Guest)
on 2010-01-15 06:32
(Received via mailing list)
On 15.01.2010 05:56, THAKUR PRASHANT SINGH wrote:

> While retrieving from array I get some number like 22.09000015.
>
> So when I do groupings I miss out my original number as 22.09<
> 22.09000015
>
> Where as it should come in a group where numbers are 22.09>= 22.09

Floats aren't exact, they are approximations: IEEE 754-2008
<http://en.wikipedia.org/wiki/IEEE_754-2008>
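
One way a value like 22.09000015 can appear, assuming (the thread does not say so) that the MySQL column is a single-precision FLOAT: 22.09 is first rounded to the nearest 32-bit float and then widened to Ruby's 64-bit Float. A minimal sketch that simulates the round trip with Array#pack; the helper name through_single_precision is made up for illustration:

```ruby
# Simulate storing a value in a 32-bit float and reading it back as a
# 64-bit Ruby Float. The single-precision column is an assumption.
def through_single_precision(value)
  [value].pack('f').unpack('f').first
end

original  = 22.09
retrieved = through_single_precision(original)

puts format('%.10f', original)   # the double closest to 22.09
puts format('%.10f', retrieved)  # slightly larger after the 32-bit round trip
puts original < retrieved        # the comparison that breaks the grouping
```

If that is what is happening, the extra digits are not random noise: they are the nearest value a 32-bit float can hold.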

> I wish to have last condition returning true.Is there any simple way to
> do that as my dataset contains around a million numbers.

Short of implementing a rounding method to the accuracy you need: no.
(After a cursory check, 1.8.6 core only rounds to integers; the stdlib,
1.8.7 and 1.9.1 might behave differently.)

Or you could implement a custom comparison operator that does what you 
need.
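
A minimal sketch of both suggestions, assuming two decimal places are enough for the data (the helper names to_hundredths and approx_gte? are made up for illustration). Scaling each value to an integer once avoids the string round trip and gives exact keys for grouping:

```ruby
# Convert a float to an integer count of hundredths (2209 for 22.09).
# Integer keys compare exactly, so 22.09 and 22.09000015 land together.
def to_hundredths(value)
  (value * 100).round
end

# Tolerant ">=": treats a as >= b when a is at most eps below b.
def approx_gte?(a, b, eps = 1e-6)
  a >= b - eps
end

values = [22.09000015, 22.09, 22.10000038, 7.5]
groups = values.group_by { |v| to_hundredths(v) }

groups.each { |key, members| puts "#{key / 100.0}: #{members.inspect}" }
puts approx_gte?(22.09, 22.09000015)
```

The integer version is usually the safer choice for grouping, because the tolerance in approx_gte? has to be picked by hand to match the scale of the data.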

> If I use conversion to string and back to to_f I lose time of
> approximately 10 secs for each 7000 rows.

You'll lose time when computing with Floats, anyway, since outside of
vector-based CPUs (IIRC), the performance of performing maths on floats
is terrible (in relative terms).
Posted by Seebs (Guest)
on 2010-01-15 09:01
(Received via mailing list)
On 2010-01-15, Phillip Gawlowski <pg@thimian.com> wrote:
> You'll lose time when computing with Floats, anyway, since outside of
> vector-based CPUs (IIRC), the performance of performing maths on floats 
> is terrible (in relative terms).

This is nowhere near as true as it used to be.  Some quickie tests on an x86
revealed that, for some common operations, floating point math was FASTER than
integer!  It depends a lot on what you're trying to do.  Since modernish CPUs
may well be able to do both float and integer operations simultaneously, an
integer loop of floating point operations may be very fast...

... Of course, that doesn't necessarily tell you much about *Ruby* floats,
where I'd expect Fixnum to beat Float by a huge margin.

-s
Posted by Phillip Gawlowski (Guest)
on 2010-01-15 09:09
(Received via mailing list)
On 15.01.2010 09:00, Seebs wrote:
> On 2010-01-15, Phillip Gawlowski<pg@thimian.com>  wrote:
>> You'll lose time when computing with Floats, anyway, since outside of
>> vector-based CPUs (IIRC), the performance of performing maths on floats
>> is terrible (in relative terms).
>
> This is nowhere near as true as it used to be.  Some quickie tests on an x86
> revealed that, for some common operations, floating point math was FASTER than
> integer!  It depends a lot on what you're trying to do.  Since modernish CPUs
> may well be able to do both float and integer operations simultaneously, an
> integer loop of floating point operations may be very fast...

Possibly. If the multi-core nature of today's processors lends itself to
crunching Floats well, the issue is becoming somewhat moot indeed.

To be honest, though, I'm personally not that worried about
computational performance in that area.

And if Ruby is the bottleneck, it'd be possible to drop down to C, and
write an extension to handle the Float operations (or drop Ruby
entirely, if speed of computation is more important than speed of
development, for example).
Posted by Gary Wright (Guest)
on 2010-01-15 22:01
(Received via mailing list)
On Jan 15, 2010, at 12:32 AM, Phillip Gawlowski wrote:

> On 15.01.2010 05:56, THAKUR PRASHANT SINGH wrote:
> 
>> While retrieving from array I get some number like 22.09000015.
>> 
>> So when I do groupings I miss out my original number as 22.09<
>> 22.09000015
>> 
>> Where as it should come in a group where numbers are 22.09>= 22.09
> 
> Floats aren't exact, they are approximations: IEEE 754-2008 <http://en.wikipedia.org/wiki/IEEE_754-2008>

I see it is time for the Floating Point 101 thread again.
Obligatory reference: <http://docs.sun.com/source/806-3568/ncg_goldberg.html>

It is true that floats are approximations of real numbers but the issue 
that most often trips people up is the conversion between the internal 
binary representation and the external decimal representation.  More 
specifically, by default Ruby doesn't show you the full precision 
available in a float:

>> 1/3.0
=> 0.333333333333333
>> "%0.60f" % (1/3.0)
=> "0.333333333333333314829616256247390992939472198486328125000000"
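
The same effect can be shown with explicit format strings, which behave the same way whatever Float#to_s happens to print. A small sketch: 15 significant digits hide the representation error, while 17 are enough to identify a double uniquely.

```ruby
x = 0.1

short = format('%.15g', x)  # 15 significant digits hide the error
full  = format('%.17g', x)  # 17 significant digits pin down the double

puts short
puts full
puts Float::DIG  # decimal digits a Float is guaranteed to preserve
```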

Lots of discussion in the archives (about once per month I think, maybe 
more often).

Gary Wright
Posted by Jörg W Mittag (Guest)
on 2010-01-15 23:30
(Received via mailing list)
Gary Wright wrote:
> I see it is time for the Floating Point 101 thread again.
[...]
> Lots of discussion in the archives (about once per month I think, maybe more often).

Actually, I believe that the last two(!) threads are still running.
So, we have now gotten to the point where the *exact same* question
gets asked faster than it can be answered.

Fascinating.

jwm
Posted by Colin Bartlett (Guest)
on 2010-01-16 13:38
(Received via mailing list)
On Fri, Jan 15, 2010 at 8:08 AM, Phillip Gawlowski <pg@thimian.com> wrote:
> And if Ruby is the bottleneck, it'd be possible to drop down to C, and
> write an extension to handle the Float operations (or drop Ruby entirely, if
> speed of computation is more important than speed of development, for
> example).

I think it's worth reminding people that it's possible to use Java with JRuby.
On an admittedly low-spec laptop I've found that for heavy numeric calculations
(for example calculating CRCs of large files, and for large floating
point calcs)
using JRuby to call Java gives similar performance to compiled FreePascal.
The MRI Zlib.crc32 is written in C (is that correct?) and, although that is
integer arithmetic, again I've found that a JRuby/Java equivalent
has a similar speed.
I'd be interested in any speed tests comparing heavy floating point calculations
with MRI + C extensions and JRuby with Java.
Posted by Charles Nutter (headius)
on 2010-01-17 23:05
(Received via mailing list)
On Sat, Jan 16, 2010 at 6:37 AM, Colin Bartlett <colinb2r@googlemail.com> wrote:
> I think it's worth reminding people that it's possible to use Java with JRuby.
> On an admittedly low-spec laptop I've found that for heavy numeric calculations
> (for example calculating CRCs of large files, and for large floating
> point calcs)
> using JRuby to call Java gives similar performance to compiled FreePascal.
> The MRI Zlib.crc32 is written in C (that is correct?) and albeit that is
> integer arithmetic, again I've found that a JRuby/Java equivalent
> has a similar speed.
> I'd be interested in any speed tests comparing heavy floating point calculations
> with MRI + C extensions and JRuby with Java.

You may as well remove MRI and JRuby from that and look at the many
benchmarks comparing Java and C numeric algorithm performance. In most
cases, you can get close to or equivalent performance from
appropriately-written Java numeric algorithm code (which may or may
not be *idiomatic* Java code). If you start doing things more OO, then
a better comparison would be equivalent OO-like code in C++, and Java
is still about equivalent.

The cost of calling from (J)Ruby out to Java is probably a bit higher
than calling from Ruby to C extensions, but if you keep the number of
call-outs under control, it's not going to impact application
performance.

And it's worth pointing out that writing portable Java code is a hell
of a lot easier than writing portable C (especially for backend code
that doesn't have user-facing UI components); you don't even need a
compiler on most target systems, and the same code will run anywhere
there's a Java VM (which is basically anywhere).

If you'd rather not futz with compilers and you're running on Java 6,
there are a few useful projects:

* My java-inline project (based on RubyInline), which allows embedding
the Java code directly into your Ruby script:
http://github.com/jruby/java-inline
* My "Duby" language, which attempts to use Ruby syntax to produce
Java code and bytecode without writing in Java:
http://github.com/headius/duby
* My BiteScript project, which is a DSL for generating JVM bytecode
directly: http://github.com/headius/bitescript
* There's also a scala-inline which I will (if I haven't already)
merge into java_inline, allowing you to embed Scala directly into a
Ruby script.

All of which can run on any system with a JVM, with or without a
compiler present.

- Charlie
Posted by Charles Nutter (headius)
on 2010-01-17 23:46
(Received via mailing list)
FWIW, I've also just released java-inline 0.0.3, which adds Duby support:

{code}
require 'rubygems'
require 'duby_inline'
require 'benchmark'

class FastMath
  def fib_ruby(n)
    if n < 2
      n
    else
      fib_ruby(n - 2) + fib_ruby(n - 1)
    end
  end

  inline :Duby do |builder|
    builder.duby "
      def fib_duby(n:int)
        if n < 2
          n
        else
          fib_duby(n - 2) + fib_duby(n - 1)
        end
      end
      "
  end
end
{/code}

                    user     system      total        real
fib_ruby(35)    0.306000   0.000000   0.306000 (  0.306000)
fib_ruby(35)    0.302000   0.000000   0.302000 (  0.302000)
fib_ruby(35)    0.302000   0.000000   0.302000 (  0.302000)
fib_ruby(35)    0.302000   0.000000   0.302000 (  0.302000)
fib_ruby(35)    0.303000   0.000000   0.303000 (  0.303000)
fib_duby(35)    0.007000   0.000000   0.007000 (  0.007000)
fib_duby(35)    0.007000   0.000000   0.007000 (  0.007000)
fib_duby(35)    0.007000   0.000000   0.007000 (  0.007000)
fib_duby(35)    0.008000   0.000000   0.008000 (  0.008000)
fib_duby(35)    0.009000   0.000000   0.009000 (  0.008000)

And yes, I appreciate that this looks a little silly since the Duby
code is basically the same as the Ruby code. But hey, that's how the
_inline thing works. I am interested in adding native support for Duby
(or for localized static typing) to JRuby in the future.

- Charlie

Posted by Charles Nutter (headius)
on 2010-01-17 23:54
(Received via mailing list)
Oops, those numbers were actually fib(30). Here's fib(35) numbers:

fib_ruby(35)    3.192000   0.000000   3.192000 (  3.192000)
fib_ruby(35)    2.992000   0.000000   2.992000 (  2.992000)
fib_ruby(35)    3.009000   0.000000   3.009000 (  3.009000)
fib_ruby(35)    3.001000   0.000000   3.001000 (  3.001000)
fib_ruby(35)    2.988000   0.000000   2.988000 (  2.988000)
fib_duby(35)    0.082000   0.000000   0.082000 (  0.081000)
fib_duby(35)    0.077000   0.000000   0.077000 (  0.077000)
fib_duby(35)    0.078000   0.000000   0.078000 (  0.078000)
fib_duby(35)    0.078000   0.000000   0.078000 (  0.078000)
fib_duby(35)    0.078000   0.000000   0.078000 (  0.078000)


Posted by Roger Pack (rogerdpack)
on 2010-01-19 18:26
> It is true that floats are approximations of real numbers but the issue 
> that most often trips people up is the conversion between the internal 
> binary representation and the external decimal representation.  More 
> specifically, by default Ruby doesn't show you the full precision 
> available in a float:
> 
>>> 1/3.0
> => 0.333333333333333
>>> "%0.60f" % (1/3.0)
> => "0.333333333333333314829616256247390992939472198486328125000000"


So currently ruby "doesn't show you the real floating point number"
which leads to surprising behavior, for example

>> (2.0 - 1.1) == 0.9
=> false

This has been changed in the latest dev trunk, to be more explicit 
(Float#to_s is totally explicit now).

>> 2.0 - 1.1
=> 0.8999999999999999
>> 0.9
=> 0.9
>> 2.0 - 1.1 == 0.9
=> false
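
If the comparison itself is what matters, two common workarounds are comparing within a tolerance or doing the arithmetic with exact rationals. A rough sketch (using Float::EPSILON as the tolerance suits values near 1.0 and is an assumption; scale it for larger magnitudes):

```ruby
diff = 2.0 - 1.1

# Compare within a tolerance instead of using ==.
close_enough = (diff - 0.9).abs <= Float::EPSILON

# Or avoid binary fractions altogether with exact rationals.
exact = Rational('2.0') - Rational('1.1') == Rational('0.9')

puts diff == 0.9
puts close_enough
puts exact
```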

Unfortunately this means that writing out to the screen used to be
>> puts 2.0 - 1.1
0.9

but is now
>> puts 2.0 - 1.1
0.8999999999999999
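
For output that keeps the shorter form regardless of what Float#to_s does, the number of digits can be requested explicitly. A small sketch, assuming 15 significant digits (or two decimals) is the precision wanted:

```ruby
value = 2.0 - 1.1

rounded_text = format('%.15g', value)  # 15 significant digits
two_places   = format('%.2f', value)   # fixed number of decimals

puts rounded_text
puts two_places
```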

I have a proposal out to make Float#to_s be "the imprecise [old] way" 
and Float#inspect to be the new, precise way (split the 
functionality--it's currently the same).
Any feedback/preferences on this before I do anything in that regard?
-r
Posted by Benoit Daloze (Guest)
on 2010-01-19 21:21
(Received via mailing list)
2010/1/19 Roger Pack <rogerpack2005@gmail.com>

> I have a proposal out to make Float#to_s be "the imprecise [old] way"
> and Float#inspect to be the new, precise way (split the
> functionality--it's currently the same).
>

I surely agree!
This is the opinion I gave on Redmine and in 2 threads. I think most people
would agree, but there was apparently no discussion of this precise opinion.
Maybe we should open one? As a feature, an improvement?
Posted by Roger Pack (rogerdpack)
on 2010-01-19 21:39
Benoit Daloze wrote:
> 2010/1/19 Roger Pack <rogerpack2005@gmail.com>
> 
>> I have a proposal out to make Float#to_s be "the imprecise [old] way"
>> and Float#inspect to be the new, precise way (split the
>> functionality--it's currently the same).
>>
> 
> I surely agree !


You could add a +1 here:

http://redmine.ruby-lang.org/issues/show/2152

But, more directly, the fact that people would like it that way is more 
motivation for me to come up with a concrete patch for it and start 
pinging core :)

-r