Forum: Ruby ruby performance

Posted by anaray anaray (anaray)
on 2012-06-29 19:13
Hi,

I have a snippet from my project, and this iteration performs very
badly compared to the same implementation in Java.

Java : Time Elapsed in milliseconds: 8109 ms.
Ruby : Time Elapsed in Seconds: 132.28125 sec.

It's a huge difference. Am I missing something here? Are there any
particular best practices? I am trying to use Ruby for a
particular project, where it needs to read/process huge arrays etc.
Please guide me.

input = "ababaa" * 10000
suffix_array = input.split(//)
suffix_array_len = suffix_array.size

for i in 1..suffix_array_len-1 do
  total=0
  for j in i..suffix_array_len-1 do

  end
end
Posted by Roger Pack (rogerdpack)
on 2012-06-29 19:36
maybe try jruby in --server mode?
Posted by Hans Mackowiak (hanmac)
on 2012-06-29 20:11
replace "for i in 1..suffix_array_len-1 do"
with "suffix_array.each_with_index do |s,i|"

it may be faster, depending on what you want
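
A minimal sketch of the suggestion above, assuming the goal is simply to visit every element together with its index. Note that each_with_index starts at index 0 while the original for loop starts at 1, so the two are not drop-in equivalents. The shortened input and the array names (for_indices, each_indices, offset_indices) are illustrative only and not from the original post.

```ruby
# Small illustrative input; the original post uses "ababaa" * 10000.
input = "ababaa" * 2
suffix_array = input.split(//)

# Original style: index-based for loop starting at 1.
for_indices = []
for i in 1..suffix_array.size - 1 do
  for_indices << i
end

# Suggested style: each_with_index yields element and index, starting at 0.
each_indices = []
suffix_array.each_with_index do |_char, i|
  each_indices << i
end

# One way to skip index 0 like the original loop: drop the first element
# and start the index counter at 1.
offset_indices = []
suffix_array.drop(1).each.with_index(1) do |_char, i|
  offset_indices << i
end
```

Whether this is faster than the for loop depends on the Ruby version and on what the loop body does, so it is worth measuring rather than assuming.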
Posted by Bartosz Dziewoński (matmarex)
on 2012-06-29 21:28
(Received via mailing list)
The MRI is, unfortunately, slow as balls, and there's not much you can
do. For computation-intensive tasks try JRuby or, well, a different
programming language.

-- Matma Rex
Posted by Hans Mackowiak (hanmac)
on 2012-06-29 21:38
Bartosz Dziewoński wrote in post #1066673:
> The MRI is, unfortunately, slow as balls, and there's not much you can
> do. For computation-intensive tasks try JRuby or, well, a different
> programming language.
>
> -- Matma Rex

if you write shitty code, it's still shit in jruby too.
MRI 1.9 is faster than 1.8.


and i prefer MRI over JRuby because i need C-Gems
Posted by Bartosz Dziewoński (matmarex)
on 2012-06-29 22:01
(Received via mailing list)
2012/6/29 Hans Mackowiak <lists@ruby-forum.com>:
> Bartosz Dziewoński wrote in post #1066673:
>> The MRI is, unfortunately, slow as balls, and there's not much you can
>> do. For computation-intensive tasks try JRuby or, well, a different
>> programming language.
>>
>> -- Matma Rex
>
> if you write shitty code, it's still shit in jruby too.
> MRI 1.9 is faster than 1.8.

Yes, of course 1.9 is faster.

Still. Benchmark this for me, would you:

    1_000_000_000.times{|a| }

It takes 103 seconds to run on my machine. The following code in C++,
compiled with -O0, takes four seconds.

int main()
{
  for(int i=0; i<1000000000; i++){};
  return 0;
}

(I have verified that the generated assembly code actually runs the
loop.)

MRI Ruby *is* two orders of magnitude slower than C and often
noticeably slower than other interpreted languages and no matter how
much you love it, you can't deny it's slow.


-- Matma Rex
Posted by Henry Maddocks (Guest)
on 2012-06-30 02:28
(Received via mailing list)
On 30/06/2012, at 5:13 AM, anaray anaray wrote:

> Hi,
>
> I have snippet from my project, and this iteration is performing very
> very badly compared to same implementation in Java

Maybe if you told us what that code was supposed to do then we could 
suggest a better approach.

Henry
Posted by Bob Hutchison (Guest)
on 2012-06-30 17:41
(Received via mailing list)
On 2012-06-29, at 1:13 PM, anaray anaray wrote:

> Hi,
>
> I have snippet from my project, and this iteration is performing very
> very badly compared to same implementation in Java
>
> Java : Time Elapsed in milliseconds: 8109 ms.
> Ruby : Time Elapsed in Seconds: 132.28125 sec.
>
> It a huge difference, Am i missing something here, is there any
> particular best practices.

Yes that's a huge difference, and what you're seeing is roughly what you 
have to expect. Count on 10-20 times slower unless you are heavily 
reliant on stuff Java can't do well (like start up times, or fall back 
on underlying C code as some gems do). Ruby can also be a big help if 
you have memory constraints.

You might consider JRuby because it sounds as though the JVM is 
something you're familiar with, but also because you can take advantage 
of multi-core architectures if your algorithms are parallelisable. But 
it's a trade off in that some gems won't work in JRuby.

You might also consider coding parts of your program in C. Yeah, I know. 
This is probably the *last* thing you want to do :-)

>  I am trying to implement Ruby for a
> particular project, where it needs to read/process huge arrays etc
> Please guide me.

In general there's not much more to say, though maybe there is if you
get into specifics. However, most of the suggestions we could make will
apply to a Java version as well, so you aren't necessarily going to see
a relative performance increase; if you're lucky, maybe an absolute one.

>
> input = "ababaa" * 10000
> suffix_array = input.split(//)

Is that really what you mean?

> suffix_array_len = suffix_array.size
>
> for i in 1..suffix_array_len-1 do

Are you starting at 1 for a reason? The first index is 0.

I'd probably write this something like:

(0...l).each do | i |

The '0...l' is the same as 0..(l-1)
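
A tiny sketch of that equivalence, assuming a made-up length l = 5 (not a value from the original post): the three-dot range excludes its end, the two-dot range includes it.

```ruby
l = 5                             # illustrative length only

exclusive = (0...l).to_a          # three dots: end value is excluded
inclusive = (0..(l - 1)).to_a     # two dots: end value is included

puts exclusive.inspect
puts inclusive.inspect
```

Both ranges cover the valid indices of an array of length l, starting at 0.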

>  total=0
>  for j in i..suffix_array_len-1 do
>
>  end
> end


Cheers,
Bob

----
Bob Hutchison
Recursive Design Inc.
http://www.recursive.ca/
weblog: http://xampl.com/so
Posted by Jan E. (jacques1)
on 2012-06-30 20:55
Hi,

If performance is what you're after, I think Ruby is simply the wrong 
language.

You can try different Ruby implementations, you can optimize your code a 
bit, but you'll never come close to the 8 seconds of Java.

I'm not even sure if you've actually adapted Ruby, because the code 
above rather looks like you're translating Java code into Ruby.
Posted by anaray anaray (anaray)
on 2012-07-01 07:02
Thanks Bob for your valuable input. I think that in this case, Ruby with
MRI is the wrong choice. This was just a preliminary test; my work involves
lots of String and array processing on huge data sets.

Recently I tried Ruby, and really loved the language, so i thought of
using it in my next work.

I am going to try JRuby.

Thanks
Posted by anaray anaray (anaray)
on 2012-07-01 07:05
Yes, I am Ruby newbie :)
After posting this, I tried the Ruby way of iteration, but it didn't
help much either. I am still well short of the requirement.

Thanks

Jan E. wrote in post #1066782:
> Hi,
>
> If performance is what you're after, I think Ruby is simply the wrong
> language.
>
> You can try different Ruby implementations, you can optimize your code a
> bit, but you'll never come close to the 8 seconds of Java.
>
> I'm not even sure if you've actually adapted Ruby, because the code
> above rather looks like you're translating Java code into Ruby.
Posted by Andreas S. (andreas)
on 2012-07-01 13:31
Bartosz Dziewoński wrote in post #1066681:
> Still. Benchmark this for me, would you:
>
>    1_000_000_000.times{|a| }
>
> It takes 103 seconds to run on my machine. The following code in C++,
> compiled with -O0, takes four seconds.
>
> int main()
> {
>        for(int i=0; i<1000000000; i++){};
>        return 0;
> }

A loop with no payload is a completely useless benchmarking scenario.
Posted by Bartosz Dziewoński (matmarex)
on 2012-07-01 13:43
(Received via mailing list)
2012/7/1 Andreas S. <lists@ruby-forum.com>:
>
> A loop with no payload is a completely useless benchmarking scenario.

You are wrong. It's the same as benchmarking a loop with a payload,
without benchmarking the payload. You could also consider the payload
to be integer incrementation.

-- Matma Rex
Posted by Bob Hutchison (Guest)
on 2012-07-01 18:36
(Received via mailing list)
On 2012-07-01, at 1:02 AM, anaray anaray wrote:

> Thanks Bob for your valuable input. I think then in this case, Ruby with
> MRI is a wrong choice. This was just preliminary test, my work involves
> lots String, array processing, which involves huge data-set.
>
> Recently I tried Ruby, and really loved the language, so i thought of
> using it in my next work.
>
> I am going to try Jruby

If JRuby turns out to be fast enough for your requirements, I'd suggest 
developing as much as possible using MRI then moving it to JRuby. It'll 
be more pleasant. If you use gems make sure they work in JRuby, some 
won't.

Good Luck!

Cheers,
Bob

----
Bob Hutchison
Recursive Design Inc.
http://www.recursive.ca/
weblog: http://xampl.com/so
Posted by Robert Klemme (robert_k78)
on 2012-07-02 08:34
(Received via mailing list)
On Sun, Jul 1, 2012 at 7:02 AM, anaray anaray <lists@ruby-forum.com> 
wrote:
> Thanks Bob for your valuable input. I think then in this case, Ruby with
> MRI is a wrong choice. This was just preliminary test, my work involves
> lots String, array processing, which involves huge data-set.

Can you be more specific about your use case?  I am asking because
there might actually be other factors which dominate runtime for the
real code you want to execute.  For example, if your data set is huge
and it needs to be read from some disk then there is a good chance
that your application is IO bound and hence looping performance does
not matter that much.  Or you can work with different data structures
to get better results.  Or you implement your core data structures in
a C extension and use Ruby for the business logic.

Kind regards

robert
Posted by Robert Klemme (robert_k78)
on 2012-07-02 08:40
(Received via mailing list)
On Sun, Jul 1, 2012 at 1:42 PM, Bartosz Dziewoński <matma.rex@gmail.com> 
wrote:
> 2012/7/1 Andreas S. <lists@ruby-forum.com>:
>>
>> A loop with no payload is a completely useless benchmarking scenario.
>
> You are wrong. It's the same as benchmarking a loop with a payload,
> without benchmarking the payload. You could also consider the payload
> to be integer incrementation.

I am going to place myself right between your chairs.  I think both of
you have it wrong. :-))

A loop without payload is not completely useless - but it only tells
you about a very artificial test case.  If the payload is more
expensive than the looping and both languages have fewer differences
there (e.g. because logic is IO bound) then the looping test does not
really help much for the real case.  Similarly if the real code does
not involve looping at all because other algorithms are chosen. :-)

Kind regards

robert
Posted by Dan Connelly (djconnel)
on 2012-07-02 22:37
Here's my contribution:

Ruby:
i = 0
10_000_000.times { i += 1 }
puts i

Perl:
for my $n ( 0 .. 1e7 - 1) {
  $i ++
}
print "$i\n"

Result:
Ruby: 2.970 seconds
Perl: 0.682 seconds

Wow...
Posted by Bartosz Dziewoński (matmarex)
on 2012-07-02 22:54
(Received via mailing list)
2012/7/2 Dan Connelly <lists@ruby-forum.com>:
> }
> print "$i\n"
>
> Result:
> Ruby: 2.970 seconds
> Perl: 0.682 seconds
>
> Wow...

The problem is that with every iteration inside #times, Ruby yields the
current "index" to the block - even if the block doesn't use this
variable at all.

And calling the block is no small operation... start here[1] (this is
the #times method, as defined in C) and see how deep the rabbit hole
goes.

[1] http://rxr.whitequark.org/mri/source/numeric.c#3441
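
To illustrate the point about yielding, here is a rough pure-Ruby stand-in for #times. The name my_times is hypothetical and exists only for this sketch; the real method is implemented in C inside MRI and differs in its details. The sketch only shows that the block is invoked once per iteration and receives the index whether or not it uses it.

```ruby
# Hypothetical stand-in for Integer#times, for illustration only.
def my_times(n)
  i = 0
  while i < n
    yield i     # one block invocation per iteration, index included
    i += 1
  end
  n             # like the built-in, return the receiver's value
end

calls = 0
my_times(5) { calls += 1 }      # block ignores the yielded index

seen = []
5.times { |idx| seen << idx }   # the built-in yields 0 up to 4

puts calls
puts seen.inspect
```

A block that declares no parameters still gets called with the index; Ruby simply discards the extra argument.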

-- Matma Rex
Posted by Dan Connelly (djconnel)
on 2012-07-03 02:06
I should provide version numbers... the RHEL on work machines is pretty 
backwards.

Ruby: 1.8.1
Perl: 5.8.8

I ran on my personal Macbook Air with Ruby 1.9.3 and Perl 5.10, and Perl 
was only 2x faster.  So it seems Ruby 1.9 is indeed substantially faster 
(assuming Perl is fairly stable).

Curious when I did the loop in C how huge the speed increase was with 
-O1 versus -O0...

#include <stdio.h>
int main(void) {
  int i = 0;
  while (++i < 10000000);
  printf("i = %d", i);
  return 0;
}

but that's a bit off-topic...
Posted by Peter Zotov (Guest)
on 2012-07-03 02:27
(Received via mailing list)
Dan Connelly wrote on 03.07.2012 04:06:
> I should provide version numbers... the RHEL on work machines is
> pretty
> backwards.
>
> Ruby: 1.8.1
> Perl: 5.8.8

1.8.1?!! That's just ancient. I doubt it's compatible with any recent
gems. It seems that you need at least 1.8.6 even for the most
compatible of them.

> #include <stdio.h>
> int main(void) {
>   int i = 0;
>   while (++i < 10000000);
>   printf("i = %d", i);
>   return 0;
> }
>
> but that's a bit off-topic...

I don't know your precise GCC version, so I won't comment on the
particular optimizations my GCC version performs, but I'd advise you to
read the relevant assembly. The change could be anything from binding
`i' to a register to pre-computing the entire loop (yeah, recent GCC
can do that).
Posted by botp (Guest)
on 2012-07-03 03:36
(Received via mailing list)
On Tue, Jul 3, 2012 at 8:06 AM, Dan Connelly <lists@ruby-forum.com> 
wrote:
> I ran on my personal Macbook Air with Ruby 1.9.3 and Perl 5.10, and Perl
> was only 2x faster.

cmon, with objects, speed becomes relative...
those who want just speed can leave ruby...


$ for x in "ruby ruby1.rb" "ruby ruby2.rb" "perl perl1.pl"; do echo $x; time $x; echo; done

ruby ruby1.rb
10000000

real  0m0.875s
user  0m0.930s
sys  0m0.040s

ruby ruby2.rb
10000001

real  0m0.312s
user  0m0.340s
sys  0m0.020s

perl perl1.pl

real  0m0.518s
user  0m0.520s
sys  0m0.000s

botp@u:~
$ for x in "ruby1.rb" "ruby2.rb" "perl1.pl"; do echo \#$x; cat $x; echo; done

#ruby1.rb
i = 0
10_000_000.times { i += 1 }
puts i




#ruby2.rb
i=0
while i <= 10_000_000
  i += 1
end

puts i

#perl1.pl
#Perl:
for my $n ( 0 .. 1e7 - 1) {
   $i ++
}
Posted by Dan Connelly (djconnel)
on 2012-07-03 05:19
botp wrote in post #1067101:
> #ruby2.rb
> i=0
> while i <= 10_000_000
>   i += 1
> end
>
> puts i
>
> #perl1.pl
> #Perl:
> for my $n ( 0 .. 1e7 - 1) {
>    $i ++
> }

This is a slight cheat, since my examples had addition independent of 
loop control.  But nevertheless if I do the following:

i, j = 0,0
while j < 10_000_000
  i+=1
  j+=1
end
puts "i = #{i}"

Then the Perl and Ruby are very close in speed (versions 5.10, 1.9), 
which was non-obvious to me, since on the face of it,

10_000_000.times { i += 1 }

seems cleaner (seems the details are in how the block is invoked in each 
case).

It makes me wonder if there's room for a more efficient looping 
construct, since a while-loop is clumsy.  But perhaps performance is an 
insufficient reason to justify language changes.
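
For anyone who wants to repeat the comparison, a minimal sketch that times both loop styles inside one process. The helper names (seconds, count_with_times, count_with_while) and the iteration count are assumptions made for this example; absolute numbers will vary with the Ruby version and machine, so none are claimed here.

```ruby
# Hypothetical helper: wall-clock seconds taken by the given block.
def seconds
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

# Counts to n with Integer#times, which calls a block on every iteration.
def count_with_times(n)
  i = 0
  n.times { i += 1 }
  i
end

# Counts to n with a plain while loop, which involves no block call.
def count_with_while(n)
  i = 0
  j = 0
  while j < n
    i += 1
    j += 1
  end
  i
end

n = 1_000_000
puts format('times: %.3f s', seconds { count_with_times(n) })
puts format('while: %.3f s', seconds { count_with_while(n) })
```

Both methods do the same additions; the only intended difference is whether a block is invoked on each pass.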
Posted by Lars Haugseth (Guest)
on 2012-07-03 10:47
(Received via mailing list)
On 07/03/2012 03:34 AM, botp wrote:
>
> $ for x in "ruby ruby1.rb" "ruby ruby2.rb" "perl perl1.pl"; do echo
> $x;  time $x;echo ; done

Note that the overhead for starting up and exiting the interpreter has
an impact on your results when you benchmark like that.

  $ time for i in {0..100}; do ruby -e ''; done

  real  0m2.395s
  user  0m1.600s
  sys  0m0.628s

  $ time for i in {0..100}; do perl -e ''; done

  real  0m0.526s
  user  0m0.176s
  sys  0m0.300s

(Ruby 1.9.3, Perl 5.14.2.)
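
One way to separate the two costs from within Ruby itself, as a rough sketch: time the launch of an interpreter running an empty script, then time a workload in the already-running process. The helper name elapsed and the workload size are assumptions made for this example, and it relies on Process.clock_gettime, which is not available on Rubies as old as the ones discussed here.

```ruby
require 'rbconfig'

# Hypothetical helper: wall-clock seconds taken by the given block.
def elapsed
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

# Cost of launching the current interpreter on an empty script.
startup = elapsed { system(RbConfig.ruby, '-e', '') }

# Cost of the workload alone, measured inside the running process.
workload = elapsed do
  i = 0
  1_000_000.times { i += 1 }
end

puts format('startup:  %.3f s', startup)
puts format('workload: %.3f s', workload)
```

Timing with the shell's time command reports roughly the sum of the two, which is why short benchmarks look worse than the loop alone would suggest.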
Posted by Ryan Davis (Guest)
on 2012-07-03 12:30
(Received via mailing list)
On Jul 3, 2012, at 01:47 , Lars Haugseth wrote:

> user  0m1.600s
> sys  0m0.628s
>
> $ time for i in {0..100}; do perl -e ''; done
>
> real  0m0.526s
> user  0m0.176s
> sys  0m0.300s
>
> (Ruby 1.9.3, Perl 5.14.2.)

Oh my god... are you guys still mentarbating over this crap?

First off... compare apples to apples. ruby 1.9 loads rubygems by 
default.

4605 % time for i in {0..100}; do ruby19 -e ''; done

real  0m3.089s
user  0m2.066s
sys  0m0.663s

4606 % time for i in {0..100}; do ruby19 --disable-gems -e ''; done

real  0m1.414s
user  0m0.771s
sys  0m0.438s

Second. Be glad you're not running python or anything on the jvm 
(compare `time gem list` vs `time jgem list`):

4604 % time for i in {0..100}; do python -c ''; done

real  0m14.186s
user  0m11.212s
sys  0m1.717s

Third... WHO CARES?!?!? This is all bullshit comparisons for bullshit 
reasons. You don't use ruby for speed of runtime. You use it because 
you'll be done AND have profiled and optimized the bottlenecks before 
they're done. You use it because it's a great language to develop in. 
You use it to GET STUFF DONE and AVOID THESE THREADS.
Posted by Jan E. (jacques1)
on 2012-07-03 13:32
Ryan Davis wrote in post #1067161:
> Third... WHO CARES?!?!? This is all bullshit comparisons for bullshit
> reasons. You don't use ruby for speed of runtime. You use it because
> you'll be done AND have profiled and optimized the bottlenecks before
> they're done. You use it because it's a great language to develop in.
> You use it to GET STUFF DONE and AVOID THESE THREADS.

I agree, and I think the whole benchmarking has nothing to do with the 
original topic.

The OP was talking about "huge arrays", so I rather think *this* is the 
bottleneck. Using a specialized language for data processing (maybe SQL) 
will probably make much more sense than trying to make Ruby some 
milliseconds faster.

I'm no expert, but processing "huge arrays" in Ruby or even Java just 
sounds wrong to me.
Posted by Dan Connelly (djconnel)
on 2012-07-03 14:41
Jan E. wrote in post #1067169:
> Ryan Davis wrote in post #1067161:
>> Third... WHO CARES?!?!? This is all bullshit comparisons for bullshit
>> reasons.


This sounds like a religious discussion.

I write code which does non-trivial calculations I'd prefer to run 
faster.  I also want it to be easier to run and maintain.  If there was 
really a factor of two difference between Ruby and Perl, that would be 
an issue: it would bias me towards Perl.  And if one type of loop were 
really much faster than another, I'd tend to want to use that loop.

It's not speed over everything else.  But speed is a good thing to know 
about.  Engineering decisions are all about trade-offs.
Posted by Andreas S. (andreas)
on 2012-07-03 15:40
Dan Connelly wrote in post #1067093:
> Curious when I did the loop in C how huge the speed increase was with
> -O1 versus -O0...

No surprise there; the loop doesn't do anything, so the optimizer can 
reduce the program to "return 0". And that's what is wrong with a 
benchmark like this. Rule of thumb, if you have to disable the optimizer 
to get a result, your benchmark is useless.
Posted by Chad Perrin (Guest)
on 2012-07-03 19:12
(Received via mailing list)
On Tue, Jul 03, 2012 at 12:19:52PM +0900, Dan Connelly wrote:
> puts "i = #{i}"
Why does j even exist in this?

    i = 0
    while i < 10_000_000
      i += 1
    end
    puts "i = #{i}"

Is the j there in some attempt to simulate a case where the loop
increment variable is not redundant with the operations taking place
within the loop?
Posted by Dan Connelly (djconnel)
on 2012-07-03 19:36
Chad Perrin wrote in post #1067210:

> Is the j there in some attempt to simulate a case where the loop
> increment variable is not redundant with the operations taking place
> within the loop?

Yes -- I wanted to "do something" in the loop, so I incremented a 
counter.  I admit it's a crappy benchmark; I wanted something simple but 
not completely trivial.
Posted by Ryan Davis (Guest)
on 2012-07-03 21:21
(Received via mailing list)
On Jul 3, 2012, at 05:41 , Dan Connelly wrote:

> really a factor of two difference between Ruby and Perl, that would be
> an issue: it would bias me towards Perl.  And if one type of loop were
> really much faster than another, I'd tend to want to use that loop.
>
> It's not speed over everything else.  But speed is a good thing to know
> about.  Engineering decisions are all about trade-offs.

If you pretend that this thread is full of scientific study or 
engineering decisions, then it is you who are involved in the religious 
discussion. The code/numbers being bandied about here are neither 
rigorous, nor informative, nor pertinent to the OP's actual needs.
Posted by Dan Connelly (djconnel)
on 2012-07-03 23:48
Whoa -- excuse me.  I try to run some quick numbers out of curiosity and 
I get slammed.  I didn't realize this forum met the standards of 
peer-review literature in its content.
Posted by Michael Shigorin (Guest)
on 2012-07-04 09:38
(Received via mailing list)
On Wed, Jul 04, 2012 at 06:48:14AM +0900, Dan Connelly wrote:
> Whoa -- excuse me.

Hey, excuse them.