Ruby performance

Hi,

I’m new to Ruby, and I’ve just tried to measure its performance
compared to Python, and got some interesting results.

The Ruby test script is:

#!/usr/bin/env ruby

s = ""
i = 0
while line = gets
  s += line
  i += 1
  puts(i) if i % 1000 == 0
end

and the Python one:

#!/usr/bin/env python

import sys

s = ""
i = 0
for line in sys.stdin:
    s += line
    i += 1
    if i % 1000 == 0: print i

I fed a 1 MB file as input to each of these scripts. The strange
thing is that the Ruby version prints its progress more and more
slowly as the string s grows. The Python version ran smoothly.

Looks like a memory management issue with Ruby. I wonder if this is
going to be significantly improved in subsequent releases of Ruby? I
was testing with Ruby 1.8.6, and Python 2.5.1. (I also tried Python
2.3 and it was significantly slower than 2.5)

As I said, I’m new to Ruby, but want to use it for a long-term
project, and would like to know more about its performance specifics.

Thanks for any clarification in advance!

On Sat, 29 Sep 2007, Vasyl S. wrote:

The strange thing is that the Ruby version starts to output progress
slower and slower as the s string grows. The Python version went
smoothly.

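This is because “string” + “otherstring” allocates a new string,
“stringotherstring”, and copies both operands into it. Doing that on
every line means the whole accumulated string is copied again each
time, so each iteration gets slower as s grows.

Use << instead; it appends “otherstring” to the original “string”
in place.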

s = ""
i = 0
while line = gets
  s << line
  i += 1
  puts(i) if i % 1000 == 0
end

Performance should be a lot better with that version.
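
For anyone who wants to check this on their own machine, here is a rough sketch of a comparison using the standard Benchmark library. The generated input lines and the two helper method names are made up for illustration (they stand in for the lines read by gets), and the timings will vary with the Ruby version and the input size.

```ruby
require 'benchmark'

# Hypothetical input: 5,000 short lines standing in for lines read by `gets`.
lines = Array.new(5_000) { |n| "line #{n}\n" }

# Accumulate with +=: every step builds a new string and copies
# everything accumulated so far into it.
def concat_with_plus(lines)
  s = ""
  lines.each { |line| s += line }
  s
end

# Accumulate with <<: every step appends to the same string in place.
def concat_with_append(lines)
  s = "".dup # dup guarantees a mutable string
  lines.each { |line| s << line }
  s
end

Benchmark.bm(4) do |bm|
  bm.report("+=") { concat_with_plus(lines) }
  bm.report("<<") { concat_with_append(lines) }
end
```

Both methods build the same string; only the amount of copying differs, so the gap should widen as the input grows.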

Kirk H.

On Sep 28, 2007, at 11:06 AM, [email protected] wrote:


That should be on a Ruby Optimization list somewhere!

On Sep 28, 7:06 pm, [email protected] wrote:

Use << instead, and it will concatenate “otherstring” to the original
“string”.

Kirk H.

Indeed, fast as hell!

Thank you, Kirk!

On 28 Sep 2007, at 19:21, John J. wrote:

That should be on a Ruby Optimization list somewhere!

Weird. The Python version is extremely slow compared to Kirk’s Ruby
version. Does it suffer from the same allocation problems as the
initial Ruby version?

/C

Christoffer Lernö wrote:

Weird. The Python version is extremely slow compared to Kirk’s Ruby
version. Does it suffer from the same allocation problems as the
initial Ruby version?

Yes. For a + b, Python allocates a new string of length
len(a) + len(b), then copies the contents of a and b into it. The only
optimization is for the case where a or b is empty. In Ruby, a << b
first grows a’s memory by the length of b, then copies only b’s
content into it. If b is empty, Ruby does nothing. Concatenation is
more expensive in Python because its strings are immutable.
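
The Ruby side of that difference can be seen directly by checking object identity. This is only a small sketch; the two method names are invented for the illustration.

```ruby
# a + b returns a new String object and leaves the receiver untouched.
def plus_returns_new_object?
  a = "string".dup
  b = a + "otherstring"
  !b.equal?(a) && a == "string"
end

# a << b appends to the receiver, which stays the same object.
def append_mutates_receiver?
  a = "string".dup
  id_before = a.object_id
  a << "otherstring"
  a.object_id == id_before && a == "stringotherstring"
end

puts plus_returns_new_object?
puts append_mutates_receiver?
```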

On Sep 28, 8:59 am, Vasyl S. [email protected] wrote:

Some things that optimize Python code but don’t work in Ruby, except
to make the Ruby code run more slowly:

1). Read the entire contents of the file using a single operation.
2). Use list comprehensions (LC) to iterate over the entire file, for
example to count or gather lines.
3). Use Psyco to dramatically boost the speed of the Python code to
near machine-code speeds.
4). Use the string translate function to process the contents of the
file in cases where this sort of thing is needed.

I was able to read a 20 MB file using Python code that ran 10x to 30x
faster than the fastest Ruby code that attempted to read the file into
an array of 64 bit values. In my case my code needed to process every
single character in the 20 MB file by setting the MSB of each
character and then write the file back out.
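
For reference, the task itself (set the MSB of every byte and write the result back out) might look roughly like this in Ruby. This is not the code that was benchmarked above and no speed claim is made for it; the method names and the file paths passed to convert_file are hypothetical, and it assumes a Ruby recent enough that String#bytes returns an array.

```ruby
# Set the most significant bit (0x80) of every byte in a string.
def set_msb(data)
  data.bytes.map { |byte| byte | 0x80 }.pack("C*")
end

# Read the whole source file in one call, convert it, and write it
# out in one call. Both paths are placeholders supplied by the caller.
def convert_file(src_path, dst_path)
  File.binwrite(dst_path, set_msb(File.binread(src_path)))
end
```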

Hey all

I’ve been meaning to send out this email for a while.

In light of recent criticism and desperation for faster Ruby code,
what are some common practices (little things) that will make code
run faster? IE, if-then or case-when, etc.

So far all I have is:
<< instead of +

-------------------------------------------|
~ Ari

From now on, when giving examples, instead of ie use ff.

On Sep 28, 2:28 pm, Ruby M. [email protected] wrote:

Some things that optimize Python code but don’t work in Ruby, except
to make the Ruby code run more slowly:

1). Read the entire contents of the file using a single operation.
2). Use list comprehensions (LC) to iterate over the entire file, for
example to count or gather lines.
3). Use Psyco to dramatically boost the speed of the Python code to
near machine-code speeds.

Bull. Psyco isn’t even as fast as LuaJIT.

4). Use String translate function to process the contents of the file
in cases where this sort of thing is needed.

I was able to read a 20 MB file using Python code that ran 10x to 30x
faster than the fastest Ruby code that attempted to read the file into
an array of 64 bit values. In my case my code needed to process every
single character in the 20 MB file by setting the MSB of each
character and then write the file back out.

Since Psyco only had to execute one command, “translate”, in order to
convert the entire contents of the file, the program was very fast.
Most programs are much slower under Psyco.

On Sun, 2007-09-30 at 06:34 +0900, Ari B. wrote:

In light of recent criticism and desperation for faster Ruby code,
what are some common practices (little things) that will make code
run faster? IE, if-then or case-when, etc.

So far all I have is:
<< instead of +

The most general advice is to minimize the number of method calls you
make, and try to create fewer intermediate objects. Using << instead of
+ to accumulate strings satisfies the latter.

Another specific instance would be that it usually performs better to
do:

def foo(x)
  x.y { yield }
end

than:

def foo(x, &block)
  x.y(&block)
end

(Of course the latter form is still required if #y retains the block as
a Proc object, rather than calling it directly.)
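
A quick way to compare the two forms is sketched below. The Receiver class and the foo_yield / foo_block names are invented for the example, and the size of the difference (if any) depends heavily on the Ruby version, since later interpreters may handle &block passing differently.

```ruby
require 'benchmark'

# Hypothetical receiver whose #y simply calls the block it is given.
class Receiver
  def y
    yield
  end
end

# Form 1: wrap the caller's block in a new block and yield to it.
def foo_yield(x)
  x.y { yield }
end

# Form 2: capture the caller's block as a parameter and pass it along.
def foo_block(x, &block)
  x.y(&block)
end

receiver = Receiver.new
n = 200_000

Benchmark.bm(8) do |bm|
  bm.report("yield")  { n.times { foo_yield(receiver) { :ok } } }
  bm.report("&block") { n.times { foo_block(receiver) { :ok } } }
end
```

Both forms return whatever the block returns, so they are interchangeable whenever #y only calls the block.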

-mental