Minor 1.8 <-> 1.9 performance comparison gotcha


#1

Excited and inspired by the Ruby 1.9.1 release, I wrote a blog post
the other day talking about how much faster Prawn’s examples and tests
run on Ruby 1.9 out of the box. Of course, this isn’t scientific
benchmarking but it seemed like a reasonable way to say “Across the
board, things run much faster”.

As it turns out, there was a fatal flaw in my (admittedly
unscientific) analysis. My Ruby 1.8.6 was built without any
compile-time optimization, while my Ruby 1.9 build had been set to
-O2. The difference was huge and completely skewed my results.

This is almost certainly a silly mistake that most people who have
experience comparing manually compiled software would not make, but
I’m sure it’ll trip up some folks who don’t know what to look for. So
the lesson I learned was: If you’re doing performance comparisons,
either to share with the community or for your own needs, be sure to
check your CFLAGS.

$ ruby -e "require 'rbconfig'; puts Config::CONFIG['CFLAGS']"
$ ruby19 -e "require 'rbconfig'; puts Config::CONFIG['CFLAGS']"

If the optimization levels don’t match, all bets are off :)
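
As a rough sketch of the same check in script form, the snippet below prints the interpreter version next to the optimization flags it was built with, so the output of two interpreters can be compared side by side. The helper names (optimization_flags, build_summary) are made up for illustration, and it assumes the RbConfig constant is available; older builds may only expose it as Config, as in the commands above.

```ruby
require 'rbconfig'

# Collect every -O flag found in a CFLAGS string, in the order given.
# (Hypothetical helper, not part of Ruby itself.)
def optimization_flags(cflags)
  cflags.to_s.split.select { |flag| flag =~ /\A-O/ }
end

# Summarize the running interpreter's build so two runs can be compared.
# Assumes RbConfig is defined; older builds may only provide Config.
def build_summary
  cflags = RbConfig::CONFIG['CFLAGS']
  {
    :version   => RUBY_VERSION,
    :cflags    => cflags,
    :opt_flags => optimization_flags(cflags)
  }
end

summary = build_summary
opt = summary[:opt_flags]
puts "ruby #{summary[:version]}"
puts "CFLAGS: #{summary[:cflags]}"
puts "optimization flags: #{opt.empty? ? '(none)' : opt.join(' ')}"
```

Run it once under each interpreter being benchmarked; if the last line differs between them, the timings are probably not comparable.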

You can see the before[0] and after[1] posts if you wish. Otherwise,
sorry to spam those in the know.
Hopefully this quick warning will be helpful to those who might not
have considered it otherwise.

-greg

PS: Jamis B. helped me track this issue down, so if he’s reading:
Thanks!

[0] http://blog.majesticseacreature.com/archives/2009.01/prawn_and_ruby19.html
[1] http://blog.majesticseacreature.com/archives/2009.02/lies_and_statistics.html


#2

On Sun, Feb 1, 2009 at 5:29 PM, Gregory B.
removed_email_address@domain.invalid wrote:

results.

[0] http://blog.majesticseacreature.com/archives/2009.01/prawn_and_ruby19.html
[1] http://blog.majesticseacreature.com/archives/2009.02/lies_and_statistics.html

It’s been a while since I recompiled either of them, but as long as
you’re going to recompile:

  1. As far as I know, ‘-O3’ is safe for both 1.8.x and 1.9.x. I haven’t
    measured the performance improvement recently, but someone did a while
    back and it was at least measurable.

  2. You should also take advantage of the GCC ‘-march=’ and ‘-mtune=’
    options and specify exactly what kind of processor(s) you have. You
    may have to do a Google search of the GCC documentation on their web
    site to find the exact list of processor types, but it is usually
    worth it. Note that most Linux distros, with the happy exception of
    openSUSE, provide binaries compiled with only i386 processor features
    on 32-bit x86 machines. openSUSE compiles with i586 on 32-bit x86
    machines. I’m not sure you can even fit enough RAM in an i386 to run
    the 2.6 kernel, let alone a desktop or a Ruby interpreter. :)


M. Edward (Ed) Borasky

I’ve never met a happy clam. In fact, most of them were pretty steamed.


#3

M. Edward (Ed) Borasky wrote:

  2. You should also take advantage of the GCC ‘-march=’ and ‘-mtune=’
    options and specify exactly what kind of processor(s) you have.

http://gcc.gnu.org/gcc-4.2/changes.html says:

-mtune=native and -march=native will produce code optimized for the
host architecture as detected using the cpuid instruction.

Has anyone tried this with ruby?


#4

M. Edward (Ed) Borasky wrote:

on my Linux Athlon64 X2:

cd
export CFLAGS='-O3 -march=athlon64 -mtune=athlon64'
./configure
make
make test # be sure the compiler didn’t break something!!!
make benchmark # 1.9.x only!
sudo make install

What about

CFLAGS='-O3 -march=native -mtune=native'

I was assuming that’s how the new feature in gcc 4.2 works (only got 4.1
here at the moment).


#5

On Sun, Feb 1, 2009 at 5:55 PM, Joel VanderWerf
removed_email_address@domain.invalid wrote:

Has anyone tried this with ruby?

I have done it, but not recently. It’s pretty easy to do. For example,
on my Linux Athlon64 X2:

cd
export CFLAGS='-O3 -march=athlon64 -mtune=athlon64'
./configure
make
make test # be sure the compiler didn’t break something!!!
make benchmark # 1.9.x only!
sudo make install


M. Edward (Ed) Borasky



#6

Joel VanderWerf wrote:

What about

CFLAGS='-O3 -march=native -mtune=native'

I was assuming that’s how the new feature in gcc 4.2 works (only got
4.1 here at the moment).

That should work. Actually, I need to read the manual – you don’t need
both “-march” and “-mtune” IIRC, and one of them is “deprecated”.

gcc --version
gcc (SUSE Linux) 4.3.2 [gcc-4_3-branch revision 141291]


M. Edward (Ed) Borasky



#7

In article
removed_email_address@domain.invalid,
M. Edward (Ed) Borasky removed_email_address@domain.invalid wrote:

  2. You should also take advantage of the GCC ‘-march=’ and ‘-mtune=’
    options and specify exactly what kind of processor(s) you have. You
    may have to do a Google search of the GCC documentation on their web

Last time I tried trunk of 1.9 with llvm-gcc-4.2/4.3, it didn’t work.

I’ll try again.


#8

On Mon, Feb 2, 2009 at 3:18 AM, M. Edward (Ed) Borasky
removed_email_address@domain.invalid wrote:

cd
export CFLAGS='-O3 -march=athlon64 -mtune=athlon64'
./configure
make

I do:
export CFLAGS="-O3 -march=native -mtune=native"
./configure
make

and see

gcc -O3 -march=native -mtune=native -O2 -g -Wall -Wno-parentheses (…)

Is it OK if my options go before the default options passed to gcc? I
wonder which one gcc will use, -O2 or -O3?


Regards,

Radosław Bułat
http://radarek.jogger.pl - my blog


#9

M. Edward (Ed) Borasky wrote:

gcc --version

gcc (SUSE Linux) 4.3.2 [gcc-4_3-branch revision 141291]

OK … here’s the manual page for x86 and x86_64:

http://gcc.gnu.org/onlinedocs/gcc-4.3.3/gcc/i386-and-x86_002d64-Options.html#i386-and-x86_002d64-Options

Essentially, if you’re compiling and always running on the same system,
the magic code is “-march=native”. For older versions of GCC that don’t
have “native”, use “-march=” followed by your specific processor type.

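The difference between “-march” and “-mtune” is documented there as
well. Basically, “-march” lets GCC use the instructions of the named
architecture variant, so the code is guaranteed to run only on that
variant (or a compatible one), while “-mtune” only tunes the code for a
processor without restricting where it runs.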


M. Edward (Ed) Borasky



#10

Radosław Bułat wrote:

make

and see

gcc -O3 -march=native -mtune=native -O2 -g -Wall -Wno-parentheses (…)

Is it OK if my options go before the default options passed to gcc? I
wonder which one gcc will use, -O2 or -O3?

Well … gcc will take the last option it sees, so it will use -O2 :(.
There is a work-around, though. After you unpack the 1.9.1 tarball,
there will be a file in the main directory called “configure.in”. Around
line 193 you’ll see a definition for “optflags”:

if test "$GCC" = yes; then
    linker_flag=-Wl,
    : ${optflags=-O2} ${warnflags="-Wall -Wno-parentheses"}
else
    linker_flag=
fi

Change the “-O2” to “-O3”. Then do

autoconf
export CFLAGS="-march=native"
./configure
make

Then the compile lines should look like this

gcc -march=native -O3 -g -Wall -Wno-parentheses (…)

Make sure you look at the output from “make test” to verify that the
extra level of optimization didn’t cause bugs. In fact, you should
probably run “RubySpec” against the resulting Ruby interpreter as well.
:)
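
To double-check which optimization level a compile line ends up with, here is a small sketch that applies the "last -O flag wins" behaviour described above to the two gcc lines quoted in this thread. The helper name effective_opt_flag is made up for illustration, and the truncated "(…)" part of each line is simply left out.

```ruby
# Return the -O flag that takes effect on a compile line, assuming the
# last one given wins (the gcc behaviour described in this thread).
# Returns nil when the line contains no -O flag at all.
def effective_opt_flag(compile_line)
  compile_line.split.select { |arg| arg =~ /\A-O/ }.last
end

# Compile lines as quoted above, without the elided trailing arguments.
before = 'gcc -O3 -march=native -mtune=native -O2 -g -Wall -Wno-parentheses'
after  = 'gcc -march=native -O3 -g -Wall -Wno-parentheses'

puts effective_opt_flag(before)  # prints -O2
puts effective_opt_flag(after)   # prints -O3
```

Pasting a line from your own "make" output into the helper is a quick way to confirm that the configure.in change actually took effect.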


M. Edward (Ed) Borasky
