For performance, write it in C - Part 2, comparing C, Ruby a

This is the follow up to my “Write it in C” post and is intended to
report the timings for the Java implementation that I said I would write
for Charles O Nutter and the Ruby version by Simon Kroeger. First let us
deal with the Ruby version.

The program differs from the Perl and C versions in that the various
values it requires are not precomputed. Simon’s program is completely
self contained.

[Latin]$ time ruby latin.rb 5 > r5

real 0m35.793s
user 0m32.081s
sys 0m0.843s

This quite clearly pisses all over the Perl version, and yes the results
were correct. It is both faster than the Perl version and considerably
less code, a testament to the power and expressiveness of Ruby.

Now the Java version. I will be honest here, I might be paid to program
in Java but it hasn’t been my language of choice since around 1992. I
find it gets in my way and today it found yet another way to do it.

A straight translation like the C version worked fine for a 4 x 4 grid
but when I got to the 5 x 5 grid I got the following error ‘code too
large’. Yes Java has hard coded limits as to the allowed size of various
data structures within class files and the Compared array of 120 x 120
boolean values could not be initialised with the following code:

private static boolean[][] Compared = {
{false, false, …

{true, true, …
};

I had to have a whole load of ‘Compared[0][44] = true;’ and the like to
get the data in. This got the 5 x 5 grid to run but the 6 x 6 grid blew
up even that. Java has a 64KB limit for various structures in the class
file (see Oracle Java Technologies | Oracle).
The last time that I had to work round such mind numbingly arbitrary
limits was when I was programming Quick Basic. Now the timings.

[Latin]$ time ./j_version.sh 5 > j5

real 0m29.553s
user 0m13.813s
sys 0m10.745s

Sorry Java fans but “as fast as C” or “faster than C” it is not. It’s
only a bit faster than Ruby despite far more resources being dedicated
to speeding it up.

The really odd thing here is that Java should actually be much faster
than this. I did manage to get the 4 x 4 grid to be written with the
same initialisation method as the C version and the timings (admittedly
on a much smaller problem) were much closer to the C version for the
same 4 x 4 grid. The solution just didn’t scale because of the 64KB
limit in the class files, which is probably not going to change any
time in the near future.

In the interest of fairness I also looked at the timings of just the
execution of the C and Java versions so that the performance of the
compilers was not impacting the times. So here are the C and Java
versions without the precomputing phase and without the compiling.

[Latin]$ time ./latin > /dev/null 2>&1

real 0m1.961s
user 0m1.680s
sys 0m0.051s

[Latin]$ time java Latin > /dev/null 2>&1

real 0m15.483s
user 0m9.641s
sys 0m4.280s

There you have it, C is still faster by an order of magnitude.
Performance is yours for the asking, but it comes at a price - you have
to write it in C. Ease of development also comes at a price, you don’t
get the same performance as C. Of course if you have a fear of C this
does show that you can go some of the way by converting to Java, if that
is fast enough for you then well and good but know this, C is faster.

Peter H. wrote:

snip

Now the Java version. I will be honest here, I might be paid to program
in Java but it hasn’t been my language of choice since around 1992. I
find it gets in my way and today it found yet another way to do it.

snip

does show that you can go some of the way by converting to Java, if that
is fast enough for you then well and good but know this, C is faster.

When people want ‘speed’ they care about how fast the code runs, not
JVM startup time…

Your benchmarks are utterly irrelevant, sorry.

Isak

Isak H. wrote:

When people want ‘speed’ they care about how fast the code runs, not
JVM startup time…

Your benchmarks are utterly irrelevant, sorry.

Isak

Interesting, so just how do you run a Java program without the JVM
start-up time?

And if you can’t run a Java program without the JVM start-up then your
point is what exactly?

Peter H. wrote:

Interesting, so just how do you run a Java program without the JVM
start-up time?

And if you can’t run a Java program without the JVM start-up then your
point is what exactly?

It’s like measuring database performance and including the startup time
of the database server. Sure, mysql startup is faster than Oracle’s, but
does it make sense? I don’t know…

Regards,
Roland

On 7/28/06, Peter H. wrote:

Interesting, so just how do you run a Java program without the JVM start-up time?

You can time it inside java by fetching the system clock before and
after.
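
A minimal sketch of that suggestion, for illustration only. It uses System.nanoTime() rather than the wall clock (System.currentTimeMillis() would also fit what is described), the loop is a stand-in for the real solver, and the names InProcessTimer, workload and timeNanos are made up for this example.

```java
class InProcessTimer {
    // Stand-in workload (hypothetical); the real program would call addARow(0) here.
    static long workload() {
        long sum = 0;
        for (int i = 0; i < 10_000_000; i++) {
            sum += i % 7;
        }
        return sum;
    }

    // Runs the task once and returns the elapsed time in nanoseconds.
    static long timeNanos(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        long elapsed = timeNanos(() -> workload());
        System.out.printf("elapsed: %.3f ms%n", elapsed / 1_000_000.0);
    }
}
```

This measures only what happens between the two clock reads, so JVM start-up and class loading before main are left out of the figure.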

And if you can’t run a Java program without the JVM start-up then your point is
what exactly?

It makes sense to do that for long running applications but in this
case it doesn’t. If what you really wanted to do was calculate this
latin squares thing then the startup time matters. Are there any JVMs
around that keep a shared daemon running for all processes to share so
as to avoid some of the startup time?

Pedro.

From: Isak H.
Sent: Friday, July 28, 2006 1:15 PM

When people want ‘speed’ they care about how fast the code runs, not
JVM startup time…

Of course each benchmark is to be taken with a lot of caution, but I
doubt the JVM takes more than 13 seconds to start, even on a slow
machine.

cheers

Simon

I’m not too sure that your analogy holds. It’s not like I included the
time to turn on my computer, log in, get the command prompt and type in
the commands. Do you really think that it is unreasonable to include the
start up for the JVM when that is what you have to do to run the program?

That would only get me the elapsed time of the execution which does not
reflect the actual time taken to run the program. Some higher priority
task may switch in and suspend the Java program before it completes and
thus give it a much worse timing. And then you can bet they would be
pointing this out and saying that the timings were ‘utterly irrelevant’.
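
For what it is worth, the JVM can also report CPU time rather than elapsed time from inside the program, through the standard java.lang.management API. The sketch below is only an illustration: CpuTimer and cpuNanos are hypothetical names, and the figure covers the calling thread only (not the garbage collector, the JIT compiler threads or JVM start-up), so it is not the same number as the user and sys lines printed by time.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

class CpuTimer {
    // Returns the CPU time in nanoseconds used by the current thread while
    // running the task, or -1 if this JVM cannot report it.
    static long cpuNanos(Runnable task) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        if (!threads.isCurrentThreadCpuTimeSupported()) {
            return -1;
        }
        long start = threads.getCurrentThreadCpuTime();
        if (start < 0) {
            return -1; // CPU time measurement is disabled
        }
        task.run();
        return threads.getCurrentThreadCpuTime() - start;
    }

    public static void main(String[] args) {
        long cpu = cpuNanos(() -> {
            // Stand-in workload (hypothetical), not the Latin squares solver.
            long sum = 0;
            for (int i = 0; i < 10_000_000; i++) {
                sum += i % 7;
            }
            if (sum < 0) {
                System.out.println("unreachable");
            }
        });
        System.out.println("thread CPU time (ns): " + cpu);
    }
}
```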

For most people they have to start up the JVM to run a Java program so I
cannot in all honesty see this as being ‘utterly irrelevant’. Hey the
executable created from the C source has to be loaded into memory before
it is run so this means that the timings for the C version are also
‘utterly irrelevant’.

It looks like a troll. It posts like a troll. It is a troll.

On 7/28/06, Roland S. wrote:

of the database server. Sure, mysql startup is faster than Oracle’s, but
does it make sense? I don’t know…

And mysql is as fast as Oracle, even faster unless Oracle is very, very
well tuned.

But of course you do have your point.

Nevertheless startup time matters for some programs, so Peter’s post
was not useless. I feel he gets criticized a little bit too much for his
post.

hang in there Peter
Cheers
Robert

Regards,

Roland


Two things are infinite: the universe and human stupidity; and as for
the universe, I have not acquired absolute certainty.

  • Albert Einstein

Peter H. wrote:

I’m not too sure that your analogy holds. It’s not like I included the
time to turn on my computer, log in, get the command prompt and type in
the commands. Do you really think that it is unreasonable to include the
start up for the JVM when that is what you have to do to run the program?

But startup time for the computer and so on is the same regardless of
executing a Java or C or Ruby program next… Sorry, my intention was
not to criticize your statements and I think Robert is right here when
he says:

I feel he gets criticized a little bit too much for his post.

But I think there are two valid viewpoints.
I’m a professional Java developer, so at 8.00am I start the Eclipse
IDE and use it until 6.00pm when I turn my computer off. In this case
the startup time for the Java VM is not of interest when the overall
performance of Eclipse “is good enough”.

The second scenario is yours: a relatively small program where the
startup time of the VM is significant in relation to the overall time
the program is running.

Regards,
Roland

Peter H. wrote:

Now the Java version. I will be honest here, I might be paid to program
in Java but it hasn’t been my language of choice since around 1992. I
find it gets in my way and today it found yet another way to do it.

Um, just a nitpick, but Java didn’t exist in 1992. Unless you count Oak.

On 7/28/06, Peter H. wrote:

This is the follow up to my “Write it in C” post and is intended to
report the timings for the Java implementation that I said I would write
for Charles O Nutter and the Ruby version by Simon Kroeger. First let us
deal with the Ruby version.

Was the code for the Java and Ruby versions posted somewhere in
the other thread? (I must admit, I tuned a lot of that thread out once
it became a shouting match about the relative speeds of C and Java.)

Hmm. I left Uni in 1992 and it was around then that my, obviously flaky,
memory says I was reading the O’Reilly Java in a Nutshell.

Digs out the brown book. Oh yes it is dated 1996, what the hell was I
doing for four years?

Good catch.

pat eyler wrote:

Was the code for the Java and Ruby versions posted somewhere in
the other thread? (I must admit, I tuned a lot of that thread out
once it became a shouting match about the relative speeds of C and
Java.)

The Ruby version was posted in the previous thread by Simon. I didn’t
post the Java version because, code wise, it is pretty much a line for
line translation of the C version. But so you can see what the code was
like here is the 3 x 3 version. The 5 x 5 version is just too damn big
for a post, being as it is 5449 lines long!

public class Latin {
    private static int WidthOfBoard = 3;

    private static int NumberOfPermutations = 6;

    private static String[] OutputStrings = {
        "321",
        "231",
        "213",
        "312",
        "132",
        "123"
    };

    private static boolean[][] Compared = new boolean[6][6];

    private static int work[] = { 0, 0, 0 };

    private static void addARow(int row) {
        if (row == WidthOfBoard) {
            for (int x = 0; x < WidthOfBoard; x++) {
                if (x == 0) {
                    System.out.print(OutputStrings[work[x]]);
                } else {
                    System.out.print(":" + OutputStrings[work[x]]);
                }
            }
            System.out.println();
        } else {
            for (int x = 0; x < NumberOfPermutations; x++) {
                work[row] = x;

                boolean is_ok = true;
                if (row != 0) {
                    for (int y = 0; y < row; y++) {
                        if (Compared[work[row]][work[y]] != true) {
                            is_ok = false;
                            break;
                        }
                    }
                }
                if (is_ok == true) {
                    addARow(row + 1);
                }
            }
        }
    }

    public static void main(String[] args) {
        // This nonsense is to get around the fact that Java will not allow
        // me to initialise an array in the declaration.

        Compared[0][2] = true;
        Compared[0][4] = true;
        Compared[1][3] = true;
        Compared[1][5] = true;
        Compared[2][0] = true;
        Compared[2][4] = true;
        Compared[3][1] = true;
        Compared[3][5] = true;
        Compared[4][0] = true;
        Compared[4][2] = true;
        Compared[5][1] = true;
        Compared[5][3] = true;

        addARow(0);
    }
}
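
A possible way around the long run of assignments, offered only as a sketch: the table can be computed when the program starts instead of being written into the source. This assumes that Compared[i][j] is true exactly when permutations i and j differ in every column, which matches the twelve assignments in the 3 x 3 listing above. The class and method names (ComparedTable, permutations, differEverywhere, build) are made up for illustration and are not from the original program.

```java
import java.util.ArrayList;
import java.util.List;

class ComparedTable {
    // Recursively builds every ordering of the characters in 'remaining'.
    static void permute(String prefix, String remaining, List<String> out) {
        if (remaining.isEmpty()) {
            out.add(prefix);
            return;
        }
        for (int i = 0; i < remaining.length(); i++) {
            permute(prefix + remaining.charAt(i),
                    remaining.substring(0, i) + remaining.substring(i + 1),
                    out);
        }
    }

    // All permutations of the digits 1..n (n is assumed to be at most 9).
    static List<String> permutations(int n) {
        StringBuilder digits = new StringBuilder();
        for (int d = 1; d <= n; d++) {
            digits.append(d);
        }
        List<String> out = new ArrayList<>();
        permute("", digits.toString(), out);
        return out;
    }

    // Assumption: two rows may share a square only if no column repeats a digit.
    static boolean differEverywhere(String a, String b) {
        for (int k = 0; k < a.length(); k++) {
            if (a.charAt(k) == b.charAt(k)) {
                return false;
            }
        }
        return true;
    }

    // Fills the compatibility table for every ordered pair of rows.
    static boolean[][] build(List<String> rows) {
        int n = rows.size();
        boolean[][] compared = new boolean[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                compared[i][j] = differEverywhere(rows.get(i), rows.get(j));
            }
        }
        return compared;
    }

    public static void main(String[] args) {
        int n = args.length > 0 ? Integer.parseInt(args[0]) : 3;
        List<String> rows = permutations(n);
        boolean[][] compared = build(rows);
        int pairs = 0;
        for (boolean[] row : compared) {
            for (boolean ok : row) {
                if (ok) {
                    pairs++;
                }
            }
        }
        System.out.println(rows.size() + " permutations, " + pairs + " compatible ordered pairs");
    }
}
```

The permutations come out in a different order from OutputStrings in the listing, so the squares would be printed in a different order too. For a 5 x 5 board this builds a 120 x 120 table at startup with no large initialiser in the class file.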

On 7/28/06, M. Edward (Ed) Borasky wrote:

Peter H. wrote:
[snip]

Now here’s where I’m going to put on my asbestos suit. I think the
difficulty of C development is vastly exaggerated by the fans of
“dynamic/scripting/interpreted” languages! [snip]

So what is the source of “fear of C?”

Well, there’s a number of “shoot yourself in the foot” and “C
pitfalls” type books out there which can give you a few ideas. My
guess is that, often, new C programmers get tripped up regularly on
things like:

  • arrays vs. pointers, extern vs. static, and other possibly tricky
    spots in the language,
  • build issues, like dealing with cryptic makefiles and gcc args (ex.
    passing in -lfoo args in the right order),
  • discipline with conventions on memory management

But I agree with you that it’s not so bad if you use it for what it’s
good at. Maybe what’s happened is, folks have a bad taste in their
mouth from trying to use C to write end-user apps, when it’s really
best at lower-level libs, drivers, and number crunching.

—John

Peter H. wrote:

There you have it, C is still faster by an order of magnitude.
Performance is yours for the asking, but it comes at a price - you
have to write it in C. Ease of development also comes at a price, you
don’t get the same performance as C. Of course if you have a fear of C
this does show that you can go some of the way by converting to Java,
if that is fast enough for you then well and good but know this, C is
faster.

In my younger days, I did a lot of development in assembler languages,
and for many years my main high-level language was FORTRAN. Towards the
end of my FORTRAN days (about 1990) I was still dropping into assembler
for speed, even though the (FORTRAN) compilers were quite good by that
time. C compilers really sucked, especially for numerical applications.

Now here’s where I’m going to put on my asbestos suit. I think the
difficulty of C development is vastly exaggerated by the fans of
“dynamic/scripting/interpreted” languages! In addition, I think the
difficulty of assembler development is vastly exaggerated, except in
bizarre architectures. (Of course, x86 does border on bizarre, until you
get to 64-bit addressing). :-)

So what is the source of “fear of C?”

On Fri, Jul 28, 2006 at 09:47:34PM +0900, Robert D. wrote:

Nevertheless startup time matters for some programs, so Peter’s post was
not useless. I feel he gets criticized a little bit too much for his post.

. . . and timings without JVM startup time are at least as “useless”
since there are similarly “irrelevant” parts of the total time for
completion of C, Ruby, Perl, Python, PHP, and other-language programs
that might be benchmarked. Unless we’re going to eliminate all time
from all benchmarks that isn’t strictly related to execution, we’d
better just admit “defeat” on this one, and include JVM time (especially
since there’s no sane way to eliminate all time not strictly related to
execution in a “true” interpreted language – making any possible Ruby
benchmarks “irrelevant” and “useless” by that argument).

Man, I needed a good laugh today. Where to begin…

On 7/28/06, Peter H. wrote:

This is the follow up to my “Write it in C” post and is intended to
report the timings for the Java implementation that I said I would write
for Charles O Nutter and the Ruby version by Simon Kroeger. First let us
deal with the Ruby version.

You start off right, but it’s quickly apparent you’re setting out to
prove Java claims wrong. You’re starting off with a specific intent.

The program differs from the Perl and C versions in that the various
values it requires are not precomputed. Simon’s program is completely
self contained.

[Latin]$ time ruby latin.rb 5 > r5

real 0m35.793s
user 0m32.081s
sys 0m0.843s

Not bad, really, but not even as good as the bogus Java numbers below.

This quite clearly pisses all over the Perl version, and yes the results
were correct. It is both faster than the Perl version and considerably
less code, a testament to the power and expressiveness of Ruby.

So then Java is obviously more powerful since the bogus numbers are
faster…you can’t draw one conclusion from Ruby numbers and another
conclusion from Java numbers. You’re serving the food before setting the
table.

Now the Java version. I will be honest here, I might be paid to program
in Java but it hasn’t been my language of choice since around 1992. I
find it gets in my way and today it found yet another way to do it.

A straight translation like the C version worked fine for a 4 x 4 grid
but when I got to the 5 x 5 grid I got the following error ‘code too
large’. Yes Java has hard coded limits as to the allowed size of various
data structures within class files and the Compared array of 120 x 120
boolean values could not be initialised with the following code:

As another posted, Java hasn’t been around since 1992, so I think
perhaps you’re mistaken.

private static boolean[][] Compared = {

).
The last time that I had to work round such mind numbingly arbitrary
limits was when I was programming Quick Basic. Now the timings.

First off, I call Troll.

Second, you’re not a very good Java programmer if you didn’t know about
this limit. Perhaps they didn’t teach you this in Java class in 1992? (a
response troll, admittedly)

The limit is not arbitrary; it’s to allow the JVM to maintain certain
constraints over the memory used by incoming class definitions, since
they’re typically not garbage collected. It would not be advisable to
allow loading an extremely large class definition into permanent memory
space, eating up the entirety of the heap. Put your gigantic data in a
separate file and load it at runtime.
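
A rough sketch of what loading the table at runtime might look like. The file format here (one line per row, a '0' or '1' character per column) and the names ComparedLoader and load are assumptions made for the example; nothing in the thread says how the data would actually be stored.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

class ComparedLoader {
    // Hypothetical format: one text line per row, '1' for true and '0' for false.
    static boolean[][] load(Path file) throws IOException {
        List<String> lines = Files.readAllLines(file);
        boolean[][] table = new boolean[lines.size()][];
        for (int r = 0; r < lines.size(); r++) {
            String line = lines.get(r).trim();
            table[r] = new boolean[line.length()];
            for (int c = 0; c < line.length(); c++) {
                table[r][c] = line.charAt(c) == '1';
            }
        }
        return table;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ComparedLoader <table-file>");
            return;
        }
        boolean[][] compared = load(Paths.get(args[0]));
        System.out.println("Loaded " + compared.length + " rows");
    }
}
```

Because the data lives outside the class file, the size of the table no longer affects how large any method in the class is.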

[Latin]$ time ./j_version.sh 5 > j5

real 0m29.553s
user 0m13.813s
sys 0m10.745s

Sorry Java fans but “as fast as C” or “faster than C” it is not. It’s
only a bit faster than Ruby despite far more resources being dedicated
to speeding it up.

Startup time is and always has been a concern with Java apps, which is
why their area of choice is primarily long-running server-side
applications or somewhat less-long-running desktop applications. For
example, would you benchmark the speed of Excel’s calculation algorithms
from the time you start it up until you’d entered the numbers in and
told it to calculate? To do so would be absurd. If you want to put
languages on a level playing field you must remove limitations that each
incurs for different reasons than the others. I’d also remove any Ruby
load/parse time before running any benchmark, since that’s skewing
numbers too. Are we benchmarking the performance of the language
implementation or benchmarking how fast we can load Ruby’s couple
hundred k of executable data versus Java’s many megabytes of base
platform code? Compare apples to apples, man, and just benchmark the
algorithm.

The really odd thing here is that Java should actually be much faster
than this. I did manage to get the 4 x 4 grid to be written with the
same initialisation method as the C version and the timings (admittedly
on a much smaller problem) were much closer to the C version for the
same 4 x 4 grid. The solution just didn’t scale because of the 64KB
limit in the class files, which is probably not going to change any
time in the near future.

No, it’s not. You shouldn’t stuff data into your class files. Class
files are for code.

In the interest of fairness I also looked at the timings of just the
execution of the C and Java versions so that the performance of the
compilers was not impacting the times. So here are the C and Java
versions without the precomputing phase and without the compiling.

[Latin]$ time ./latin > /dev/null 2>&1

real 0m1.961s
user 0m1.680s
sys 0m0.051s

I’m actually surprised C wasn’t even faster here.

[Latin]$ time java Latin > /dev/null 2>&1

real 0m15.483s
user 0m9.641s
sys 0m4.280s

Some versions of Java have taken as much as 15 seconds to start up on
certain platforms, and the startup time on Linux is frequently slower
than on other platforms. Java 5 on Windows takes perhaps a second to
start up now, primarily because they do use a shared-memory cache of
much of the static data loaded at startup. Of course, it’s not a startup
cost of zero, but people simply don’t use Java for command-line tools.

There you have it, C is still faster by an order of magnitude.
Performance is yours for the asking, but it comes at a price - you have
to write it in C. Ease of development also comes at a price, you don’t
get the same performance as C. Of course if you have a fear of C this
does show that you can go some of the way by converting to Java, if that
is fast enough for you then well and good but know this, C is faster.

I have no fear of C. I have fear of making C work everywhere, which I do
not have to worry about with either Ruby or Java. I also have a fear of
C fanboys giving up on improving Ruby and always advising that people
drop to C for their problems.

Kroeger, Simon (ext) wrote:

machine.

This may seem like a reasonable assumption, but it really isn’t that
simple. Read up on how the JVM/Hotspot works, it’s interesting stuff
really.

Isak

Chad P. wrote:

completion of C, Ruby, Perl, Python, PHP, and other-language programs
that might be benchmarked. Unless we’re going to eliminate all time
from all benchmarks that isn’t strictly related to execution, we’d
better just admit “defeat” on this one, and include JVM time (especially
since there’s no sane way to eliminate all time not strictly related to
execution in a “true” interpreted language – making any possible Ruby
benchmarks “irrelevant” and “useless” by that argument).

Uh, you start your interpreter/JVM/whatever, get everything started,
then time an operation. It’s certainly not “impossible”.
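
A sketch of that approach in Java, assuming a stand-in workload in place of the real solver: it runs the same operation several times inside one already-started JVM and reports each run separately, so any difference between the first run and later ones shows up in the numbers. WarmBenchmark, workload and timeRuns are hypothetical names.

```java
class WarmBenchmark {
    // Written to after every run so the JIT cannot discard the workload as dead code.
    static volatile long sink;

    // Stand-in workload (hypothetical), not the Latin squares solver.
    static long workload() {
        long sum = 0;
        for (int i = 0; i < 10_000_000; i++) {
            sum += i % 7;
        }
        return sum;
    }

    // Times 'runs' consecutive executions in the same JVM, in nanoseconds per run.
    static long[] timeRuns(int runs) {
        long[] nanos = new long[runs];
        for (int r = 0; r < runs; r++) {
            long start = System.nanoTime();
            sink = workload();
            nanos[r] = System.nanoTime() - start;
        }
        return nanos;
    }

    public static void main(String[] args) {
        long[] nanos = timeRuns(5);
        for (int r = 0; r < nanos.length; r++) {
            System.out.printf("run %d: %.3f ms%n", r + 1, nanos[r] / 1_000_000.0);
        }
    }
}
```

Whether the first run or the later ones is the fair number to quote is exactly what the thread is arguing about; the sketch only makes both visible.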