Forum: Ruby - For performance, write it in C

Announcement (2017-05-07): www.ruby-forum.com is now read-only since I unfortunately do not have the time to support and maintain the forum any more. Please see rubyonrails.org/community and ruby-lang.org/en/community for other Rails- and Ruby-related community platforms.
Peter Hickman (Guest)
on 2006-07-26 10:50
(Received via mailing list)
Whenever the question of performance comes up with scripting languages
such as Ruby, Perl or Python there will be people whose response can be
summarised as "Write it in C". I am one such person. Some people take
offence at this and label us trolls or heretics of the true programming
language (take your pick).

I am assuming here that when people talk about performance they really
mean speed. Some will disagree but this is what I am talking about.

In this post I want to clear some things up and provide benchmarks as to
why you should take "Write it in C" seriously. Personally I like to
write my projects in Ruby or Perl or Python and then convert them to C
to get a performance boost. How much of a boost? Well here I will give
you some hard data and source code so that you can see just how much of
a boost C can give you.

The mini project in question is to generate all the valid Latin squares.
A Latin square is a grid of numbers (let's say a 9 x 9 grid) in which
each of the numbers 1 to 9 appears exactly once in every row and every
column. Sudoku grids are a subset of Latin squares.
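The definition above can be checked mechanically. Here is a minimal Ruby sketch (my illustration, not code from the post) that tests whether a grid is a Latin square:

```ruby
# Check whether a grid (an array of rows) is a Latin square:
# every row and every column must contain each of 1..n exactly once.
def latin_square?(grid)
  n = grid.size
  expected = (1..n).to_a
  rows_ok = grid.all? { |row| row.sort == expected }
  cols_ok = grid.transpose.all? { |col| col.sort == expected }
  rows_ok && cols_ok
end

puts latin_square?([[1, 2, 3], [2, 3, 1], [3, 1, 2]])  # => true
puts latin_square?([[1, 2, 3], [2, 1, 3], [3, 1, 2]])  # => false (column 2 repeats 1)
```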

The approach taken is to create a list of all the permutations and then
build up a grid row by row, checking that the newly added row does not
conflict with any of the previous rows. If the final row can be added
without problems the solution is printed and the search for the next one
starts. It is, in essence, a depth-first search. The first version of
the program that I wrote in Perl took 473 minutes to generate all the
valid 5 x 5 Latin squares; version four of the program took 12 minutes
and 51 seconds. The C version of the program took 5.5 seconds to produce
identical results. All were run on the same hardware.

[Latin]$ time ./Latin1.pl 5 > x5

real    473m45.370s
user    248m59.752s
sys     2m54.598s

[Latin]$ time ./Latin4.pl 5 > x5

real    12m51.178s
user    12m14.066s
sys     0m7.101s

[Latin]$ time ./c_version.sh 5

real    0m5.519s
user    0m4.585s
sys     0m0.691s

This is what I mean when I say that coding in C will improve the
performance of your program. The improvement goes beyond percentages; it
is measured in orders of magnitude. I think that the effort is worth it.
If a 5 x 5 grid with 120 permutations took 12 minutes in Perl, how long
would a 6 x 6 grid with 720 permutations take? What unit of measure
would you be using for a 9 x 9 grid?

Size   Permutations
====   ============
   1              1
   2              2
   3              6
   4             24
   5            120
   6            720
   7           5040
   8          40320
   9         362880
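The counts in this table are simply n! (factorial), which is why the search space explodes so quickly. A couple of lines of Ruby (my illustration, not from the post) reproduces it:

```ruby
# Each board size n has n! permutations of the digits 1..n,
# so the candidate rows grow factorially with the board size.
(1..9).each do |n|
  count = (1..n).reduce(:*)
  printf("%4d  %9d\n", n, count)
end
```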

Now let's look at the first version of the code:

     1  #!/usr/bin/perl -w

     2  use strict;
     3  use warnings;

     4  use Algorithm::Permute;

     5  my $width_of_board = shift;

     6  my @permutations;

     7  my $p = new Algorithm::Permute( [ 1 .. $width_of_board ] );

     8  while ( my @res = $p->next ) {
     9      push @permutations, [@res];
    10  }
    11  my $number_of_permutations = scalar(@permutations);

    12  for ( my $x = 0 ; $x < $number_of_permutations ; $x++ ) {
    13      add_a_line($x);
    14  }

Lines 1 to 11 just build up a list of all the permutations using the
handy Algorithm::Permute module from CPAN. Lines 12 to 14 start on the
first row of the solution by trying out all possible permutations for
the first row.

    15  sub add_a_line {
    16      my @lines = @_;

    17      my $size = scalar(@lines);

    18      my $ok = 1;
    19      for ( my $x = 0 ; $x < $size ; $x++ ) {
    20          for ( my $y = 0 ; $y < $size ; $y++ ) {
    21              if ( $x != $y ) {
    22                  $ok = 0 unless compare( $lines[$x], $lines[$y] );
    23              }
    24          }
    25      }

    26      if ($ok) {
    27          if ( $size == $width_of_board ) {
    28              print join(':', map { p($_) } @lines) . "\n";
    29          }
    30          else {
    31              for ( my $x = 0 ; $x < $number_of_permutations ; $x++ ) {
    32                  add_a_line( @lines, $x );
    33              }
    34          }
    35      }
    36  }

The add_a_line() function first checks that none of the lines so far
conflict (have the same digit in the same column). If they pass and the
number of lines equals the size of the board then the result is printed
and the search moves on to the next solution. Otherwise another line is
added and add_a_line() is called recursively.

Here is the function that tells if two lines conflict.

    37  sub compare {
    38      my ( $a, $b ) = @_;

    39      my $ok = 1;

    40      my @aa = @{ $permutations[$a] };
    41      my @bb = @{ $permutations[$b] };

    42      for ( my $x = 0 ; $x < $width_of_board ; $x++ ) {
    43          $ok = 0 if $aa[$x] == $bb[$x];
    44      }

    45      return $ok == 1;
    46  }

The p() function is a little utility to convert a list into a string for
display.

    47  sub p {
    48      my ($x) = @_;

    49      my @a = @{ $permutations[$x] };
    50      my $y = join( '', @a );

    51      return $y;
    52  }
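Before moving on: for comparison with the Perl above, the whole brute-force, row-by-row search fits in a few lines of modern Ruby using Array#permutation. This is my sketch of the algorithm just described, not code from the thread:

```ruby
# Depth-first search for Latin squares, mirroring the Perl version:
# build the grid row by row from the list of permutations, rejecting
# any candidate row that repeats a digit in a column used above it.
def latin_squares(n, &blk)
  perms = (1..n).to_a.permutation.to_a
  add = lambda do |acc|
    if acc.size == n
      blk.call(acc)  # a complete, valid square
    else
      perms.each do |p|
        # the new row conflicts if it shares a digit+column with any earlier row
        ok = acc.all? { |r| r.zip(p).none? { |a, b| a == b } }
        add.call(acc + [p]) if ok
      end
    end
  end
  add.call([])
end

count = 0
latin_squares(4) { count += 1 }
puts count  # => 576, the number of 4 x 4 Latin squares
```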

Well, I have just exposed some pretty crap code to eternal ridicule on
the internet, but there you have it. The code is crap; even non-Perl
programmers will be able to point out its deficiencies. It works, even
though a 5 x 5 grid took 473 minutes to run. Let's try to salvage some
pride, show version four, and see how we managed to speed things up.

     1  #!/usr/bin/perl -w

     2  use strict;
     3  use warnings;

     4  use Algorithm::Permute;

     5  my $width_of_board = shift;

     6  my @permutations;
     7  my @output;
     8  my %compared;

     9  my $p = new Algorithm::Permute( [ 1 .. $width_of_board ] );

    10  while ( my @res = $p->next ) {
    11      push @permutations, [@res];
    12      push @output, join( '', @res );
    13  }
    14  my $number_of_permutations = scalar(@permutations);

Lines 1 to 14 are doing pretty much what version one was doing, except
that a new list, @output, is built up to precalculate the output strings
and remove the need for the p() function. A minor speed-up, but useful.

    15  for ( my $x = 0 ; $x < $number_of_permutations ; $x++ ) {
    16      for ( my $y = 0 ; $y < $number_of_permutations ; $y++ ) {
    17          my $ok = 1;

    18          my @aa = @{ $permutations[$x] };
    19          my @bb = @{ $permutations[$y] };

    20          for ( my $z = 0 ; $z < $width_of_board ; $z++ ) {
    21              if ( $aa[$z] == $bb[$z] ) {
    22                  $ok = 0;
    23                  last;
    24              }
    25          }

    26          if ( $ok == 1 ) {
    27              $compared{"$x:$y"} = 1;
    28          }
    29      }
    30  }

Lines 15 to 30 introduce new code to precalculate the comparisons and
feed the results into a hash. Lines 31 to 33 start the work in the same
way as version one.

    31  for ( my $x = 0 ; $x < $number_of_permutations ; $x++ ) {
    32      add_a_line($x);
    33  }

And now to the improved add_a_line() function. The code has been
improved to check only that the newly added line does not conflict with
any of the existing lines, rather than repeatedly comparing the existing
(valid) lines.

    34  sub add_a_line {
    35      my @lines = @_;

    36      my $size = scalar(@lines);

    37      my $ok = 1;

    38      if ( $size > 1 ) {
    39          for ( my $x = 0 ; $x < ( $size - 1 ) ; $x++ ) {
    40          unless ( defined $compared{ $lines[$x] .':'. $lines[-1] } ) {
    41                  $ok = 0;
    42                  last;
    43              }
    44          }
    45      }

    46      if ($ok) {
    47          if ( $size == $width_of_board ) {
    48              print join( ':', map { $output[$_] } @lines ) . "\n";
    49          }
    50          else {
    51              for ( my $x = 0 ; $x < $number_of_permutations ; $x++ ) {
    52                  add_a_line( @lines, $x );
    53              }
    54          }
    55      }
    56  }

These changes took us down from 473 minutes to just 12. The elimination
of unnecessary comparisons in add_a_line() helped, as did the
precalculation of those comparisons. There are lessons to be learnt
here: write decent code and cache repetitive comparisons. There are no
great tricks, just that bad code can cost you dearly and simple things
can bring big improvements. So with such a massive improvement, how
could we make our code any faster?

Write it in C.

Having learnt the lessons developing the code in Perl, I am not going to
start the whole thing over in C. Instead I used the precalculation phase
of the version four Perl script to write out a C header file with data
structures that would be useful to the C program.

     1  #define WIDTH_OF_BOARD 5
     2  #define NUMBER_OF_PERMUTATIONS 120
     3  char *output_strings[] = {
     4      "54321",

   123      "12345",
   124  };
   125  bool compared[NUMBER_OF_PERMUTATIONS][NUMBER_OF_PERMUTATIONS] = {
   126      {false, false, ...

   245      {false, false, ...
   246  };
   247  int work[WIDTH_OF_BOARD];
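The post shows only the generated header, not the generator itself. As a rough illustration of the idea (the macro and array names are taken from the Latin.h excerpt above; the generator logic is my assumption, sketched in Ruby rather than Perl):

```ruby
# Sketch: precompute the permutation strings and the row-compatibility
# table, then emit them as a C header of the shape shown in the post.
n = 4
perms = (1..n).to_a.permutation.to_a

# compat[a][b] is true when permutations a and b can share a board,
# i.e. they never put the same digit in the same column.
compat = perms.map { |a| perms.map { |b| a.zip(b).none? { |x, y| x == y } } }

header = +""
header << "#define WIDTH_OF_BOARD #{n}\n"
header << "#define NUMBER_OF_PERMUTATIONS #{perms.size}\n"
header << "char *output_strings[] = {\n"
perms.each { |p| header << "    \"#{p.join}\",\n" }
header << "};\n"
header << "bool compared[NUMBER_OF_PERMUTATIONS][NUMBER_OF_PERMUTATIONS] = {\n"
compat.each { |row| header << "    {#{row.join(', ')}},\n" }
header << "};\n"
header << "int work[WIDTH_OF_BOARD];\n"

puts header
# File.write("Latin.h", header) would then feed the C build step.
```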

This then leaves the C code itself. Lines 1 to 7 include a load of
useful stuff; in fact they probably include some quite unnecessary stuff
too, as I just cut and pasted them from another project.

     1  #include <stdio.h>
     2  #include <stdlib.h>
     3  #include <stdbool.h>
     4  #include <err.h>
     5  #include <string.h>
     6  #include <unistd.h>
     7  #include <sys/types.h>

Line 8 is the header file that Perl precalculated.

     8  #include "Latin.h"

Now the meat. The code is pretty much the same as version four, just
adapted to C. No special C tricks, no weird pointer stuff, an almost
line-for-line translation of the Perl code.

     9  void
    10  add_a_row(int row)
    11  {
    12      bool            is_ok;
    13      int             x,y;

    14      if (row == WIDTH_OF_BOARD) {
    15          for (x = 0; x < WIDTH_OF_BOARD; x++) {
    16              if (x == 0) {
    17                  printf("%s", output_strings[work[x]]);
    18              } else {
    19                  printf(":%s", output_strings[work[x]]);
    20              }
    21          }
    22          puts("");
    23      } else {
    24          for (x = 0; x < NUMBER_OF_PERMUTATIONS; x++) {
    25              work[row] = x;

    26              is_ok = true;
    27              if (row != 0) {
    28                  for( y = 0; y < row; y++ ) {
    29                      if(compared[work[row]][work[y]] == false) {
    30                          is_ok = false;
    31                          break;
    32                      }
    33                  }
    34              }
    35              if (is_ok == true) {
    36                  add_a_row(row + 1);
    37              }
    38          }
    39      }
    40  }

    41  int
    42  main(int argc, char *argv[])
    43  {
    44      add_a_row(0);
    45  }

And the C version ran in 5.5 seconds. In fact the 5.5 seconds includes
the Perl program that does all the precalculation to write the Latin.h
header file, the compilation of the C source, and finally the running of
the program itself. So we have not cheated by doing some of the work
outside the timings.

Just think of it, 12 minutes down to 5.5 seconds without having to write
any incomprehensible C code. Because we all know that C code is
completely incomprehensible with it doing all that weird pointer stuff
all the time.

Now the Perl code could be improved; there are tricks that could be
pulled out of the bag to trim something off the 12 minutes. Perhaps
another language would be faster? But how far could you close the gap
between 12 minutes and 5.5 seconds?

Just to up the ante I added -fast -mcpu=7450 to the compiler flags (gcc
optimized for speed on a G4 Macintosh) and ran it again.

[Latin]$ time ./c_version.sh 5 > x5

real    0m3.986s
user    0m2.810s
sys     0m0.540s

Another 30% performance improvement without changing any code.

Let's review the languages we have used and their advantages. C is very
fast without any stupid tricks. C will also give you better control over
the amount of memory you use (the Perl code eats up massive amounts of
memory in comparison; the 9 x 9 grid threw an out-of-memory error on my
1GB machine).

It is much easier to develop in Perl. Any error message you get is
likely to at least give you a clue as to what the problem might be. C
programmers have to put up with the likes of 'Bus error' or
'Segmentation fault', which is why C programmers are grouches. Perl also
allows you to significantly improve your code without major rewrites.
There is a module called Memoize that can wrap a function and cache its
calls, all by adding just two extra lines to your code; the same is true
for most scripting languages.
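Ruby has the same trick available: a Hash with a default block gives you a memoizing cache in a couple of lines. A generic sketch (slow_double is a made-up stand-in for an expensive function, not something from the post):

```ruby
# A Hash whose default block computes and stores each missing
# result on first access, so repeat calls hit the cache.
def slow_double(n)
  sleep 0.01  # stand-in for an expensive computation
  n * 2
end

CACHE = Hash.new { |h, key| h[key] = slow_double(key) }

CACHE[21]       # computed on first access
puts CACHE[21]  # => 42, served from the cache
```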

So what am I recommending here: write all your programs in C? No. Write
all your programs in Perl? No. Write them in your favourite scripting
language to refine the code, and then translate them into C if the
performance falls short of your requirements. Even if you intend to
write it in C all along, hacking the code in Perl first allows you to
play with the algorithm without having to worry about memory allocation
and other such C-style housekeeping. Good code is good code in any
language.

If you really really want that performance boost then take the following
advice very seriously - "Write it in C".
Dr Nic (nicwilliams)
on 2006-07-26 11:02
Do you have any preferred tutorials on wrapping Ruby around C libraries?
Pit Capitain (Guest)
on 2006-07-26 11:23
(Received via mailing list)
Peter Hickman schrieb:
> (Example of Perl and C Code)

Peter, is there any chance you could test your program with Ruby Inline?

   http://rubyforge.org/projects/rubyinline

I'm on Windows, so I can't use Ruby Inline (+1 for MinGW btw :-)

Regards,
Pit
unknown (Guest)
on 2006-07-26 11:41
(Received via mailing list)
Peter Hickman gave a very good article about prototyping in a scripting
language, and then re-coding in c:

*snip*

> If you really really want that performance boost then take the following
> advice very seriously - "Write it in C".

I totally agree that with the current state of the art, this is the
right approach.

Maybe it doesn't need saying, but I'm going to... in the vast majority
of applications, almost all of the run time is spent in a tiny fraction
of the code. That is the part I would write in C (or C++). The vast
majority of the code (and it's not there just for fun, it's still
completely important to the application) will use a vanishingly small
fraction of the processor time. That is the bit I would probably leave
in Ruby.

People talk about the 80:20 principle, but in my experience it's much
more like 99:1 for applications: 99% of the code uses 1% of the run
time, and 1% of the code consumes 99% of the run time. That could just
be the signal-processing and graphics-heavy applications that I have
experienced, though.

Thanks for the comparison, it was great. And thanks for the very nice
pre-generation of look up tables in perl idea. Nice.

Cheers,
  Benjohn
unknown (Guest)
on 2006-07-26 11:48
(Received via mailing list)
It might be interesting to see how Java fares too - another route
again.

Pete
Tomasz Wegrzanowski (Guest)
on 2006-07-26 11:55
(Received via mailing list)
On 7/26/06, benjohn@fysh.org <benjohn@fysh.org> wrote:
> Peter Hickman gave a very good article about prototyping in a scripting
> language, and then re-coding in c:
>
> *snip*
>
> > If you really really want that performance boost then take the following
> > advice very seriously - "Write it in C".
>
> I totally agree that with the current state of the art, this is the
> right approach.

Sorry, I just couldn't resist - but maybe you should code Java instead -
http://kano.net/javabench/ ;-)
Robert Dober (Guest)
on 2006-07-26 12:13
(Received via mailing list)
On 7/26/06, Tomasz Wegrzanowski <tomasz.wegrzanowski@gmail.com> wrote:
> >
> > I totally agree that with the current state of the art, this is the
> > right approach.
>
> Sorry, I just couldn't resist - but maybe you should code Java instead -
> http://kano.net/javabench/ ;-)
>
Of course not, but it might be of interest that there are other
programming languages out there: Ada, Objective-C, Scheme, OCaml,
Haskell, you name it.
That said, I agree that one can write code beautifully in C (see GTK
for example). I do not think that people are treated badly when they say

   "Hmmm, maybe you should know that this kind of performance is not
possible in Ruby or even slightly faster interpreted languages, and that
you
should consider writing part of it in C, it is not so difficult, have a
look
here or there"

as opposed to those who write

  "Write it in C if you want speed"

That kind of post never gets accepted well, and I have the feeling that
when the speed issue comes up there is a certain tendency to reply like
that. On the other hand, I agree that the speed issue comes up much too
often, but I suggest that instead of replying "Write it in C if you want
speed", one can ignore the post or point to older posts.

Just my 0.02€.

Cheers
Robert


--
Two things are infinite: the universe and human stupidity; and as for
the universe, I have not acquired absolute certainty of it.

- Albert Einstein
Jay Levitt (Guest)
on 2006-07-26 13:32
(Received via mailing list)
On Wed, 26 Jul 2006 17:47:13 +0900, Peter Hickman wrote:

> In this post I want to clear some things up and provide benchmarks as to
> why you should take "Write it in C" seriously.

This is a great post, and should at least be posted to a blog somewhere
so the masses who don't know about USENET can still find it on Google!

Jay
Peter Hickman (Guest)
on 2006-07-26 13:47
(Received via mailing list)
Robert Dober wrote:
>
Tact has never been one of my strong points. Your phrasing was much
nicer, I will agree. The thing that has been bugging me with this whole
performance debate is that I am sure that many people do not realise
just how fast a program written in C can run. The people who seem to
take offence at being told to write their code in C seem to have no
significant experience of using C. What also seems to happen is that
after saying that performance is their number 1 absolute top priority
they start to backpedal, saying how hard C code is to develop. Yes, it
is harder to work with than a scripting language (it is all the rope you
need to hang yourself with), but you can write some blindingly fast code
if you are prepared to put in the effort. Wasn't that what they said was
their number 1 absolute top priority?
Peter Hickman (Guest)
on 2006-07-26 13:51
(Received via mailing list)
Jay Levitt wrote:
> Jay

I may well put it on my web site, along with all the source code. Google
and Yahoo hit it enough.
Leslie Viljoen (Guest)
on 2006-07-26 14:28
(Received via mailing list)
On 7/26/06, Peter Hickman <peter@semantico.com> wrote:
> Jay Levitt wrote:
> > On Wed, 26 Jul 2006 17:47:13 +0900, Peter Hickman wrote:
> >
> >
> >> In this post I want to clear some things up and provide benchmarks as to
> >> why you should take "Write it in C" seriously.

Something else to consider is the ease with which Ruby extensions can
be written in C. The first time I tried, I had something running in 20
minutes.

Though if I was going to choose a (single) language for raw
performance I'd try to go with Pascal or Ada.


Les
Francis Cianfrocca (blackhedd)
on 2006-07-26 14:56
(Received via mailing list)
One important thing about Ruby performance is that Ruby gets very
inefficient very quickly as the size of the working set grows. This
manifests as an observed performance problem, but the tip-off is that
performance of a particular Ruby program is quite acceptable for small
data-set sizes (such as in your test fixtures), but falls off a cliff
with production data-sets. Most of the times that I have found myself
rewriting slow Ruby code in C have been to get a program that runs in
far less memory, by manually laying out the data structures. This
improves performance not only for the obvious reasons (less thrashing
and less contention with other processes on a busy server), but also
because you can "help" the processor's cache manager to get more hits
by creating localities of reference and by stuffing more of your working
set into a smaller number of cache lines. This matters all the more on
multiprocessors, which are severely penalized by cache misses, to the
point that some naively-written programs can run noticeably faster on a
uniprocessor.
Charles O Nutter (Guest)
on 2006-07-26 15:59
(Received via mailing list)
I'll lob a couple of grenades and then duck for cover.

- Write it in C is as valid as write it in Java (as someone else
mentioned). Java is at least as fast as C for most algorithms. On the
JRuby project, when we need to clone a C extension or library from Ruby,
we typically write it first in Ruby (to get something running quickly)
and then reimplement it in Java (for performance). The sad truth beyond
the numbers and quantitative comparisons is that a dynamically typed
language where everything is an object is almost certainly going to be
slower than a statically typed one with primitives. Toss out your
favorite super-fast dyntyped language if you will... I'm speaking in
generalities.
- Write it in C is also as dangerous (in the grand scheme of things) as
write it in Java, because it has the potential to tie you to a specific
platform. For better or for worse, there are a lot of Windows machines
out there, and practically none of them have compilers. That means
anything you write in C will have to be pre-built for those platforms,
or users are SOL. Java suffers from the same fate (a Java VM is
required), but if you're running JRuby you have probably already made
that plunge. There's a balance to be struck between tying yourself to
fast native code or accepting slow interpreted code. You pay a very
specific price either way.

All this said, there's truth to the idea that we shouldn't *have* to
write platform-level code to get reasonable performance, and every
effort should be made to improve the speed of Ruby code as near as
possible to that of the underlying platform code. On JRuby, we are
working on various compiler designs to turn Ruby code directly into Java
bytecodes and methods, cutting out the overhead of node-by-node
interpretation. Early returns show as high as 65% gains for some
algorithms. We're also looking at modifications to the parsing process
to structure the AST in ways that make it easier to interpret. We in
particular have to compete with a vast world of Java code and Java
coders, and although we'll never match the performance of raw Java we
endeavor to get as close as possible.
James Gray (bbazzarrakk)
on 2006-07-26 16:05
(Received via mailing list)
On Jul 26, 2006, at 8:57 AM, Charles O Nutter wrote:

> - Write it in C is as valid as write it in Java (as someone else
> mentioned).
> Java is at least as fast as C for most algorithms.

I'm Java Certified and I've been hearing people say this for years,
but I just haven't experienced this myself.  You guys must develop on
much faster boxes than my MacBook Pro.  ;)

James Edward Gray II
Peter Hickman (Guest)
on 2006-07-26 16:14
(Received via mailing list)
Charles O Nutter wrote:
> I'll lob a couple of grenades and then duck for cover.
>
> - Write it in C is as valid as write it in Java (as someone else
> mentioned).
> Java is at least as fast as C for most algorithms.

As someone who is paid to program in Java I very seriously doubt this.
However I will write a Java version of the code and time it. It should
be interesting to say the least.

> All this said, there's truth to the idea that we shouldn't *have* to
> write platform-level code to get reasonable performance, and every
> effort should be made to improve the speed of Ruby code as near as
> possible to that of the underlying platform code.

We are talking about two different things here. I was talking about
performance as being the number 1 absolute top priority; you are talking
about 'reasonable performance'. As far as I am concerned, for those
scripts that I don't convert to C, Perl, Ruby and Python are fast
enough. Other people think that they are not; they seem to expect the
sort of performance C gives when they write things in a scripting
language. I think that they are barking.
Kroeger, Simon (ext) (Guest)
on 2006-07-26 16:21
(Received via mailing list)
Hi Peter!

> Whenever the question of performance comes up with scripting
> languages
> such as Ruby, Perl or Python there will be people whose
> response can be
> summarised as "Write it in C". I am one such person. Some people take
> offence at this and label us trolls or heretics of the true
> programming
> language (take your pick).

The last (and only) time I called someone a troll for saying
'Write it in C' it was in response to a Rails-related question.
Further, the OP asked for configuration items and such, but maybe
that's a whole other story. (And of course you can write
C extensions for Rails... yeah, yadda, yadda :) )

...snip 52 lines Perl, some hundred lines C ...

> sys     0m7.101s
>
> [Latin]$ time ./c_version.sh 5
>
> real    0m5.519s
> user    0m4.585s
> sys     0m0.691s

Just to show the beauty of Ruby:
-----------------------------------------------------------
require 'rubygems'
require 'permutation'
require 'set'

$size = (ARGV.shift || 5).to_i

$perms = Permutation.new($size).map{|p| p.value}
$out = $perms.map{|p| p.map{|v| v+1}.join}
$filter = $perms.map do |p|
  s = SortedSet.new
  $perms.each_with_index do |o, i|
    o.each_with_index {|v, j| s.add(i) if p[j] == v}
  end && s.to_a
end

$latins = []
def search lines, possibs
  return $latins << lines if lines.size == $size
  possibs.each do |p|
    search lines + [p], (possibs -
$filter[p]).subtract(lines.last.to_i..p)
  end
end

search [], SortedSet[*(0...$perms.size)]

$latins.each do |latin|
  $perms.each do |perm|
    perm.each{|p| puts $out[latin[p]]}
    puts
  end
end
-----------------------------------------------------------
(does someone have a nicer/even faster version?)

Would you please run that on your machine? Perhaps you have to do a
"gem install permutation" first. (No, I don't think it's faster than
your C code, but it should beat the Perl version.)

> If you really really want that performance boost then take
> the following
> advice very seriously - "Write it in C".

Agreed, 100%. For those who want speed, speed and nothing else, there
is hardly a better way.

thanks

Simon
Charles O Nutter (Guest)
on 2006-07-26 16:21
(Received via mailing list)
On 7/26/06, James Edward Gray II <james@grayproductions.net> wrote:
>
> On Jul 26, 2006, at 8:57 AM, Charles O Nutter wrote:
>
> > - Write it in C is as valid as write it in Java (as someone else
> > mentioned).
> > Java is at least as fast as C for most algorithms.
>
> I'm Java Certified and I've been hearing people say this for years,
> but I just haven't experienced this myself.  You guys must develop on
> much faster boxes than my MacBook Pro.  ;)


Well la-dee-dah! Seriously though, there's a volume of tests and
benchmarks (for what they're worth) that show this to be true, and for
memory-intensive situations Java almost always wins because lazy memory
management allows work to get done first, faster. Granted, Java's
abstractions make it easier to write bad code... but that's true of
every language, including C.

- Charles Oliver Nutter, CERTIFIED Java Developer
Francis Cianfrocca (blackhedd)
on 2006-07-26 16:23
Peter Hickman wrote:
 > We are talking about two different things here. I was talking about
> performance as being the number 1 absolute top priority, you are talking
> about 'reasonable performance'. As far as I am concerned for those
> scripts that I don't convert to C Perl, Ruby and Python are fast enough.
> Other people think that they are not, they seem to expect the sort of
> performance C gives when they write things in a scripting language. I
> think that they are barking.

I quite agree with this. To Nutter's point, one can make one's own
choice between C and Java once the decision has been made to write
platform-level code. But most code, perhaps nearly all code, should stay
in dyntyped script, in order to optimize the development/runtime
cost-balance. I think you can get tremendous benefits from factoring
code cleverly enough to keep the native-code components as small as
possible. And this is often a nontrivial exercise because it depends on
a good understanding of where performance costs come from in any
particular program.

To the point about Java: as I mentioned upthread, working-set size is
often the key limiting factor in Ruby performance. On a large and busy
server (which is my target environment most of the time), Java can be a
very inappropriate choice for the same reason!
Charles O Nutter (Guest)
on 2006-07-26 16:27
(Received via mailing list)
On 7/26/06, Peter Hickman <peter@semantico.com> wrote:
> be interesting to say the least.
Doubt all you like.

> scripts that I don't convert to C Perl, Ruby and Python are fast enough.
> Other people think that they are not, they seem to expect the sort of
> performance C gives when they write things in a scripting language. I
> think that they are barking.


They may be hoping for the impossible, but that doesn't mean they
shouldn't hope, and that doesn't mean they shouldn't be frustrated when
the stock answer is "write it in C". The fact that Ruby and other
dynamic languages are not as fast as compiled C is not the language's
fault or the user's fault... it's an implementation flaw. It certainly
may be a flaw that can't be fixed, a problem impossible to solve, but
there's nothing about a language's design that should necessitate it
being slower than any other language. Perhaps we haven't found the right
way to implement these languages, or perhaps some of us have and others
just aren't there yet. Either way, it's not the language that's
eventually the problem... it's simply the distance from what the
underlying platform wants to run that's an issue. C is, as they put it,
closer to "bare metal", but only because C is little more than a set of
macros on top of assembly code. If the underlying processor ran YARV
bytecodes, I doubt Ruby performance would be a concern.
Ryan McGovern (Guest)
on 2006-07-26 16:31
(Received via mailing list)
Charles O Nutter wrote:
> - Charles Oliver Nutter, CERTIFIED Java Developer
>
I don't doubt that for simple applications and algorithms Java is
nearly as fast as C, if not equivalent. Though for larger Java projects
such as Eclipse, I've had a horrible time of it being slow and
cumbersome on the system, while Visual Studio will run fine and be far
more responsive. I don't really know why that is; it could be as simple
as some bad code in the Java GUI layer that Eclipse is using.
Bb6ecee0238ef2461bef3416722b35c5?d=identicon&s=25 pat eyler (Guest)
on 2006-07-26 16:44
(Received via mailing list)
On 7/26/06, Jay Levitt <jay+news@jay.fm> wrote:
> On Wed, 26 Jul 2006 17:47:13 +0900, Peter Hickman wrote:
>
> > In this post I want to clear some things up and provide benchmarks as to
> > why you should take "Write it in C" seriously.
>
> This is a great post, and should at least be posted to a blog somewhere so
> the masses who don't know about USENET can still find it on Google!

well, I didn't post the original post (though I did link to it).  I
did post my take on it.  At its core: write it in Ruby and if it's too
slow, profile it and rewrite the slow parts (in C if need be).
Rewriting the whole app in C when Ruby makes cohabitating with C (or
C++ or Objective C) so easy just seems pointless.

My post is at
http://on-ruby.blogspot.com/2006/07/rubyinline-mak...
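To make that "measure first, then rewrite only the slow part" workflow concrete, here is a minimal sketch using nothing but Ruby's standard Benchmark library. The two method names and the sum-of-squares hot spot are invented purely for illustration:

```ruby
require 'benchmark'

# Hypothetical hot spot, written two ways.
def sum_squares_naive(limit)
  total = 0
  (1..limit).each { |i| total += i * i }
  total
end

def sum_squares_formula(limit)
  # Closed form n(n+1)(2n+1)/6 -- the "rewrite the slow part" step.
  limit * (limit + 1) * (2 * limit + 1) / 6
end

n = 200_000
raise "results differ" unless sum_squares_naive(n) == sum_squares_formula(n)

# Compare before committing to any rewrite (in Ruby or in C).
Benchmark.bm(8) do |x|
  x.report("naive:")   { sum_squares_naive(n) }
  x.report("formula:") { sum_squares_formula(n) }
end
```

Only if a profile showed such a method dominating the run time would it be a candidate for a C extension; often, as here, a better algorithm in plain Ruby is enough.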
E1d641bfe4071a5413bac781f06d3fd1?d=identicon&s=25 Sean O'halpin (sean)
on 2006-07-26 17:01
(Received via mailing list)
On 7/26/06, Charles O Nutter <headius@headius.com> wrote:

> there's nothing about a
> language's design that should necessitate it being slower than any other
> language.

While I accept that you shouldn't confuse a language with its
implementation,
I find that a mildly surprising statement, especially since you said
in an earlier post:

> for memory-intensive situations Java almost always wins because lazy memory
> management allows work to get done first, faster

Garbage collection seems to me to be an integral part of Java's design.

Off the top of my head, I can think of some other design aspects that
have an effect on performance: method lookup in OO languages, scoping,
continuations, closures, static vs dynamic typing, type inference,
double dispatch, consing + car + cdr in Lisp, direct vs indirect
threading in Forth, etc. These are not just matters of implementation.
Each is a language design decision with a semantic effect which incurs
or avoids a computational cost, regardless of how it's actually
implemented. For example, Ruby has real closures, Python doesn't. I
don't see how you could ever reduce the cost of Ruby having closures
to zero - the memory alone is an implied overhead. Sure you can
optimize till the cows come home but different functionalities have
different costs and you can't ever avoid that.

Regards,
Sean
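For anyone unfamiliar with the "real closures" point above, a minimal Ruby sketch of the captured environment whose memory cost Sean is referring to (the counter is a made-up example):

```ruby
# Each call to make_counter creates a fresh local variable, and the
# returned lambda keeps that local alive after the method returns --
# that captured environment is the "implied overhead" of a real closure.
def make_counter
  count = 0
  -> { count += 1 }
end

a = make_counter
b = make_counter
a.call
a.call
puts a.call  # 3: a's captured count
puts b.call  # 1: b captured a separate count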
F690ec04b0501b74b033fc64ff4f682b?d=identicon&s=25 Dean Wampler (Guest)
on 2006-07-26 17:04
(Received via mailing list)
On 7/26/06, benjohn@fysh.org <benjohn@fysh.org> wrote:
> ...
>
> People talk about the 80:20 principle, but in my experience it's much
> more like 99:1 for applications. 99% of the code uses 1% of the run
> time. 1% of the code consumes 99% of the run time. That could be the
> signal processing and graphics heavy applications that I have
> experienced though.
> ...

This is the "value proposition" of the "Hot Spot" technology in the
Java Virtual Machine. On the fly, it looks for byte code sections that
get executed repeatedly and it then compiles them to object code,
thereby doing runtime optimization. This allows many Java server
processes to run with near-native speeds. When Ruby runs on a virtual
machine, planned for version 2, then Ruby can do that too. The JRuby
project will effectively accomplish the same goal.
94f20d72bc01e249b1a2ecba9253c571?d=identicon&s=25 Pedro Côrte-Real (Guest)
on 2006-07-26 17:11
(Received via mailing list)
On 7/26/06, Sean O'Halpin <sean.ohalpin@gmail.com> wrote:
> optimize till the cows come home but different functionalities have
> different costs and you can't ever avoid that.

In theory if two programs in two different languages produce the same
exact results the perfect compilers for each of the languages would
end up producing the same code. In theory practice is the same as
theory but in practice it isn't.

Cheers,

Pedro.
81cccab4619f8d8663e1e23b769f1515?d=identicon&s=25 Kristof Bastiaensen (Guest)
on 2006-07-26 17:18
(Received via mailing list)
On Wed, 26 Jul 2006 17:47:13 +0900, Peter Hickman wrote:

> Whenever the question of performance comes up with scripting languages
> such as Ruby, Perl or Python there will be people whose response can be
> summarised as "Write it in C". I am one such person. Some people take
> offence at this and label us trolls or heretics of the true programming
> language (take your pick).
>
> <snip>

Hi,

When reading your C code, I saw that there is a lot of code that is
generated.  I'd be interested to see how well the C program does if
it can work for any size of the squares. In this case I think the
problem
is well suited for logic languages. I wrote a version in the functional
logic language Curry, which does reasonably well.  It will probably not
be
faster than the C version, but a lot faster than a program written in
Ruby/Perl/Python.

>If  you really really want that performance boost then take the following
> advice very seriously - "Write it in C".

It can be a good idea to rewrite parts in C, but I would first check if
the algorithms are good, so that it may not even be needed to write any
C code.  And perhaps there are libraries or tools that do the trick
efficiently.  I would keep writing C code as the last option.

Regards,
Kristof

-------------------- start of latin.curry ----------------------------
-- upto is a nondeterministic function that evaluates to
-- a number from 1 upto n
upto 1 = 1
upto n | n > 1 = n ? upto (n-1)

-- check if the lists r s have no element with the same value at the
-- same position
elems_diff r s = and $ zipWith (/=) r s

-- extend takes a list of columns, and extends each column with a
-- number for the next row.  It checks the number against the column and
-- against the previous numbers in the row.

extend :: [[Int]] -> Int -> [[Int]]
extend cols n = addnum cols [] where
    addnum [] _ = []
    addnum (col:cs) prev
        | x =:= upto n &
          (x `elem` prev) =:= False &
          (x `elem` col) =:= False = (x:col) : addnum cs (x:prev)
        where x free

latin_square n = latin_square_ n
    where latin_square_ 0 = replicate n []  -- initialize columns to nil
          latin_square_ m | m > 0 = extend (latin_square_ (m-1)) n

square2str s = unlines $ map format_col s
    where format_col col = unwords $ map show col

main = mapIO_ (putStrLn . square2str) (findall (\s -> s =:= latin_square 5))
------------------------- end latin.curry -----------------------------
05e4e83c87a5700958fcb3efa8951a06?d=identicon&s=25 vasudevram (Guest)
on 2006-07-26 17:22
(Received via mailing list)
> > On Wed, 26 Jul 2006 17:47:13 +0900, Peter Hickman wrote:
> > > In this post I want to clear some things up and provide benchmarks as to
> > > why you should take "Write it in C" seriously.

Interesting series of messages! Got to save and read them through at
leisure ...

Just adding my 2c:

[I worked on a fairly complex performance tuning job once, involving
HP-UNIX boxes (multiple),
Informix ESQL/C batch programs, IBM MQ series (now called Websphere
MQ), UNIX IPC, and got a chance to do tuning at several different
levels - SQL queries, MQ logs/partitions, the C code, algorithms in it,
etc. Was very educational ... just sharing some insights gained from
that, from reading on the subject, and from smaller hobby projects
tuning my own code ...]

[Not implying that previous posters on this thread haven't done any of
the below].

Performance tuning in general is *very* complicated. Guesses or
assumptions like "this code tweak should make it run faster" often do
not work. The only way is a (semi)scientific approach to
measure/profile, study profile results, make hypotheses, change code
accordingly, then re-measure to see if the change made a difference.

Tuning can be done at any of several different levels, ranging from:

- hardware (even here, not just throwing more boxes or faster boxes at
the problem, but things like hardware architecture - obviously only if
the skills are available and the problem is worth the effort)

- software architecture

- algorithms and data structures optimization

- plain code tuning (things like common subexpression elimination), e.g.,
in C syntax, changing:

for (i = 0; i < getLength(my_collection); i++) {
    /* do something with my_collection[i] */
}

to

collLength = getLength(my_collection);
for (i = 0; i < collLength; i++) {
    /* do something with my_collection[i] */
}

/* which removes the repeated/redundant call to the function
getLength() */

Jon Bentley's book "Writing Efficient Programs" is a very good book
which discusses rules, examples and war stories of tuning at almost all
of these levels, including really excellent advice on code-level
tuning, which may sometimes be the easiest one to implement on existing
code.
Though the examples are in a pseudo-Pascal dialect (easily
understandable for those who know C), and though it may be out of print
now, for those who have a real need for tuning advice it's worth trying
to get a used copy on eBay, from a friend, whatever.

It's chock-full of code examples with the tuning results (verified by
measurement, as stated above), when (and when not) to apply them, and
the war stories are really interesting too ...

Googling for "performance tuning" and variants thereof will help ...

There's another book (for Java, mostly server-side programming) by a
guy called Dov (something - forget his last name and the book title, if
I remember it, will post here) that's *really* excellent too - he shows
(again, with actual measurements) how some of the "expected" results
were actually wrong/counter-intuitive. He worked with IBM on the web
software for one of the recent Olympics.

HTH
Vasudev
6d9bf78ca49a017e9e3e6b0357b6c59e?d=identicon&s=25 Peter Hickman (Guest)
on 2006-07-26 17:22
(Received via mailing list)
I will run your Ruby version and the Java version that I write and post
the results here. Give us a week or so as I have other things to be
doing.
F1d37642fdaa1662ff46e4c65731e9ab?d=identicon&s=25 Charles O Nutter (Guest)
on 2006-07-26 17:25
(Received via mailing list)
On 7/26/06, Sean O'Halpin <sean.ohalpin@gmail.com> wrote:
> in an earlier post:
>
> > for memory-intensive situations Java almost always wins because lazy
> memory
> > management allows work to get done first, faster
>
> Garbage collection seems to me to be an integral part of Java's design.


Garbage collection is a characteristic of the Java VM, not the language.
Nothing in the language specifies garbage collection or how it should be
performed. It could just as easily be implemented with
reference-counting,
and the language would still look exactly the same. What IS a
characteristic
of the language is that you do not explicitly release objects when you
are
finished using them. There are many ways to interpret that and many ways
to
implement it. Garbage collection is just one among many.

> Off the top of my head, I can think of some other design aspects that
> different costs and you can't ever avoid that.
You're mixing language semantics and implementation details here. The
mechanics of method lookup is not a language feature; it's an
implementation
detail. On the other hand, the logic of which method gets invoked in a
hierarchy of types is a language detail. Scoping is a language feature,
but
the means by which scope is maintained is an implementation detail.
Continuations are a language feature, but the means by which a
continuation
and its applicable scoping are memoized is an implementation detail.
Static
and dynamic typing are language features; the means by which type is
propagated and used for calls, checks, and memory allocation is an
implementation detail. On and on and on...language features always have
an
associated implementation, but there's almost always multiple ways to
implement a given feature, and those different implementations will have
their plusses and minuses.

As for the closure comment...sure, there's overhead in creating
closures,
but it's *explicit* overhead. This is no different from creating the
equivalent of closures in languages that don't support them. The concept
of
a closure has a clear specification and certainly increases the
complexity
of a language and an underlying implementation. But that complexity may
not
in itself always represent a decrease in performance, since other means
of
accomplishing the same task may be even less performant. That's how it
is
for any language feature...you take the good with the bad, and if you
don't
use all of the good you may be less-than-optimal. If using a feature has
ten
good aspects and five bad, and you only make use of five good aspects,
then
your code is sub-optimal. If you use less than five, you're even worse
off
and perhaps should consider doing things differently. Nothing about the
feature itself explicitly implies that performance should degrade by
using
it...it's a matter of using those features wisely and making optimal use
of
their good aspects, balanced with their bad aspects.
682fff6db11e1a150d6ce17f3b862448?d=identicon&s=25 Doug H (Guest)
on 2006-07-26 17:25
(Received via mailing list)
Peter Hickman wrote:
> If you really really want that performance boost then take the following
> advice very seriously - "Write it in C".

Assuming you have no choice but C/C++.  That's why I like using the
JVM or CLR with languages like JRuby, Groovy, or Boo.  You don't have
to use C, you can use Java or C# or Boo itself (since it is statically
typed with type inference): http://boo.codehaus.org/
Or C/C++ as well, although it is 100x easier to interface with a C lib
from the CLR than it is from the JVM with JNI.
05e4e83c87a5700958fcb3efa8951a06?d=identicon&s=25 vasudevram (Guest)
on 2006-07-26 17:38
(Received via mailing list)
Kristof Bastiaensen wrote:
> Hi,
>
> When reading your C code, I saw that there is a lot of code that is
> generated.  I'd be interested to see how well the C program does if
> it can work for any size of the squares. In this case I think the problem
> is well suited for logic languages. I wrote a version in the functional
> logic language Curry, which does reasonably well.  It will probably not be

Interesting ... I read somewhere that the OCaml language, while
higher-level than C (and a functional one too), runs some programs at
least, as fast or faster than C ...
Not sure how true that is ...

Vasudev
http://www.dancingbison.com
E1d641bfe4071a5413bac781f06d3fd1?d=identicon&s=25 Sean O'halpin (sean)
on 2006-07-26 17:41
(Received via mailing list)
On 7/26/06, Pedro Côrte-Real <pedro@pedrocr.net> wrote:
>
In theory, an infinite number of computer scientists hacking for an
infinite amount of time on a keyboard will eventually almost surely
produce a perfect compiler.

In practice, I can't wait that long ;)

Cheers,
Sean
8d16869783573d7ca80a676b65cf98e7?d=identicon&s=25 David Pollak (Guest)
on 2006-07-26 17:44
(Received via mailing list)
Writing code that runs as fast in Java as it does in C is real work,
but it's possible.

Integer (http://athena.com) is a pure Java spreadsheet.  I optimized
the numeric functions and array functions (e.g., SUM(A1:G99)) such
that Integer runs as fast or faster than Excel and OpenOffice Calc on
identical hardware.  However, it required a mind shift from "standard"
Java programming.

In addition, because Java has nice semantics for multithreading, I was
able to implement some very cleaver algorithms such that Integer's
recalculation speed scales nearly linearly with additional CPUs up to
a certain point (the linearity goes away at around 16 processors on a
big Sun box.)  But I digress.

First, I pre-allocated a lot of workspace so that there's little
memory allocation going on during recalculation.

Second, I avoid Monitors (synchronized) as much as possible.

Third, I write "C" style Java (lots of switch statements, putting
parameters and results in buffers rather than passing them on the
stack, etc.)

Memory usage in Java is higher than in C.  If Java has Structs a la
C#/Mono, it'd be possible to squeeze that much more performance from
Java.

There are some applications that will never perform as in Java (e.g.,
stuff that's heavily oriented to bit manipulation.)  But for many
classes of applications (e.g., spreadsheets) Java can perform as well
as C.

When I care about computational performance, I go with Java or in a
rare case, C++ (C style C++, no STL or virtual methods).  If I care
about developer performance, I've been going with Ruby more and more.

My 2 cents.
D3206cbdd9c086ea1ff4e9287ac2271b?d=identicon&s=25 unknown (Guest)
on 2006-07-26 17:57
(Received via mailing list)
>> > language (take your pick).
>> logic language Curry, which does reasonably well.  It will probably not
>> be
>
> Interesting ... I read somewhere that the OCaml language, while
> higher-level than C (and a functional one too), runs some programs at
> least, as fast or faster than C ...
> Not sure how true that is ...
>
> Vasudev
> http://www.dancingbison.com

You read that correctly. The problem is that nearly every benchmark I've
seen for comparing the performance of various languages has been a
repeated mathematical operation like computing a Mandelbrot Set or
running
Fibonacci Sequences that all but guarantees the edge will belong to
functional languages like Haskell and OCAML or stripped-down
assembly-like
languages like C (http://shootout.alioth.debian.org/debian/ for
samples),
because they are best suited for straight-up number crunching. Are there
good benchmarks for OO languages? Or dynamic languages? Are there good
benchmarks that could actually measure the types of uses I need, where
I'm
building a web front end to a DB store? I don't know about you, but my
job
has never involved fractals.

I used to put faith into benchmarks like this, but now I think about
developer time and maintenance time as well. That seems to be a more
intelligent approach.

Jake
31e038e4e9330f6c75ccfd1fca8010ee?d=identicon&s=25 Gregory Brown (Guest)
on 2006-07-26 18:03
(Received via mailing list)
On 7/26/06, David Pollak <pollak@gmail.com> wrote:

> In addition, because Java has nice semantics for multithreading, I was
> able to implement some very cleaver algorithms such that Integer's
> recalculation speed scales nearly linearly with additional CPUs up to
> a certain point (the linearity goes away at around 16 processors on a
> big Sun box.)  But I digress.

This must be evidence of true cutting edge development ;)
8979474815030ad4a5d59718d1905715?d=identicon&s=25 Isaac Gouy (Guest)
on 2006-07-26 18:16
(Received via mailing list)
Tomasz Wegrzanowski wrote:
> > right approach.
>
> Sorry, I just couldn't resist - but maybe you should code Java instead -
> http://kano.net/javabench/ ;-)

"The results I got were that Java is significantly faster than
optimized C++ in many cases... I've been accused of biasing the results
by using the -O2 option for GCC..."

"...so I took the benchmark code for C++ and Java from the now outdated
Great Computer Language Shootout and ran the tests myself"

Not so outdated
http://shootout.alioth.debian.org/gp4sandbox/bench...
81cccab4619f8d8663e1e23b769f1515?d=identicon&s=25 Kristof Bastiaensen (Guest)
on 2006-07-26 18:22
(Received via mailing list)
On Thu, 27 Jul 2006 00:55:05 +0900, harrisj wrote:

>> http://www.dancingbison.com
>
> You read that correctly. The problem is that nearly every benchmark I've
> seen for comparing the performance of various languages has been a
> repeated mathematical operation like computing a Mandelbrot Set or running
> Fibonacci Sequences that all but guarantees the edge will belong to
> functional languages like Haskell and OCAML or stripped-down assembly-like
> languages like C (http://shootout.alioth.debian.org/debian/ for samples),
> because they are best suited for straight-up number crunching.

In some cases the functional version is faster because the problem can
be
more easily described in a functional way.  But in general code produced
by ocaml is about twice as slow as C, because the compiler doesn't do
the
same extensive optimizations as for example gcc does.  But that's still
pretty good.

> Are there
> good benchmarks for OO languages? Or dynamic languages? Are there good
> benchmarks that could actually measure the types of uses I need, where I'm
> building a web front end to a DB store? I don't know about you, but my job
> has never involved fractals.
>
> <snip>

True, benchmarks only measure execution speed, but they don't show if a
given programmer will be productive in them.  I think that's also
largely a personal choice.  Some people may be more productive in a
functional language, some people more in Ruby. And others even in
perl... :)

Kristof
F1d37642fdaa1662ff46e4c65731e9ab?d=identicon&s=25 Charles O Nutter (Guest)
on 2006-07-26 18:26
(Received via mailing list)
On 7/26/06, Gregory Brown <gregory.t.brown@gmail.com> wrote:
>
> On 7/26/06, David Pollak <pollak@gmail.com> wrote:
>
> > In addition, because Java has nice semantics for multithreading, I was
> > able to implement some very cleaver algorithms such that Integer's
> > recalculation speed scales nearly linearly with additional CPUs up to
> > a certain point (the linearity goes away at around 16 processors on a
> > big Sun box.)  But I digress.
>
> This must be evidence of true cutting edge development ;)


No doubt, I wish I had that kind of problem when running my spreadsheet
application of choice. That's one hell of a business desktop!
8d16869783573d7ca80a676b65cf98e7?d=identicon&s=25 David Pollak (Guest)
on 2006-07-26 18:29
(Received via mailing list)
Greg,

In spreadsheets, it is cutting edge.  Name one other commercial
spreadsheet that can use more than 1 CPU?

David
9c861c5f4c053187119624ec2c779579?d=identicon&s=25 Martin Ellis (Guest)
on 2006-07-26 18:32
(Received via mailing list)
Sean O'Halpin wrote:
> In theory, an infinite number of computer scientists hacking for an
> infinite amount of time on a keyboard will eventually almost surely
> produce a perfect compiler.
>
> In practice, I can't wait that long ;)

You'd be waiting a long time indeed :o).

I believe our good friend Mr. Turing proved [1] that such
a compiler could never exist, some seven decades ago.

Oh well.

Martin


[1] OK.  That wasn't exactly what he proved.
    But only this particular corollary is relevant.
4299e35bacef054df40583da2d51edea?d=identicon&s=25 James Gray (bbazzarrakk)
on 2006-07-26 18:35
(Received via mailing list)
On Jul 26, 2006, at 11:28 AM, David Pollak wrote:

> Greg,
>
> In spreadsheets, it is cutting edge.  Name one other commercial
> spreadsheet that can use more than 1 CPU?

I'm pretty sure Greg was funning around with the comical typo in your
post.  Take a look back at how you spelled "clever."  ;)

James Edward Gray II
F1d37642fdaa1662ff46e4c65731e9ab?d=identicon&s=25 Charles O Nutter (Guest)
on 2006-07-26 18:38
(Received via mailing list)
I think it's also worth mentioning that talking about the speed of a
*language* is rather absurd. You can't benchmark a language, other than
through mathematical proofs of complexity. What you can benchmark is an
implementation or compiler of a language, which will of course vary
widely
from version to version. Ruby's current implementation may be slower
than
many other language implementations, but it's not because it's Ruby. C
may
be faster than just about anything, but it's not because of C. It's
because
the compilers and interpreters for those other languages are unable to
produce code as optimally as current C compilers--which isn't surprising
considering how long C compilers have been around and how little work
they
actually have to do.

The whole "write it in C" thing really ends up being a cop-out. So the
current C Ruby implementation isn't fast enough for you? Contribute your
time and resources and fix the implementation! Don't bloody patch around
it
by leaving Ruby behind and writing C code! Demand more from your
platform!
The community and the fates will thank you.
Cb48ca5059faf7409a5ab3745a964696?d=identicon&s=25 unknown (Guest)
on 2006-07-26 18:42
(Received via mailing list)
On Wed, 26 Jul 2006, Peter Hickman wrote:

> you should take "Write it in C" seriously. Personally I like to write my
> The approach taken is to create a list of all the permutations and then
> build up a grid row by row checking that the newly added row does not
> conflict with any of the previous rows. If the final row can be added
> without problem the solution is printed and the search for the next one
> starts. It is in essence depth first search. The first version of the
> program that I wrote in Perl took 473 minutes to generate all the valid 5 x
> 5 Latin squares, version four of the program took 12 minutes and 51 seconds.
> The C version of the program took 5.5 seconds to produce identical results.
> All run on the same hardware.

just for fun, here's a ruby version (note that the array would actually
need
to be reformed into rows, but someone else can play with that)

   harp:~ > cat a.rb
   require 'gsl'

   n = Integer(ARGV.shift || 2)

   width, height = n, n

   perm = GSL::Permutation.alloc width * height

   p perm.to_a until perm.next == GSL::FAILURE


it's not terribly fast to run - but it was to write!

-a
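And here is a pure-Ruby, GSL-free sketch of the row-by-row depth-first approach the original post describes: build the square from permutations of 1..n, keeping a candidate row only if it clashes with no earlier row in any column. It is correct but deliberately unoptimized:

```ruby
# Depth-first Latin square search: each row is a permutation of 1..n,
# and a row is accepted only if every column stays duplicate-free.
def latin_squares(n, rows = [], &blk)
  return blk.call(rows) if rows.size == n
  (1..n).to_a.permutation do |perm|
    # The new row must differ from every prior row at every position.
    ok = rows.all? { |row| row.zip(perm).none? { |a, b| a == b } }
    latin_squares(n, rows + [perm], &blk) if ok
  end
end

count = 0
latin_squares(4) { |_square| count += 1 }
puts count  # 576 -- the number of valid 4 x 4 Latin squares
```

On the 5 x 5 case this pure-Ruby version is of course far slower than the C program discussed in the thread, which is rather the point of the comparison.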
31e038e4e9330f6c75ccfd1fca8010ee?d=identicon&s=25 Gregory Brown (Guest)
on 2006-07-26 18:59
(Received via mailing list)
On 7/26/06, James Edward Gray II <james@grayproductions.net> wrote:
> On Jul 26, 2006, at 11:28 AM, David Pollak wrote:
>
> > Greg,
> >
> > In spreadsheets, it is cutting edge.  Name one other commercial
> > spreadsheet that can use more than 1 CPU?
>
> I'm pretty sure Greg was funning around with the comical typo in your
> post.  Take a look back at how you spelled "clever."  ;)

James gets right to the point.  I was just taking a slice at your
typo, not Integer. :)
8d16869783573d7ca80a676b65cf98e7?d=identicon&s=25 David Pollak (Guest)
on 2006-07-26 19:51
(Received via mailing list)
Guess I should unit test my posts... :-)
Fd22ee3cfc7dac283ce8e451af324f7d?d=identicon&s=25 Chad Perrin (Guest)
on 2006-07-26 20:21
(Received via mailing list)
On Wed, Jul 26, 2006 at 11:29:06PM +0900, Ryan McGovern wrote:
> >
> I dont doubt for simple applications and algorithms java is nearly as
> fast as C if not equivalent.  Though for larger java projects such as
> Eclipse, i've had a horrible time of it being slow and cumbersome on the
> system, and Visual Studio will run fine and be far more responsive.
> I dont really know why that is it could be as simple as some bad code in
> the java gui layer that Eclipse is using.

Doubtful.  Java does generally produce notably faster applications than
Ruby, and there are benchmarks that show that in specific instances it
can hustle almost as well as C -- even scalably so.  A more
comprehensive survey of benchmarks, on the other hand, starts to take
its toll on Java's reputation for speed.  The first problem is that C
isn't object oriented and, while OOP can be great for programmer
productivity under many circumstances (particularly involving larger
projects), it introduces a bunch of interface activity between parts of
the program which begins to slow things down.  Furthermore, Java's
bytecode-compilation and VM interpretation can increase execution speed
at runtime by cutting out part of the process of getting from source to
binary, but it still requires interpretation and relies on the
performance of the VM itself (which is, sad to say, not as light on its
feet as many would like).

In fact, there are cases where the Perl runtime compiler's quickness
makes Java's VM look dog-slow.  If your purpose for using a language
other than whatever high-level language you prefer is maximum
performance (presumably without giving up readable source code), Java
isn't an ideal choice.  If your high-level language of choice is Perl,
there's actually very little reason for Java at all, and the same is
true of some Lisp interpreters/compilers.

For those keen on functional programming syntax, Haskell is a better
choice than Java for performance: in fact, the only thing keeping
Haskell from performing as well as C, from what I understand, is the
current state of processor design.  Similarly, O'Caml is one of the
fastest non-C languages available: it consistently, in a wide range of
benchmark tests and real-world anecdotal comparisons, executes "at least
half as quickly" as C, which is faster than it sounds.

The OP is right, though: if execution speed is your top priority, use C.
Java is an also-ran -- what people generally mean when they say that
Java is almost as fast as C is that a given application written in both
C and Java "also runs in under a second" in Java, or something to that
effect.  While that may be true, there's a significant difference
between 0.023 seconds and 0.8 seconds (for hypothetical example).
Fd22ee3cfc7dac283ce8e451af324f7d?d=identicon&s=25 Chad Perrin (Guest)
on 2006-07-26 20:29
(Received via mailing list)
On Thu, Jul 27, 2006 at 12:42:46AM +0900, David Pollak wrote:
> Writing code that runs as fast in Java as it does in C is real work,
> but it's possible.

. . . the problem being that putting the same effort into optimizing a C
program will net greater performance rewards as well.  The only language
I've ever seen stand up to C in head-to-head optimization comparisons
with any consistency, and even outperform it, was Delphi-style Object
Pascal, and that's only anecdotal comparisons involving my father (who
knows Delphi's Object Pascal better than most lifelong C programmers
know C), so the comparisons might be somewhat skewed.  My suspicion is
that the compared C code can be tweaked to outperform the Object Pascal
beyond Object Pascal's ability to be tweaked for performance -- the
problem being that eventually you have to stop tweaking your code, so
sometimes the Object Pascal might be better anyway.

Java doesn't even come close to that level of performance optimization,
alas.  At least, not from what I've seen.


>
> There are some applications that will never perform as in Java (e.g.,
> stuff that's heavily oriented to bit manipulation.)  But for many
> classes of applications (e.g., spreadsheets) Java can perform as well
> as C.

Is that heavily optimized Java vs. "normal" (untweaked) C?
Fd22ee3cfc7dac283ce8e451af324f7d?d=identicon&s=25 Chad Perrin (Guest)
on 2006-07-26 20:45
(Received via mailing list)
On Wed, Jul 26, 2006 at 11:26:45PM +0900, Charles O Nutter wrote:
> >However I will write a Java version of the code and time it. It should
> >be interesting to say the least.
>
> Doubt all you like.

Full disclaimer included:
As someone who is NOT paid to program in Java, and in fact finds Java
rather odious, and would rather write code in almost anything else that
isn't the annoyance factor equivalent of VB, I too doubt it.  Aside from
not being paid to program in Java, though, I have played with Java code,
I have researched Java performance characteristics extensively in the
performance of job tasks, I've looked at hundreds of benchmarks over the
years, and I know a fair bit about programming language interpreters
and parsers in the abstract.  The end result is that characterizing Java
as "at least as fast as C" in most cases and faster in many other cases
sounds like a load of (perhaps well-meaning) hooey to me.


> hope and that doesn't mean they shouldn't be frustrated when the stock
> little more than a set of macros on top of assembly code. If the underlying
> processor ran YARV bytecodes, I doubt Ruby performance would be a concern.

I'd say "yes and no" to that.  There are things about Ruby -- things
that I absolutely would not change -- that necessitate slower runtime
execution.  For instance, for Ruby to work as specced, it needs dynamic
typing, which is simply slower in execution, because typing becomes a
part of execution.  Static typing doesn't require types to be set at
execution: they can be set at compile-time, because they don't have to
change depending on runtime conditions.  Thus, you add an extra step to
runtime execution, a variable (pardon the pun) number of times.  It's an
unavoidable execution-speed loss based on how the Ruby language is
supposed to work, and it's a characteristic of Ruby that I absolutely
would not throw away for better runtime performance.  Because of this,
of course, it is effectively impossible to use persistent compiled
executables for Ruby to solve the runtime execution performance gap that
is introduced by dynamic typing as well.  C'est la vie.

Other, similar difficulties arise as well.  Ultimately, it's not the
fact that it's an interpreted language that is the problem.  That can be
solved via a number of tricks (just-in-time compilation similar to Perl,
bytecode compilation, or even simply writing a compiler for it, for
example), if that's the only problem.  The main problem is that, like
Perl, Python, and various Lisps, it's a dynamic language: it can be used
to write programs that change while they're running.  To squeeze the
same performance out of Ruby that you get out of C, you'd have to remove
its dynamic characteristics, and once you do that you don't have Ruby
any longer.
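A tiny illustration of the dynamism in question (the class here is made
up, just to show the mechanism): any class can grow new methods while
the program is running, so an ahead-of-time compiler can never see the
full set of methods.

```ruby
class Greeter; end

g = Greeter.new
g.respond_to?(:hello)   # => false

# Reopen the class mid-run and add a method -- perfectly legal Ruby.
class Greeter
  def hello
    "hi"
  end
end

g.hello                 # => "hi" -- even the existing object sees it
```

No static analysis at compile time can pin down what `g.hello` will do,
which is exactly the property that blocks C-style compilation.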
A9c4658e9e475e13d790ae419acf01b6?d=identicon&s=25 Simon Kröger (Guest)
on 2006-07-26 20:48
(Received via mailing list)
Peter Hickman wrote:
> I will run your Ruby version and the Java version that I write and post
> the results here. Give us a week or so as I have other things to be doing.

Hmm, in a week this discussion will be over (ok, it will reappear some
time soon, but nevertheless) and everybody will have swallowed your
points.

$ ruby -v
ruby 1.8.4 (2005-12-24) [i386-mingw32]

$ time ruby latin.rb 5 > latin.txt

real    0m4.703s
user    0m0.015s
sys     0m0.000s

(this is a 2.13GHz PentiumM, 1GB RAM, forget the user and sys timings,
but 'real' is for real, this is WinXP)

My point is: if you choose the right algorithm, your program will get
faster by orders of magnitude - spending time optimizing algorithms is
better than spending the same time recoding everything in C. In a
not-so-distant future (when the interpreter is better optimized, or
perhaps YARV sees the light of day) my version will be even faster than
yours. It will be easier to maintain and more platform independent.

Of course you can port this enhanced version to C and it will be even
faster, but if you have a limited amount of time/money to spend on
optimization I would say: go for the algorithm.

To stop at least some of the flames: I like extensions, and I like them
most when they are generally useful (and fast), like gsl, NArray,
whatever. The combination of such extensions and optimized algorithms
(written in Ruby) would be my preferred solution if I had a
performance-critical problem that I'm allowed to tackle in Ruby.

cheers

Simon

p.s.: and if my solution is only that fast because of a bug (I really
hope not), I think my conclusions still hold true.
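For reference, the brute-force search under discussion can be sketched
in a few lines of Ruby (a simplified reconstruction of the approach
Peter described, not Simon's actual code): build the square row by row
from the set of permutations, rejecting any row that repeats a value in
a column. Simon's speedup comes from pruning the candidate set as rows
are placed, rather than re-testing every permutation at every depth.

```ruby
# Count all n x n Latin squares by depth-first search: each row is a
# permutation of 1..n, and a row is accepted only if none of its
# columns clashes with the rows already placed.
def count_latin_squares(n, rows = [], perms = (1..n).to_a.permutation.to_a)
  return 1 if rows.size == n
  perms.sum do |p|
    clash = rows.any? { |r| r.zip(p).any? { |a, b| a == b } }
    clash ? 0 : count_latin_squares(n, rows + [p], perms)
  end
end

count_latin_squares(4)  # => 576, the known count of 4 x 4 Latin squares
```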
9c861c5f4c053187119624ec2c779579?d=identicon&s=25 Martin Ellis (Guest)
on 2006-07-26 20:57
(Received via mailing list)
Chad Perrin wrote:
> Haskell is a better choice than Java for performance:

I suspect it depends what you're doing...

> in fact, the only thing keeping
> Haskell from performing as well as C, from what I understand, is the
> current state of processor design.

I'm interested to know more about that.
Could you elaborate?  A reference would do.

Cheers
Martin
Fd22ee3cfc7dac283ce8e451af324f7d?d=identicon&s=25 Chad Perrin (Guest)
on 2006-07-26 21:01
(Received via mailing list)
On Thu, Jul 27, 2006 at 12:23:23AM +0900, Charles O Nutter wrote:
>
> You're mixing language semantics and implementation details here. The
> mechanics of method lookup is not a language feature; it's an implementation
> detail. On the other hand, the logic of which method gets invoked in a
> hierarchy of types is a language detail. Scoping is a language feature, but
> the means by which scope is maintained is an implementation detail.

In some ways, you're right: implementation details are being mixed up
with language definition in the preceding list of features.  In the case
of scoping, however, you're not entirely right with regard to "the means
by which scope is maintained".  Dynamic scoping, by definition, requires
runtime scoping.  Static scoping, by definition, does not.  This means
that (to use Perl as an example, since I know it better than Ruby) my(),
which is used to declare variables in lexical scope, can be managed at
compile time, while local(), which is used for dynamic scope, can only
be managed at runtime -- else it will not work as advertised.  That's
more than simply implementation details: implementation is, in this
case, dictated by language features.
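The same distinction can be shown in Ruby, which is lexically scoped (a
toy example invented for illustration): a block captures the scope it
was written in, so that binding can be resolved before the program runs,
whereas a method body opens a fresh scope and simply cannot see the
caller's locals.

```ruby
x = "outer"

# The lambda's reference to x is fixed by where it is written,
# so it can be resolved statically.
show = lambda { x }

def peek
  # A method body is a new scope: the caller's x is not visible here,
  # and no runtime lookup will ever make it visible.
  defined?(x) ? x : "not visible"
end

show.call  # => "outer"
peek       # => "not visible"
```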


> good aspects and five bad, and you only make use of five good aspects, then
> your code is sub-optimal. If you use less than five, you're even worse off
> and perhaps should consider doing things differently. Nothing about the
> feature itself explicitly implies that performance should degrade by using
> it...it's a matter of using those features wisely and making optimal use of
> their good aspects, balanced with their bad aspects.

I think closures are kind of a bad example for this, actually.  There's
nothing about closures that necessarily harms performance of the
language in implementation.  In fact, closures are in some respects
merely a happy accident that arises as the result of other, positive
characteristics of a language that all can tend to contribute to better
performance of the implementation of a language (such as lexical scope,
which leads to better performance than dynamic scope).  In fact, one of
the reasons Python doesn't have proper closures (lack of strict lexical
scoping) is also likely one of the reasons Python still tends to lag
behind Perl for performance purposes.

The only real overhead involved in closures, as far as I can see, is the
allocation of memory to a closure that doesn't go away until the program
exits or, in some implementations, until the program reaches a point
where it will absolutely, positively never need that closure again
(which is roughly the same thing for most uses of closures).  A little
extra memory usage does not translate directly to performance loss.  In
fact, in any halfway-decent system implementation, it really shouldn't
result in reduced performance unless you start having to swap because
you've overrun "physical RAM", I think.

The day may come when RAM is better managed so that performance gains
can be had for less memory usage, though, so I doubt this will always be
true.
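The memory cost being described is easy to see in a toy Ruby closure
(hypothetical example): the captured local survives the call that
created it, and that retention -- not execution speed -- is the
overhead.

```ruby
def make_counter
  count = 0              # a local of this one call...
  lambda { count += 1 }  # ...kept alive because the lambda closes over it
end

c = make_counter
c.call  # => 1
c.call  # => 2
# `count` stays allocated for as long as `c` is reachable; the closure
# itself adds no per-call interpretation work.
```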
Fd22ee3cfc7dac283ce8e451af324f7d?d=identicon&s=25 Chad Perrin (Guest)
on 2006-07-26 21:04
(Received via mailing list)
On Thu, Jul 27, 2006 at 12:03:50AM +0900, Dean Wampler wrote:
> This is the "value proposition" of the "Hot Spot" technology in the
> Java Virtual Machine. On the fly, it looks for byte code sections that
> get executed repeatedly and it then compiles them to object code,
> thereby doing runtime optimization. This allows many Java server
> processes to run with near-native speeds. When Ruby runs on a virtual
> machine, planned for version 2, then Ruby can do that too. The JRuby
> project will effectively accomplish the same goal.

This recent mania for VMs is irksome to me.  The same benefits can be
had from a JIT compiler, without the attendant downsides of a VM (such
as greater persistent memory usage, et cetera).
Fd22ee3cfc7dac283ce8e451af324f7d?d=identicon&s=25 Chad Perrin (Guest)
on 2006-07-26 21:20
(Received via mailing list)
On Thu, Jul 27, 2006 at 03:55:10AM +0900, Martin Ellis wrote:
> Chad Perrin wrote:
> > Haskell is a better choice than Java for performance:
>
> I suspect it depends what you're doing...

To clarify: I meant "on average" or "in general".  Obviously, there will
be instances where Java will outperform Haskell or, for that matter,
even C -- just as there are times Perl can outperform C, for an
equivalent amount of invested programmer time, et cetera.  I suspect the
same is true even of Ruby, despite its comparatively crappy execution
speed.  That doesn't change the fact that in the majority of cases,
Haskell will outperform most other languages.  It is, after all, the C
of functional programming.

>
> > in fact, the only thing keeping
> > Haskell from performing as well as C, from what I understand, is the
> > current state of processor design.
>
> I'm interested to know more about that.
> Could you elaborate?  A reference would do.

I'm having difficulty finding citations for this that actually explain
anything, but the short and sloppy version is as follows:

Because imperative style programming had "won" the programming paradigm
battle back in the antediluvian days of programming, processors have
over time been oriented more and more toward efficient execution of code
written in that style.  When a new processor design and a new
instruction set for a processor is shown to be more efficient in code
execution, it is more efficient because it has been better architected
for the software that will run on it, to better handle the instructions
that will be given to it with alacrity.  Since almost all programs
written today are written in imperative, rather than functional, style,
this means that processors are optimized for execution of imperative
code (or, more specifically, execution of binaries that are compiled
from imperative code).

As a result, functional programming languages operate at a slight
compilation efficiency disadvantage -- a disadvantage that has been
growing for decades.  There are off-hand remarks all over the web about
how functional programming languages supposedly do not compile as
efficiently as imperative programming languages, but these statements
only tell part of the story: the full tale is that functional
programming languages do not compile as efficiently on processors
optimized for imperative-style programming.

We are likely heading into an era where that will be less strictly the
case, however, and functional languages will be able to start catching
up, performance-wise.  Newer programming languages are beginning to get
further from their imperative roots, incorporating more characteristics
of functional-style languages (think of Ruby's convergence on Lisp, for
instance).  For now, however, O'Caml and, even moreso, Haskell suffer at
a disadvantage because their most efficient execution environment isn't
available on our computers.
Fd22ee3cfc7dac283ce8e451af324f7d?d=identicon&s=25 Chad Perrin (Guest)
on 2006-07-26 21:27
(Received via mailing list)
On Wed, Jul 26, 2006 at 09:26:05PM +0900, Leslie Viljoen wrote:
>
> Something else to consider is the ease with which Ruby extensions can
> be written in C. The first time I tried I had something running in 20
> minutes.
>
> Though if I was going to choose a (single) language for raw
> performance I'd try to go with Pascal or Ada.

Pascal's sort of an iffy proposition for me, in comparison with C.  I'm
simply not sure that it can be optimized as thoroughly as C, in any
current implementations.  According to its spec, it can probably
outperform C if implemented well, and Borland Delphi does a reasonably
good job of that, but it has received considerably less attention from
compiler programmers over time and as such is probably lagging in
implementation performance.  It's kind of a mixed bag, and I'd like to
get more data on comparative performance characteristics than I
currently have.

Ada, on the other hand -- for circumstances in which it is most commonly
employed (embedded systems, et cetera), it does indeed tend to kick C's
behind a bit.  That may have more to do with compiler optimization than
language spec, though.
Fd22ee3cfc7dac283ce8e451af324f7d?d=identicon&s=25 Chad Perrin (Guest)
on 2006-07-26 21:30
(Received via mailing list)
On Wed, Jul 26, 2006 at 08:30:11PM +0900, Jay Levitt wrote:
> On Wed, 26 Jul 2006 17:47:13 +0900, Peter Hickman wrote:
>
> > In this post I want to clear some things up and provide benchmarks as to
> > why you should take "Write it in C" seriously.
>
> This is a great post, and should at least be posted to a blog somewhere so
> the masses who don't know about USENET can still find it on Google!

This list is not only on USENET, for what it's worth.
A90204c955db033cd975f7bb0ec9600b?d=identicon&s=25 Ashley Moran (Guest)
on 2006-07-26 21:30
(Received via mailing list)
On Jul 26, 2006, at 8:03 pm, Chad Perrin wrote:

> This recent mania for VMs is irksome to me.  The same benefits can be
> had from a JIT compiler, without the attendant downsides of a VM (such
> as greater persistent memory usage, et cetera).

Chad,

I'm late to this conversation but I've been interested in Ruby
performance lately.  I just had to write a script to process about
1-1.5GB of CSV data (no major calculations, but it involves about 20
million rows, or something in that region).  The Ruby implementation
I wrote takes about 2.5 hours to run - I think memory management is
the main issue, as the manual garbage collection run I added after
each file takes several minutes for the larger sets of data.  As
you can imagine, I am more than eager for YARV/Rite.

Anyway, my question really is: I thought a VM was a prerequisite
for JIT?  Is that not the case?  And if the YARV VM is not the way to
go, what is?

Ashley
Fd22ee3cfc7dac283ce8e451af324f7d?d=identicon&s=25 Chad Perrin (Guest)
on 2006-07-26 21:33
(Received via mailing list)
On Thu, Jul 27, 2006 at 03:47:48AM +0900, Simon Kröger wrote:
> $ time ruby latin.rb 5 > latin.txt
>
> real    0m4.703s
> user    0m0.015s
> sys     0m0.000s
>
> (this is a 2.13GHz PentiumM, 1GB RAM, forget the user and sys timings, but
> 'real' is for real, this is WinXP)

Holy crap, that's fast.
Fd22ee3cfc7dac283ce8e451af324f7d?d=identicon&s=25 Chad Perrin (Guest)
on 2006-07-26 21:43
(Received via mailing list)
On Thu, Jul 27, 2006 at 01:20:05AM +0900, Kristof Bastiaensen wrote:
>
> True, benchmarks only measure execution speed, but they don't show if a
> given programmer will be productive in them.  I think that's also
> largely a personal choice.  Some people may be more productive in a
> functional language, some people more in Ruby. And others even in perl... :)

Actually, a well-defined functional syntax is a beautiful thing to
behold.  UCBLogo, of all things, taught me that -- and taught me to
dearly love arithmetic prefix notation.  Ruby's .method syntax is also
pretty darned nice, but I'd like to see it slightly more consistently
applied (only slightly).  There's more to syntactic and semantic style
than mere personal preference: there are concrete benefits to a
functional syntax in terms of writing consistent code, for instance.

. . . and Perl is great.  Let's not knock it just because Ruby is great
too.  Well, maybe in jest, as you've done, but it gets a far worse
reputation than it deserves.  As Paul Graham put it, ugly code is a
result of being forced to use the wrong concepts to achieve something
specific, and not of a harsh-looking syntax.  Any syntax can be
harsh-looking to someone unaccustomed to it (even Ruby's).
D36eff3004b39abc4b93fe8a410d8bd3?d=identicon&s=25 Ron M (Guest)
on 2006-07-26 21:50
(Received via mailing list)
Charles O Nutter wrote:
> I'll lob a couple of grenades and then duck for cover.
>
> - Write it in C is as valid as write it in Java (as someone else
> mentioned).

Not really.  In C you can quite easily use inline assembly
to use your chip's MMX/SSE/VIS/AltiVec extensions and, if
you need more, interface to your GPU if you want to use it
as a coprocessor.

I don't know of any good way of doing those in Java except
by writing native extensions in C or directly with an assembler.

Last I played with Java it didn't have a working cross-platform
mmap, and if that's still true, the awesome NArray+mmap Ruby extension
floating around is a good real-world example of this flexibility.
Cb48ca5059faf7409a5ab3745a964696?d=identicon&s=25 unknown (Guest)
on 2006-07-26 21:56
(Received via mailing list)
On Wed, 26 Jul 2006, Kroeger, Simon (ext) wrote:

> $filter = $perms.map do |p|
>    search lines + [p], (possibs -
>  end
>> the following
>> advice very seriously - "Write it in C".
>
> Agreed, 100%, for those who want speed, speed and nothing
> else there is hardly a better way.
>
> thanks
>
> Simon

harp:~ > time ruby latin.rb 5 > 5.out
real    0m11.170s
user    0m10.840s
sys     0m0.040s

harp:~ > uname -srm
Linux 2.4.21-40.EL i686

harp:~ > cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 2
model name      : Intel(R) Pentium(R) 4 CPU 2.40GHz
stepping        : 7
cpu MHz         : 2386.575
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
bogomips        : 4757.91

harp:~ > ruby -v
ruby 1.8.4 (2005-12-01) [i686-linux]


not too shabby.  definitely not worth the hassle for 5 seconds of c.

-a
481b8eedcc884289756246e12d1869c1?d=identicon&s=25 Francis Cianfrocca (blackhedd)
on 2006-07-26 21:58
Ashley Moran wrote:
> performance lately.  I just had to write a script to process about
> 1-1.5GB of CSV data  (No major calculations, but it involves about 20
> million rows, or something in that region).

I've had tremendous results optimizing Ruby programs that process huge
piles of text. There is a range of "tricks" you can use to keep Ruby
from wasting memory, which is its real downfall. If it's possible, given
your application, to process your CSV text in such a way that you don't
store any transformations of the whole set in memory at once, you'll go
orders of magnitude faster. You can even try to break your computation
up into multiple stages, and stream the intermediate results out to
temporary files. As ugly as that sounds, it will be far faster.
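A minimal sketch of that staging idea (file names and the transformation
are invented for illustration): each pass streams line by line, so
memory use stays flat no matter how large the input grows.

```ruby
require "tempfile"
require "set"

File.write("rows.csv", "a,1\nb,2\na,1\n")  # tiny stand-in input

# Stage 1: stream the raw file, tagging each row; only the current
# line is ever held in memory.
stage1 = Tempfile.new("stage1")
File.foreach("rows.csv") { |line| stage1.puts("dataset_a,#{line.chomp}") }
stage1.flush
stage1.rewind

# Stage 2: a second streaming pass over the intermediate file,
# dropping exact duplicates (only the Set of seen lines is retained).
seen = Set.new
File.open("processed.csv", "w") do |out|
  stage1.each_line { |line| out.write(line) if seen.add?(line) }
end
stage1.close!
```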

In regard to the whole conversation on this thread: at the end of the
day, absolute performance only matters if you can put a dollar amount on
it. That makes the uncontexted language comparisons essentially
meaningless.

In regard to YARV: I get a creepy feeling about anything that is
considered by most of the world to be the prospective answer to all
their problems. And as a former language designer, I have some reasons
to believe that a VM will not be Ruby's performance panacea.
Cb48ca5059faf7409a5ab3745a964696?d=identicon&s=25 unknown (Guest)
on 2006-07-26 22:15
(Received via mailing list)
On Thu, 27 Jul 2006, Francis Cianfrocca wrote:

> In regard to YARV: I get a creepy feeling about anything that is
> considered by most of the world to be the prospective answer to all
> their problems. And as a former language designer, I have some reasons
> to believe that a VM will not be Ruby's performance panacea.

one of the reasons i've been pushing so hard for an msys based ruby is
that having a 'compilable' ruby on all platforms might open up
development on jit type things like ruby inline - which is pretty dang
neat.

2 cts.

-a
481b8eedcc884289756246e12d1869c1?d=identicon&s=25 Francis Cianfrocca (blackhedd)
on 2006-07-26 22:15
Ron M wrote:
> Not really.  In C you can quite easily use inline assembly
> to use your chip's MMX/SSE/VIS/AltiVec extensions and, if
> you need more, interface to your GPU if you want to use it
> as a coprocessor.
>
> I don't know of any good way of doing those in Java except
> by writing native extensions in C or directly with an assembler.
>
> Last I played with Java it didn't have a working cross-platform
> mmap, and if that's still true, the awesome NArray+mmap Ruby
> floating around is a good real-world example of this flexibility.


Your point about Java here is very well-taken. I'd add that you don't
even really need to drop into asm to get most of the benefits you're
talking about. C compilers are really very good at optimizing, and I
think you'll get nearly all of the available performance benefits from
well-written C alone. (I've written at least a million lines of
production asm code in my life, as well as a pile of commercial
compilers for various languages.) It goes back to economics again. A
very few applications will gain so much incremental value from the extra
5-10% performance boost that you get from hand-tuned asm, that it's
worth the vastly higher cost (development, maintenance, and loss of
portability) of doing the asm. A tiny number of pretty unusual apps
(graphics processing, perhaps) will get a lot more than 10% from asm.

The performance increment in going from Ruby to C is in *many* cases a
lot more than 10%, in fact it can easily be 10,000%.
Fd22ee3cfc7dac283ce8e451af324f7d?d=identicon&s=25 Chad Perrin (Guest)
on 2006-07-26 22:15
(Received via mailing list)
On Thu, Jul 27, 2006 at 04:27:57AM +0900, Ashley Moran wrote:
> Anyway, my question really is that I thought a VM was a prerequisite
> or JIT?  Is that not the case?  And if the YARV VM is not the way to
> go, what is?

The canonical example for comparison, I suppose, is the Java VM vs. the
Perl JIT compiler.  In Java, the source is compiled to bytecode and
stored.  In Perl, the source remains in source form, and is stored as
ASCII (or whatever).  When execution happens with Java, the VM actually
interprets the bytecode.  Java bytecode is compiled for a virtual
computer system (the "virtual machine"), which then runs the code as
though it were native binary compiled for this virtual machine.  That
virtual machine is, from the perspective of the OS, an interpreter,
however.  Thus, Java is generally half-compiled and half-interpreted,
which speeds up the interpretation process.

When execution happens in Perl 5.x, on the other hand, a compiler runs
at execution time, compiling executable binary code from the source.  It
does so in stages, however, to allow for the dynamic runtime effects of
Perl to take place -- which is one reason the JIT compiler is generally
preferable to a compiler of persistent binary executables in the style
of C.  Perl is, thus, technically a compiled language, and not an
interpreted language like Ruby.

Something akin to bytecode compilation could be used to improve upon the
execution speed of Perl programs without diverging from the
JIT-compilation execution it currently uses and also without giving up
any of the dynamic runtime capabilities of Perl.  This would involve
running the first (couple of) pass(es) of the compiler to produce a
persistent binary compiled file with the dynamic elements still left in
an uncompiled form, to be JIT-compiled at execution time.  That would
probably grant the best performance available for a dynamic language,
and would avoid the overhead of a VM implementation.  It would, however,
require some pretty clever programmers to implement in a sane fashion.

I'm not entirely certain that would be appropriate for Ruby, considering
how much of the language ends up being dynamic in implementation, but it
bothers me that it doesn't even seem to be up for discussion.  In fact,
Perl is heading in the direction of a VM implementation with Perl 6,
despite the performance successes of the Perl 5.x compiler.  Rather than
improve upon an implementation that is working brilliantly, they seem
intent upon tossing it out and creating a different implementation
altogether that, as far as I can see, doesn't hold out much hope for
improvement.  I could, of course, be wrong about that, but that's how it
looks from where I'm standing.

It just looks to me like everyone's chasing VMs.  While the nontrivial
problems with Java's VM are in many cases specific to the Java VM (the
Smalltalk VMs have tended to be rather better designed, for instance),
there are still issues inherent in the VM approach as currently
envisioned, and as such it leaves sort of a bad taste in my mouth.

I think I've rambled.  I'll stop now.
Fd22ee3cfc7dac283ce8e451af324f7d?d=identicon&s=25 Chad Perrin (Guest)
on 2006-07-26 22:16
(Received via mailing list)
On Thu, Jul 27, 2006 at 04:59:13AM +0900, Francis Cianfrocca wrote:
> orders of magnitude faster. You can even try to break your computation
> up into multiple stages, and stream the intermediate results out to
> temporary files. As ugly as that sounds, it will be far faster.

One of these days, I'll actually know enough Ruby to be sure of what
language constructs work for what purposes in terms of performance.  I
rather suspect there are prettier AND better-performing options than
using temporary files to store data during computation, however.
1b5341b64f7ce0244366eae17f06c801?d=identicon&s=25 unknown (Guest)
on 2006-07-26 22:25
(Received via mailing list)
On Thu, 27 Jul 2006, Ashley Moran wrote:

> I'm late to this conversation but I've been interested in Ruby performance
> lately.  I just had to write a script to process about 1-1.5GB of CSV data

Just as a sidenote to this conversation, if you are not using FasterCSV,
take a look at it.  http://rubyforge.org/projects/fastercsv

Using it may dramatically speed your script.


Kirk Haines
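For what it's worth, FasterCSV's row-at-a-time interface (it later
became the standard CSV library in Ruby 1.9) looks roughly like this --
file and column names invented for the example:

```ruby
require "csv"  # stdlib CSV; FasterCSV offers the same foreach interface

File.write("data.csv", "id,name\n1,alpha\n2,beta\n")

# Rows are parsed and yielded one at a time rather than the whole
# file being loaded up front.
names = []
CSV.foreach("data.csv", headers: true) do |row|
  names << row["name"]
end
names  # => ["alpha", "beta"]
```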
481b8eedcc884289756246e12d1869c1?d=identicon&s=25 Francis Cianfrocca (blackhedd)
on 2006-07-26 22:31
Chad Perrin wrote:
> On Thu, Jul 27, 2006 at 04:59:13AM +0900, Francis Cianfrocca wrote:
>> orders of magnitude faster. You can even try to break your computation
>> up into multiple stages, and stream the intermediate results out to
>> temporary files. As ugly as that sounds, it will be far faster.
>
> One of these days, I'll actually know enough Ruby to be sure of what
> language constructs work for what purposes in terms of performance.  I
> rather suspect there are prettier AND better-performing options than
> using temporary files to store data during computation, however.

Ashley was talking about 1GB+ datasets, iirc. I'd love to see an
in-memory data structure (Ruby or otherwise) that can slug a few of
those around without breathing hard. And on most machines, you're going
through the disk anyway with a dataset that large, as it thrashes your
virtual-memory. So why not take advantage of the tunings that are built
into the I/O channel?

If I'm using C, I always handle datasets that big with the kernel vm
functions- generally faster than the I/O functions. I don't know how to
do that portably in Ruby (yet).
8979474815030ad4a5d59718d1905715?d=identicon&s=25 Isaac Gouy (Guest)
on 2006-07-26 22:50
(Received via mailing list)
Chad Perrin wrote:
> On Wed, Jul 26, 2006 at 11:29:06PM +0900, Ryan McGovern wrote:
-snip-
> For those keen on functional programming syntax, Haskell is a better
> choice than Java for performance: in fact, the only thing keeping
> Haskell from performing as well as C, from what I understand, is the
> current state of processor design.  Similarly, O'Caml is one of the
> fastest non-C languages available: it consistently, in a wide range of
> benchmark tests and real-world anecdotal comparisons, executes "at least
> half as quickly" as C, which is faster than it sounds.

For those keen on functional programming, Clean produces small fast
executables.

> The OP is right, though: if execution speed is your top priority, use C.
> Java is an also-ran -- what people generally mean when they say that
> Java is almost as fast as C is that a given application written in both
> C and Java "also runs in under a second" in Java, or something to that
> effect.  While that may be true, there's a significant difference
> between 0.023 seconds and 0.8 seconds (for hypothetical example).

That sounds wrong to me - I hear positive comments about Java
performance for long-running programs, not for programs that run in
under a second.
C1bcb559f87f356698cfad9f6d630235?d=identicon&s=25 Hal Fulton (Guest)
on 2006-07-26 22:57
(Received via mailing list)
Isaac Gouy wrote:
>
> That sounds wrong to me - I hear positive comments about Java
> performance for long-running programs, not for programs that run in
> under a second.
>

JIT is the key to a lot of that. Performance depends greatly on
the compiler, the JVM, the algorithm, etc.

I won a bet once from a friend. We wrote comparable programs in
Java and C++ (some arbitrary math in a loop running a bazillion
times).

With defaults on both compiles, the Java was actually *faster*
than the C++. Even I didn't expect that. But as I said, this
sort of thing is highly dependent on many different factors.


Hal
F1d37642fdaa1662ff46e4c65731e9ab?d=identicon&s=25 Charles O Nutter (Guest)
on 2006-07-26 23:03
(Received via mailing list)
On 7/26/06, Ron M <rm_rails@cheapcomplexdevices.com> wrote:
> as a coprocessor.
>
> I don't know of any good way of doing those in Java except
> by writing native extensions in C or directly with an assembler.
>

So you're saying that when writing an extension to Ruby in C you can
also manually write assembly to speed up specific aspects of your code,
and that when writing an extension in Java you'd have to manually write
assembly to speed up specific aspects of your code? The great hassle of
writing Java Native Interface code aside (which is really a one-time
cost), what exactly is the difference here?

On any platform, Java included, you can eventually call out to C code
to do some processor-specific or really performance-intensive task. Java
doesn't make it as easy as Ruby, but it also performs quite a bit better
than Ruby for most cases. It's only in rare cases that you actually need
to write native code to make a Java app perform well. However in the
Ruby world, that tends to be the stock answer...if it's not fast enough,
give up on Ruby!

I can absolutely appreciate the gains shown by moving targeted pieces
of code from Ruby to C. In those examples, Ruby's power is grossly
underutilized, so the conversion to a less feature-rich language with
less overhead makes a great deal of sense. However I would challenge the
Ruby community at large to expect more from Ruby proper before giving up
the dream of highly-performant Ruby code and plunging into the C.
A90204c955db033cd975f7bb0ec9600b?d=identicon&s=25 Ashley Moran (Guest)
on 2006-07-26 23:19
(Received via mailing list)
On Jul 26, 2006, at 9:31 pm, Francis Cianfrocca wrote:

> functions- generally faster than the I/O functions. I don't know
> how to
> do that portably in Ruby (yet).


I think the total data size is about 1.5GB, but the individual files
are smaller, the largest being a few hundred MB.  The most rows in a
file is ~15,000,000 I think.  The server I run it on has 2GB RAM (an
Athlon 3500+ running FreeBSD/amd64, so the hardware is not really an
issue)... it can get all the way through without swapping (just!)
The processing is pretty trivial, and mainly involves incrementing
some ID columns so we can merge datasets together, adding a text
column to the start of every row, and eliminating a few duplicates.
The output file is gzipped (sending the output of CSV::Writer through
GzipWriter).  I could probably rewrite it so that most files are
output a line at a time, and call out to the command line gzip.  Only
the small files *need* to be stored in RAM for duplicate removal,
others are guaranteed unique.  At the time I didn't think using RAM
would give such a huge performance hit (lesson learnt).
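That line-at-a-time rewrite is straightforward to sketch (names
invented; rows stream straight into the gzip stream instead of whole
files being buffered first):

```ruby
require "zlib"

File.write("in.csv", "x,1\ny,2\n")  # stand-in for one input file

# Transform each row and write it directly into a gzipped output,
# so only one line is in memory at a time.
Zlib::GzipWriter.open("out.csv.gz") do |gz|
  File.foreach("in.csv") do |line|
    gz.puts("merged,#{line.chomp}")  # e.g. the prepended text column
  end
end

Zlib::GzipReader.open("out.csv.gz") { |gz| gz.read }
# => "merged,x,1\nmerged,y,2\n"
```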

I might also look into Kirk's suggestion of FasterCSV.  If all this
doesn't improve things, there's always the option of going dual-core
and forking to do independent files.

However... the script can be run at night so even in its current
state it's acceptable.  It will only need serious work if we start
adding many more datasets into the routine (we're using two out of a
conceivable 4 or 5, I think).  In that case we could justify buying a
faster CPU if it got out of hand, rather than rewrite it in C.  But
that's more a reflection of hardware prices than my wages :)

I have yet to write anything in Ruby that was less than twice as fast to
code as it would have been in bourne-sh/Java/whatever, never mind
twice as fun or maintainable.  I recently rewrote an 830 line Java/
Hibernate web service client as 67 lines of Ruby, in about an hour.
With that kind of productivity, performance can go to hell!

Ashley
481b8eedcc884289756246e12d1869c1?d=identicon&s=25 Francis Cianfrocca (blackhedd)
on 2006-07-26 23:24
Charles O Nutter wrote:
> I would challenge the Ruby
> community at large to expect more from Ruby proper before giving up the
> dream of highly-performant Ruby code and plunging into the C.

Much depends on what is wanted from the language. My friends know me for
a person who will gladly walk a very long way to get an incremental
performance improvement in any program. But I don't dream of
highly-performant Ruby code. I dream of highly-scalable applications
that can work with many different kinds of data seamlessly and link
business people and their customers together in newer, faster, more
secure ways than have ever been imagined before. I want to be able to
turn almost any kind of data, wherever it is, into actionable
information and combine it flexibly with any other data. I want to be
able to simply drop any piece of new code into a network and
automatically have it start working with other components in the
(global) network. I want a language system that can gracefully and
powerfully model all of these new kinds of interactions without
requiring top-down analysis of impossibly large problem domains and
rigid program-by-contract regimes. Ruby has unique characteristics,
among all other languages that I know, that qualify it for a first
approach to my particular dream. Among these are the excellent
metaprogramming support, the open classes, the adaptability to tooling,
and (yes) the generally-acceptable performance.

If one's goal is to get a program that will take the least amount of
time to plow through some vector mathematics problem, then by all means
let's have the language-performance discussion. But to me, most of these
compute-intensive tasks are problems that have been being addressed by
smart people ever since Fortran came along. We don't necessarily need
Ruby to solve them.

We do need Ruby to solve a very different set of next-generation
problems, for which C and Java (and even Perl and Python) are very
poorly suited.
Cb48ca5059faf7409a5ab3745a964696?d=identicon&s=25 unknown (Guest)
on 2006-07-26 23:25
(Received via mailing list)
On Thu, 27 Jul 2006, Ashley Moran wrote:

> rewrite it so that most files are output a line at a time, and call out to
> datasets into the routine (we're using two out of a conceivable 4 or 5, I
> think).  In that case we could justify buying a faster CPU if it got out of
> hand, rather than rewrite it in C.  But that's more a reflection of hardware
> prices than my wages :)
>
> I have yet to write anything in Ruby that was less than twice as fast to code as
> it would have been in bourne-sh/Java/whatever, never mind twice as fun or
> maintainable.  I recently rewrote an 830 line Java/Hibernate web service
> client as 67 lines of Ruby, in about an hour.  With that kind of
> productivity, performance can go to hell!

i process tons of big csv files and use this approach:

   - parse the first line, remember cell count

   - foreach line
     - attempt parsing using simple split, iff that fails fall back to
       csv.rb methods


something like

   n_fields = nil

   f.each do |line|
     fields = line.split %r/,/
     n_fields ||= fields.size

     if fields.size != n_fields
       fields = parse_with_csv_lib line
     end

     ...
   end

this obviously won't work with csv files that have cells spanning lines,
but for simple stuff it can speed up parsing in a huge way.
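Filling in the elided pieces, a self-contained version of that fast-path/fallback loop might look like this (the fallback uses the stock csv library; `-1` on split preserves trailing empty cells):

```ruby
require 'csv'

# Fast-path CSV reader: split on commas for the common case, fall back
# to the CSV library only when the field count looks wrong (e.g. a
# quoted cell containing a comma). Cells spanning multiple lines are
# still not handled, as noted above.
def each_row(io)
  n_fields = nil
  io.each_line do |line|
    fields = line.chomp.split(',', -1)
    n_fields ||= fields.size
    fields = CSV.parse_line(line) if fields.size != n_fields
    yield fields
  end
end
```

The win comes from `String#split` being far cheaper than a full CSV parse, while correctness is preserved by re-parsing only the rows where the cheap path visibly disagrees with the expected shape.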

-a
F1d37642fdaa1662ff46e4c65731e9ab?d=identicon&s=25 Charles O Nutter (Guest)
on 2006-07-26 23:29
(Received via mailing list)
On 7/26/06, Chad Perrin <perrin@apotheon.com> wrote:
> which speeds up the interpretation process.
Half true. The Java VM could be called "half-compiled and
half-interpreted"
at runtime for only a short time, and only if you do not consider VM
bytecodes to be a valid "compiled" state. However most bytecode is very
quickly compiled into processor-native code, making those bits fully
compiled. After a long enough runtime (not very long in actuality), all
Java
code is running as native code for the target processor (with various
degrees of optimization and overhead).

The difference from AOT compilation with GCC or .NET is that Java's
compiler can make determinations based on runtime profiling about *how*
to
compile that "last mile" in the most optimal way possible. The bytecode
compilation does, as you say, primarily speed up the interpretation
process.
However it's far from the whole story, and the runtime JITing of
bytecode
into native code is where the magic lives. To miss that is to miss the
greatest single feature of the JVM.

When execution happens in Perl 5.x, on the other hand, a compiler runs
> at execution time, compiling executable binary code from the source.  It
> does so in stages, however, to allow for the dynamic runtime effects of
> Perl to take place -- which is one reason the JIT compiler is generally
> preferable to a compiler of persistent binary executables in the style
> of C.  Perl is, thus, technically a compiled language, and not an
> interpreted language like Ruby.


I am not familiar with Perl's compiler. Does it compile to
processor-native
code or to an intermediate bytecode of some kind?

We're also juggling terms pretty loosely here. A compiler converts
human-readable code into machine-readable code. If the "machine" is a
VM,
then you're fully compiling. If the VM code later gets compiled into
"real
machine" code, that's another compile cycle. Compilation isn't as cut
and
dried as you make it out to be, and claiming that, for example, Java is
"half compiled" is just plain wrong.

Something akin to bytecode compilation could be used to improve upon the
> execution speed of Perl programs without diverging from the
> JIT-compilation execution it currently uses and also without giving up
> any of the dynamic runtime capabilities of Perl.  This would involve
> running the first (couple of) pass(es) of the compiler to produce a
> persistent binary compiled file with the dynamic elements still left in
> an uncompiled form, to be JIT-compiled at execution time.  That would
> probably grant the best performance available for a dynamic language,
> and would avoid the overhead of a VM implementation.  It would, however,
> require some pretty clever programmers to implement in a sane fashion.


There are a lot of clever programmers out there.

I'm not entirely certain that would be appropriate for Ruby, considering
> how much of the language ends up being dynamic in implementation, but it
> bothers me that it doesn't even seem to be up for discussion.  In fact,
> Perl is heading in the direction of a VM implementation with Perl 6,
> despite the performance successes of the Perl 5.x compiler.  Rather than
> improve upon an implementation that is working brilliantly, they seem
> intent upon tossing it out and creating a different implementation
> altogether that, as far as I can see, doesn't hold out much hope for
> improvement.  I could, of course, be wrong about that, but that's how it
> looks from where I'm standing.


Having worked heavily on a Ruby implementation, I can say for certain
that
99% of Ruby code is static. There are some dynamic bits, especially
within
Rails where methods are juggled about like flaming swords, but even
these
dynamic bits eventually settle into mostly-static sections of code.
Compilation of Ruby code into either bytecode for a fast interpreter
engine
like YARV or into bytecode for a VM like Java is therefore perfectly
valid
and very effective. Preliminary compiler results for JRuby show a boost
of
50% performance over previous versions, and that's without optimizing
many
of the more expensive Ruby operations (call logic, block management).
Whether a VM is present (as in JRuby) or not (as may be the case with
YARV),
eliminating the overhead of per-node interpretation is a big positive.
JRuby
will also feature a JIT compiler to allow running arbitrary .rb files
directly, optimizing them as necessary and as seems valid based on
runtime
characteristics. I don't know if YARV will do the same, but it's a good
idea.

It just looks to me like everyone's chasing VMs.  While the nontrivial
> problems with Java's VM are in many cases specific to the Java VM (the
> Smalltalk VMs have tended to be rather better designed, for instance),
> there are still issues inherent in the VM approach as currently
> envisioned, and as such it leaves sort of a bad taste in my mouth.


The whole VM thing is such a small issue. Ruby itself is really just a
VM,
where its instructions are the elements in its AST. The definition of a
VM
is sufficiently vague to include most other interpreters in the
same
family. Perhaps you are specifically referring to VMs that provide a set
of
"processor-like" fine-grained operations, attempting to simulate some
sort
of magical imaginary hardware? That would describe the Java VM pretty
well,
though in actuality there are real processors that run Java bytecodes
natively as well. Whether or not a language runs on top of a VM is
irrelevant, especially considering JRuby is a mostly-compatible version
of
Ruby running on top of a VM. It matters much more that translation to
whatever underlying machine....virtual or otherwise...is as optimal and
clean as possible.
A90204c955db033cd975f7bb0ec9600b?d=identicon&s=25 Ashley Moran (Guest)
on 2006-07-26 23:35
(Received via mailing list)
On Jul 26, 2006, at 9:11 pm, Chad Perrin wrote:

> It just looks to me like everyone's chasing VMs.  While the nontrivial
> problems with Java's VM are in many cases specific to the Java VM (the
> Smalltalk VMs have tended to be rather better designed, for instance),
> there are still issues inherent in the VM approach as currently
> envisioned, and as such it leaves sort of a bad taste in my mouth.

Chad...

Just out of curiosity (since I don't know much about this subject),
what do you think of the approach Microsoft took with the CLR?  From
what I read it's very similar to the JVM except it compiles directly
to native code, and makes linking to native libraries easier.  I
assume this is closer to JVM behaviour than Perl 5 behaviour.  Is
there anything to be learnt from it for Ruby?

Ashley
Fd22ee3cfc7dac283ce8e451af324f7d?d=identicon&s=25 Chad Perrin (Guest)
on 2006-07-27 00:16
(Received via mailing list)
On Thu, Jul 27, 2006 at 06:16:25AM +0900, Ashley Moran wrote:
>
> I have yet to write anything in Ruby that was less than twice as fast to
> code as it would have been in bourne-sh/Java/whatever, never mind
> twice as fun or maintainable.  I recently rewrote an 830 line Java/
> Hibernate web service client as 67 lines of Ruby, in about an hour.
> With that kind of productivity, performance can go to hell!

With a 92% cut in code weight, I can certainly sympathize with that
sentiment.  Wow.
Fd22ee3cfc7dac283ce8e451af324f7d?d=identicon&s=25 Chad Perrin (Guest)
on 2006-07-27 00:32
(Received via mailing list)
On Thu, Jul 27, 2006 at 06:24:49AM +0900, Charles O Nutter wrote:
> >however.  Thus, Java is generally half-compiled and half-interpreted,
> >which speeds up the interpretation process.
>
>
> Half true. The Java VM could be called "half-compiled and half-interpreted"
> at runtime for only a short time, and only if you do not consider VM
> bytecodes to be a valid "compiled" state. However most bytecode is very
> quickly compiled into processor-native code, making those bits fully
> compiled. After a long enough runtime (not very long in actuality), all Java
> code is running as native code for the target processor (with various
> degrees of optimization and overhead).

True . . . but this results in fairly abysmal performance, all things
considered, for short runs.  Also, see below regarding dynamic
programming.


>
> The difference between AOT compilation with GCC or .NET is that Java's
> compiler can make determinations based on runtime profiling about *how* to
> compile that "last mile" in the most optimal way possible. The bytecode
> compilation does, as you say, primarily speed up the interpretation process.
> However it's far from the whole story, and the runtime JITing of bytecode
> into native code is where the magic lives. To miss that is to miss the
> greatest single feature of the JVM.

This also is true, but that benefit is entirely unusable for highly
dynamic code, unfortunately -- and, in fact, even bytecode compilation
might be a bit too much to ask for too-dynamic code.  I suppose it's
something for pointier heads than mine, since I'm not actually a
compiler-writer or language-designer (yet).  It's also worth noting that
this isn't accomplishing anything that isn't also accomplished by the
Perl JIT compiler.


> code or to an intermediate bytecode of some kind?
There is no intermediate bytecode step for Perl, as far as I'm aware.
It's not a question I've directly asked one of the Perl internals
maintainers, but everything I know about the Perl compiler confirms my
belief that it simply does compilation to machine code.


>
> We're also juggling terms pretty loosely here. A compiler converts
> human-readable code into machine-readable code. If the "machine" is a VM,
> then you're fully compiling. If the VM code later gets compiled into "real
> machine" code, that's another compile cycle. Compilation isn't as cut and
> dried as you make it out to be, and claiming that, for example, Java is
> "half compiled" is just plain wrong.

Let's call it "virtually compiled", then, since it's being compiled to
code that is readable by a "virtual machine" -- or, better yet, we can
call it bytecode and say that it's not fully compiled to physical
machine-readable code, which is what I was trying to explain in the
first place.


> >require some pretty clever programmers to implement in a sane fashion.
>
> There are a lot of clever programmers out there.

True, of course.  The problem is getting them to work on a given
problem.


>
> Having worked heavily on a Ruby implementation, I can say for certain that
> 99% of Ruby code is static. There are some dynamic bits, especially within
> Rails where methods are juggled about like flaming swords, but even these
> dynamic bits eventually settle into mostly-static sections of code.

I love that imagery, with the flaming sword juggling.  Thanks.


> idea.
I'm sure a VM or similar approach (and, frankly, I do prefer the
fast-interpreter approach over the VM approach) would provide ample
opportunity to improve upon Ruby's current performance, but that doesn't
necessarily mean it's better than other approaches to improving
performance.  That's where I was aiming.


> Ruby running on top of a VM. It matters much more that translation to
> whatever underlying machine....virtual or otherwise...is as optimal and
> clean as possible.

A dividing line between "interpreter" and "VM" has always seemed rather
more clear to me than you make it sound.  Yes, I do refer to a
simulation of an "imaginary" (or, more to the point, "virtual") machine,
as opposed to a process that interprets code.  Oh, wait, there's that
really, really obvious dividing line I keep seeing.

The use (or lack) of a VM does indeed matter: it's an implementation
detail, and implementation details make a rather significant difference
in performance.  The ability of the parser to quickly execute what's fed
to it is important, as you indicate, but so too is the ability of the
parser to run quickly itself -- unless your program is actually compiled
to machine-native code for the hardware, in which case the lack of need
for the parser to execute at all at runtime is significant.
Fd22ee3cfc7dac283ce8e451af324f7d?d=identicon&s=25 Chad Perrin (Guest)
on 2006-07-27 00:42
(Received via mailing list)
On Thu, Jul 27, 2006 at 06:33:08AM +0900, Ashley Moran wrote:
>
> Just out of curiosity (since I don't know much about this subject),
> what do you think of the approach Microsoft took with the CLR?  From
> what I read it's very similar to the JVM except it compiles directly
> to native code, and makes linking to native libraries easier.  I
> assume this is closer to JVM behaviour than Perl 5 behaviour.  Is
> there anything to be learnt from it for Ruby?

I'm not as familiar with what's going on under the hood of the CLR as
the JVM, but from what I do know it exhibits both advantages and
disadvantages in comparison with the Java VM.  Thus far, the evidence
seems to be leaning in the direction of the CLR's advantages over the
JVM coming into play more often than the disadvantages, however, which
seems to indicate that the compromises that were made may have been the
"right" compromises, as far as this comparison goes.

In fact, the CLR seems in some ways to be a compromise between
Perl-style JIT compilation and Java-style bytecode compilation with
runtime VM-interpretation (there really needs to be a term for what a VM
does separate from either compilation or interpretation, since what it
does generally isn't strictly either of them).  There may well be
something to learn from that for future Ruby implementations, though I'd
warn away from trying to take the "all languages compile to the same
intermediate bytecode" approach that the CLR takes -- it tries to be too
many things at once, basically, and ends up introducing some
inefficiencies in that sense.  If you want to do everything CLR does,
with Ruby, then port Ruby to the CLR, but if you want to simply gain
performance benefits from studying up on the CLR, make sure you
cherry-pick the bits that are relevant to the task at hand.

I think Ruby would probably best benefit from something somewhere
between the Perl compiler's behavior and the CLR compiler.
Specifically, compile all the static algorithm behavior in your code to
something persistent, link in all the rest as uncompiled (though perhaps
parse-tree compiled, which is almost but not quite the same as bytecode
compiled) code, and let that be machine-code compiled at runtime.  This
might even conceivably be broken into two separate compilers to minimize
the last-stage compiler size needed on client systems and to optimize
each part to operate as quickly as possible.

Run all this stuff past a true expert before charging off to implement
it, of course.  I'm an armchair theorist.
F1d37642fdaa1662ff46e4c65731e9ab?d=identicon&s=25 Charles O Nutter (Guest)
on 2006-07-27 00:57
(Received via mailing list)
On 7/26/06, Francis Cianfrocca <garbagecat10@gmail.com> wrote:
>
> We do need Ruby to solve a very different set of next-generation
> problems, for which C and Java (and even Perl and Python) are very
> poorly suited.


I agree, Francis, and I'd add that exactly those areas where people seem
to
frequently have performance concerns are areas where Ruby's best
features
are practically ignored. Doing large-scale vector transformations does
not
require the unique ability to override Fixnum operations or treat
numbers as
objects, and so the benefits of those features are completely wasted
while
still bringing along their own baggage. Obviously as a JRuby developer
I'm
advocating using the right tool for the job, be it Java or Ruby or C. I
also
know that Ruby can do better, and I'm hoping we'll see improvements
sooner
rather than later.
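The "override Fixnum operations" feature Charles mentions is Ruby's open classes applied to numbers: every number is an object whose operators are ordinary, redefinable methods. A minimal sketch (note that Fixnum was later merged into Integer in Ruby 2.4, so reopening Integer is the modern equivalent):

```ruby
# Open classes let you add or replace methods on built-in numeric types.
# This flexibility is exactly what makes numeric code hard to compile to
# raw machine arithmetic: the implementation cannot assume `+` or any
# other operator is still the primitive one.
class Integer
  def double_plus_one
    self * 2 + 1
  end
end
```

Every integer in the program, including literals, now responds to the new method, which is the power (and the cost) being debated here.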
0126d6f1de3c3309cabeba6cd5b033c9?d=identicon&s=25 Csaba Henk (Guest)
on 2006-07-27 01:28
(Received via mailing list)
On 2006-07-26, Sean O'Halpin <sean.ohalpin@gmail.com> wrote:
> implemented. For example, Ruby has real closures, Python doesn't. I

Even if OT, just for the sake of correctness: let me remark that Python
does have closures. Local functions (ones defined within another
function's body) are scoped lexically.

It's just sort of an anti-POLA (and inconvenient, as-is) piece of
semantics that variables get reinitialized upon assignment.

Hence:

def foo():
    x = 5
    def bar():
        x = 6
        return x
    bar()
    return x, bar

x, bar = foo()
print x, bar() ==> 5 6

def foo():
    _x = [5]
    def bar():
        _x[0] = 6
        return _x[0]
    bar()
    return _x[0], bar

x, bar = foo()
print x, bar()  ==> 6 6


Regards,
Csaba
Fd22ee3cfc7dac283ce8e451af324f7d?d=identicon&s=25 Chad Perrin (Guest)
on 2006-07-27 01:35
(Received via mailing list)
On Thu, Jul 27, 2006 at 08:25:11AM +0900, Csaba Henk wrote:
> Hence:
> print x, bar() ==> 5 6
> print x, bar()  ==> 6 6
>

Is it just me, or are there no proper closures in that example code?
E1d641bfe4071a5413bac781f06d3fd1?d=identicon&s=25 Sean O'halpin (sean)
on 2006-07-27 01:38
(Received via mailing list)
On 7/26/06, Chad Perrin <perrin@apotheon.com> wrote:
> On Thu, Jul 27, 2006 at 12:23:23AM +0900, Charles O Nutter wrote:
> >
> > You're mixing language semantics and implementation details here.
[snip]
> In some ways, you're right: implementation details are being mixed up
> with language definition in the preceding list of features.
[snip]

Nope - there's no mix up. My point is that any feature of a language
that requires extra work
whether it be at compile time or run time incurs a cost. Those
features are generally there to make life easier for us programmers,
not the machine. The only way to make sure you're not paying that
price is to hand-code optimised machine code for a specific processor
and hardware context. No language translator can guarantee that it
will produce better code (regardless of the nonsense of the 'perfect
compiler').

Let me take two examples from my list. First, method lookup in OO
languages. There is no
way you can optimise this across the board to static jumps in a
language like Ruby or Smalltalk. There will always be the requirement
(imposed by the ~semantics~ of the language) to be able to find the
right method at runtime. This is part of the language design which
imposes constraints on the implementation that assembly languages (for
example) do not have to pay. There is a cost for abstraction (a cost
which I am willing to pay by the way). Of course, you can implement
virtual methods in assembly, but you don't ~have~ to. In Ruby there is
no choice. Everything is an object. (You can optimise most of it away,
but not all).
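Sean's point about method lookup can be seen in a few lines: the method bound to a name can change while the program runs, so a call site can never be compiled down to a static jump, only cached and re-checked (this toy example is mine, not Sean's):

```ruby
# The same call site (g.greet) must find whichever definition is current
# at the moment of the call, which is why lookup stays dynamic.
class Greeter
  def greet; "hello"; end
end

g = Greeter.new
first = g.greet

# Redefine the method at runtime via an open class.
class Greeter
  def greet; "bonjour"; end
end

second = g.greet
```

Implementations mitigate the cost with inline caches keyed on the receiver's class, but the semantics forbid eliminating the lookup entirely.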

Second, closures:

Chad Perrin said:
> A little extra memory usage does not translate directly to performance loss.

Coming from a background where I had to make everything happen in 48K,
I have to disagree. And it's not always just 'a little extra memory
usage'. Careless use of closures can cripple an application. See the
problems the Rails team encountered.

Charles - you say that closures are explicit - I beg to differ. By
definition, they are surely implicit. Doesn't your argument that they
can be simulated by other means contradict your statement?

As for the notion that a hardware YARV processor would make a
difference - how would that ameliorate the issues Ruby has with memory
usage? Performance isn't just about time - space also matters.

I am surprised that you think I am confusing language features with
implementation details. From my point of view, it is you who are
ignoring the fact that abstractions incur a cost.

Best regards (I'm enjoying this discussion BTW :)
Sean
E1d641bfe4071a5413bac781f06d3fd1?d=identicon&s=25 Sean O'halpin (sean)
on 2006-07-27 01:44
(Received via mailing list)
On 7/27/06, Chad Perrin <perrin@apotheon.com> wrote:
> >
> > x, bar = foo()
> > x, bar = foo()
> > print x, bar()  ==> 6 6
> >
>
> Is it just me, or are there no proper closures in that example code?
>

I've crossed my eyes twice and still can't see it ;)
Fd22ee3cfc7dac283ce8e451af324f7d?d=identicon&s=25 Chad Perrin (Guest)
on 2006-07-27 01:51
(Received via mailing list)
On Thu, Jul 27, 2006 at 08:35:47AM +0900, Sean O'Halpin wrote:
>
> Chad Perrin said:
> >A little extra memory usage does not translate directly to performance
> >loss.
>
> Coming from a background where I had to make everything happen in 48K,
> I have to disagree. And it's not always just 'a little extra memory
> usage'. Careless use of closures can cripple an application. See the
> problems the Rails team encountered.

Careless use of ANYTHING can cripple an application.  Using an extra
kilobyte of memory on a 1GB system for a closure instance or two is not
indicative of an inescapable performance cost for the mere fact of the
existence of closures.  While your point about 48K of RAM is well-taken,
it's pretty much inapplicable here: I wouldn't be writing Ruby programs
to run in 48K.  Hell, my operating system won't run in 48K, nor even a
hundred times that (I'm using Debian GNU/Linux Etch/Testing, if that
matters).  I'm sure as heck not going to expect all my applications to
run in that environment.

Careless use of pointers can cripple not only the application, but the
whole system.  Careless use of loops can crash it.  Careless use of the
power cord can destroy the hardware.  In the grand scheme of things,
closures are not a good example of a language feature that hinders
performance when we're talking about high-level languages such as Ruby.
Fd22ee3cfc7dac283ce8e451af324f7d?d=identicon&s=25 Chad Perrin (Guest)
on 2006-07-27 01:54
(Received via mailing list)
On Thu, Jul 27, 2006 at 08:42:11AM +0900, Sean O'Halpin wrote:
> >
> >Is it just me, or are there no proper closures in that example code?
>
> I've crossed my eyes twice and still can't see it ;)

Remember what your mother said: if you keep doing that, your eyes might
stick that way.  Be careful.
E698f564ac90c4c248f1f678caafd624?d=identicon&s=25 Keith Gaughan (Guest)
on 2006-07-27 01:54
(Received via mailing list)
On Thu, Jul 27, 2006 at 08:33:31AM +0900, Chad Perrin wrote:
> >
> > x, bar = foo()
> > x, bar = foo()
> > print x, bar()  ==> 6 6
> >
>
> Is it just me, or are there no proper closures in that example code?

No, Chad. There's closures in there. What you're not seeing is
anonymous functions, but closures are not the same as anonymous
functions.
Fd22ee3cfc7dac283ce8e451af324f7d?d=identicon&s=25 Chad Perrin (Guest)
on 2006-07-27 02:06
(Received via mailing list)
On Thu, Jul 27, 2006 at 08:53:18AM +0900, Keith Gaughan wrote:
> On Thu, Jul 27, 2006 at 08:33:31AM +0900, Chad Perrin wrote:
> >
> > Is it just me, or are there no proper closures in that example code?
>
> No, Chad. There's closures in there. What you're not seeing is
> anonymous functions, but closures are not the same as anonymous
> functions.

Maybe I'm missing something critical about Python, but I don't see the
persistent code construct being passed from the function when its
lexical scope (assuming it's truly lexical, which it might not be in
this case) closes.  It's only a closure if there's something persistent
that was "closed" by the scope closing.
E698f564ac90c4c248f1f678caafd624?d=identicon&s=25 Keith Gaughan (Guest)
on 2006-07-27 02:12
(Received via mailing list)
On Thu, Jul 27, 2006 at 08:53:18AM +0900, Keith Gaughan wrote:

> On Thu, Jul 27, 2006 at 08:33:31AM +0900, Chad Perrin wrote:
>
> > Is it just me, or are there no proper closures in that example code?
>
> No, Chad. There's closures in there. What you're not seeing is
> anonymous functions, but closures are not the same as anonymous
> functions.

Clarification: like Java, Python won't let you assign to variables in
the outer scope, so you have to use an array or some other object to
hack around that if you need that functionality. I know, it sucks, but
the fact it doesn't allow you to assign to an outer scope doesn't stop
Python from having closures, just that it doesn't trust the developer
not to screw things up.

Here's a better example:

def foo():
    x = [0]
    def bar():
        x[0] += 1
        print x[0]
    return bar

baz = foo()
baz()           -> 1
baz()           -> 2
baz()           -> 3

Of course, this is better implemented as a generator.
8b7494f383107ad4673ce5ac82d65876?d=identicon&s=25 Thomas E Enebo (Guest)
on 2006-07-27 02:16
(Received via mailing list)
On Thu, 27 Jul 2006, Chad Perrin defenestrated me:
> >
> > I am not familiar with Perl's compiler. Does it compile to processor-native
> > code or to an intermediate bytecode of some kind?
>
> There is no intermediate bytecode step for Perl, as far as I'm aware.
> It's not a question I've directly asked one of the Perl internals
> maintainers, but everything I know about the Perl compiler confirms my
> belief that it simply does compilation to machine code.

  I have not been on the perl train for years, but I believe for Perl5
at least this is not true.  I remember Malcolm Beattie's B module which
basically exposed the intermediate bytecodes that perl normally
interprets.
That was some time ago and things may have changed?

  Here is some documentation on this (it could be old but it seems to
match my memory):

http://www.faqs.org/docs/perl5int/c163.html

  So it looks like Perl is somewhat similar to Java (perhaps the other
way around since Perl's interpreter pre-dates Java).  An analogy of the
difference would be that Perl is CISC and Java is RISC since Perl
bytecode
is higher level.  Maybe they JIT pieces?

-Tom
8d16869783573d7ca80a676b65cf98e7?d=identicon&s=25 David Pollak (Guest)
on 2006-07-27 02:28
(Received via mailing list)
On 7/26/06, Chad Perrin <perrin@apotheon.com> wrote:
> On Thu, Jul 27, 2006 at 12:42:46AM +0900, David Pollak wrote:
>
>
> >
> > There are some applications that will never perform as in Java (e.g.,
> > stuff that's heavily oriented to bit manipulation.)  But for many
> > classes of applications (e.g., spreadsheets) Java can perform as well
> > as C.
>
> Is that heavily optimized Java vs. "normal" (untweaked) C?

No.  That's heavily optimized Java vs. heavily optimized C.  I spent a
fair amount of time chatting with the Excel team a while back.  They
cared as much about performance as I did.  They spent a lot more time
and money optimizing Excel than I did with Integer.  They had far more
in terms of tools and access to compiler tools than I did (although
Sun was very helpful to me.)

What was at stake was not someone's desktop spreadsheet, but was the
financial trader's desk.  Financial traders move millions (and
sometimes billions) of Dollars, Euros, etc. through their spreadsheets
every day.  A 5 or 10 second advantage in calculating a spreadsheet
could mean a significant profit for a trading firm.

So, I am comparing apples to apples.  A Java program can be optimized
to perform as well as a C program for *certain* tasks.
Fd22ee3cfc7dac283ce8e451af324f7d?d=identicon&s=25 Chad Perrin (Guest)
on 2006-07-27 03:09
(Received via mailing list)
On Thu, Jul 27, 2006 at 09:26:45AM +0900, David Pollak wrote:
> >Is that heavily optimized Java vs. "normal" (untweaked) C?
>
> No.  That's heavily optimized Java vs. heavily optimized C.  I spent a
> fair amount of time chatting with the Excel team a while back.  They
> cared as much about performance as I did.  They spent a lot more time
> and money optimizing Excel than I did with Integer.  They had far more
> in terms of tools and access to compiler tools than I did (although
> Sun was very helpful to me.)

Excel isn't a very good point of comparison for C.  For one thing, it's
not C -- it's C++.  For another, it has a pretty bad case of featuritis.
Fd22ee3cfc7dac283ce8e451af324f7d?d=identicon&s=25 Chad Perrin (Guest)
on 2006-07-27 03:24
(Received via mailing list)
On Thu, Jul 27, 2006 at 09:14:49AM +0900, Thomas E Enebo wrote:
> That was some time ago and things may have changed?
>
>   Here is some documentation on this (it could be old but it seems to
> match my memory):
>
> http://www.faqs.org/docs/perl5int/c163.html
>
>   So it looks like Perl is somewhat similiar to Java (perhaps the other
> way around since Perl's interpreter pre-dates Java).  An analogy of the
> difference would be that Perl is CISC and Java is RISC since Perl bytecode
> is higher level.  Maybe they JIT pieces?

I believe you are correct, with regard to an intermediate code step,
after all.  I've done some research on it to refresh my memory.  Whether
it continues to compile to a machine-readable executable or interprets
the intermediate code form is something I haven't been able to nail down
yet.  I'll keep looking.  Apparently, I was wrong somewhere along the
way.
E698f564ac90c4c248f1f678caafd624?d=identicon&s=25 Keith Gaughan (Guest)
on 2006-07-27 03:37
(Received via mailing list)
On Thu, Jul 27, 2006 at 10:07:00AM +0900, Chad Perrin wrote:

> Excel isn't a very good point of comparison for C.  For one thing, it's
> not C -- it's C++.  For another, it has a pretty bad case of featuritis.

Actually, the part that counts, the calculation engine, comes in two
varieties: a slow but provably correct version, and a fast, highly
optimised version, a significant portion of which is written in
_assembly language_. MS use a battery of regression tests to ensure that
the fast one always gives the same results as the slow one.
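Keith's two-engine setup is an instance of differential testing: keep a slow but obviously correct reference, and regression-test the optimised version against it on a battery of inputs. A minimal Ruby sketch of the pattern (the sum-of-squares functions are invented for illustration; nothing here is from Excel):

```ruby
# Slow but obviously correct reference implementation.
def sum_of_squares_reference(nums)
  total = 0
  nums.each { |n| total += n * n }
  total
end

# "Optimised" variant whose output must always match the reference.
def sum_of_squares_fast(nums)
  nums.inject(0) { |acc, n| acc + n * n }
end

# Battery of regression cases: any disagreement is a bug in the
# fast version (or, occasionally, in the reference).
[[], [1], [1, 2, 3], (1..100).to_a].each do |input|
  expected = sum_of_squares_reference(input)
  actual   = sum_of_squares_fast(input)
  raise "mismatch on #{input.inspect}" unless expected == actual
end
```

The same structure scales up: generate inputs randomly, run both engines, and fail loudly on the first divergence.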

Just because the bits that aren't performance sensitive are written in
C++ doesn't mean that the rest of it is slow and bloated.

K.
Fd22ee3cfc7dac283ce8e451af324f7d?d=identicon&s=25 Chad Perrin (Guest)
on 2006-07-27 03:52
(Received via mailing list)
On Thu, Jul 27, 2006 at 10:35:52AM +0900, Keith Gaughan wrote:
>
> Actually, the part that counts, calculation engine, comes in two
> varieties: a slow but provably correct version, and a fast, highly
> optimised version, a significant portion of which is written _assembly
> language_. MS use a battery of regression tests to ensure that the fast
> one always gives the same results as the slow one.

That might be the "part that counts" (nice pun) for calculation, but
it's not the only part that counts.  Interface rendering, interactive
operations, and so on are also fairly important performance-wise -- at
least to the user.  In fact, calculation waits can be easier to overlook
as a user than waits for the application to catch up when you click on a
button.

On the other hand, if we were specifically referring to things like
column calculation speed (of which I wasn't strictly aware), then your
point is made.
E698f564ac90c4c248f1f678caafd624?d=identicon&s=25 Keith Gaughan (Guest)
on 2006-07-27 04:20
(Received via mailing list)
On Thu, Jul 27, 2006 at 10:50:32AM +0900, Chad Perrin wrote:

> On Thu, Jul 27, 2006 at 10:35:52AM +0900, Keith Gaughan wrote:
>
> > Actually, the part that counts, calculation engine, comes in two
> > varieties: a slow but provably correct version, and a fast, highly
> > optimised version, a significant portion of which is written _assembly
> > language_. MS use a battery of regression tests to ensure that the fast
> > one always gives the same results as the slow one.

> That might be the "part that counts" (nice pun) for calculation, but
> it's not the only part that counts.

As far as Excel goes, it is. It's the single biggest time sink in the
application.

> Interface rendering, interactive
> operations, and so on are also fairly important performance-wise

...most of which is down to Windows itself, not Excel. Excel's
contribution to that lag isn't, I believe, all that great. So in this
regard, your complaint is more to do with GDI and so on than with Excel
itself.

> least to the user.  In fact, calculation waits can be easier to overlook
> as a user than waits for the application to catch up when you click on a
> button.

Two points:

 1. As far as I know, Excel runs its interface on one thread and the
    calculation engine on another. This helps give the appearance of
    Excel being snappier than it actually is: you're able to work on the
    spreadsheet while it's recalculating cells.

 2. On simple spreadsheets, the lag isn't noticeable. But Excel is
    designed to be able to handle big spreadsheets well. That's why so
    much work is put into the calculation engine rather than having an
    interface which is completely fat free: in time-critical
    applications, it's the calculation engine that really matters.

I use Excel a lot, and have for a few years now. Grudgingly, mind you,
because I dislike having to deal with spreadsheets. But as far as MS
applications go, I think your accusations of slowness and bloat are a
little off the mark and better targeted towards its fellow MS Office
software.

Where Excel *does* fall down in terms of speed is disc I/O. There it can
be atrociously slow.

> On the other hand, if we were specifically referring to things like
> column calculation speed (of which I wasn't strictly aware), then your
> point is made.

Recalculating a spreadsheet is something more than just calculating
columns. Excel itself is a Turing-complete dataflow machine. Getting
something like that which is both correct *and* fast is hard.
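To make the "Turing-complete dataflow machine" point concrete, here is a toy Ruby cell model (purely illustrative, and nothing like Excel's real engine): each cell is either a constant or a formula over other cells, and formula cells recompute when read.

```ruby
# A toy spreadsheet cell: either a constant value or a formula
# (a block) computed from other cells.
class Cell
  def initialize(value = nil, &formula)
    @value = value
    @formula = formula
  end

  # Formula cells recompute on every read; constants just return.
  def value
    @formula ? @formula.call : @value
  end

  def set(value)
    @value = value
  end
end

a = Cell.new(2)
b = Cell.new(3)
c = Cell.new { a.value + b.value }  # c depends on a and b

c.value    # => 5
a.set(10)
c.value    # => 13
```

Recomputing everything on every read is the "slow but correct" extreme; the hard part a real engine solves is caching values and propagating changes through the dependency graph so only the affected cells recalculate.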

K.
Fd22ee3cfc7dac283ce8e451af324f7d?d=identicon&s=25 Chad Perrin (Guest)
on 2006-07-27 04:29
(Received via mailing list)
On Thu, Jul 27, 2006 at 09:09:43AM +0900, Keith Gaughan wrote:
> Clarification: like Java, Python won't let you assign to variables in
> an enclosing scope, but you can mutate the objects they refer to:
>
> def foo():
>     x = [0]
>     def bar():
>         x[0] += 1
>         print x[0]
>     return bar
>
> baz = foo()
> baz()           -> 1
> baz()           -> 2
> baz()           -> 3

That's a bit clearer -- and it does look like a proper closure.  It also
looks unnecessarily complex to implement.  Thanks for the example.
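For comparison, the equivalent counter in Ruby needs no one-element-array workaround, because Ruby closures can assign directly to variables in the enclosing scope:

```ruby
# A Ruby closure can rebind the outer local x directly,
# so no mutable container is needed.
def foo
  x = 0
  lambda { x += 1 }
end

baz = foo
baz.call   # => 1
baz.call   # => 2
baz.call   # => 3
```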
Fd22ee3cfc7dac283ce8e451af324f7d?d=identicon&s=25 Chad Perrin (Guest)
on 2006-07-27 04:38
(Received via mailing list)
On Thu, Jul 27, 2006 at 11:17:44AM +0900, Keith Gaughan wrote:
> On Thu, Jul 27, 2006 at 10:50:32AM +0900, Chad Perrin wrote:
>
> > That might be the "part that counts" (nice pun) for calculation, but
> > it's not the only part that counts.
>
> As far as Excel goes, it is. It's the single biggest time sink in the
> application.

. . .
I'll put it this way: it's not the only part that counts for me, and for
other spreadsheet users with whom I've discussed Excel in the past.


>
> > Interface rendering, interactive
> > operations, and so on are also fairly important performance-wise
>
> ...most of which is down to Windows itself, not Excel. Excel's
> contribution to that lag isn't, I believe, all that great. So in this
> regard, your complaint is more to do with GDI and so on than with Excel
> itself.

Excel doesn't run so well on Linux, so I'll just leave that lying where
it is.


>     spreadsheet while it's recalculating cells.
>
>  2. On simple spreadsheets, the lag isn't noticeable. But Excel is
>     designed to be able to handle big spreadsheets well. That's why so
>     much work is put into the calculation engine rather than having an
>     interface which is completely fat free: in time-critical
>     applications, it's the calculation engine that really matters.

. . . and yet, the interface being fat-free would be awfully nice.
Instead, it gets fatter with every new version.


>
> I use Excel a lot, and have for a few years now. Grudgingly, mind you,
> because I dislike having to deal with spreadsheets. But as far as MS
> applications go, I think your accusations of slowness and bloat are a
> little off the mark and better targeted towards its fellow MS Office
> software.

It's true that other MS Office applications are worse.  That doesn't
make Excel perfect.


>
> Where Excel *does* fall down in turns of speed is disc I/O. There it can
> be atrociously slow.

I certainly won't disagree with that.  Disk access seems to be another
problem -- though it's easier to overlook than some other issues, once a
spreadsheet is loaded and before it needs to be saved again.


>
> > On the other hand, if we were specifically referring to things like
> > column calculation speed (of which I wasn't strictly aware), then your
> > point is made.
>
> Recalculating a spreadsheet is something more that just calculating
> columns. Excel itself is a Turing-complete dataflow machine. Getting
> something like that which is both correct *and* fast is hard.

I don't particularly see how that contradicts what I said.  I may have
been more flippant in my reference to calculations than you'd like, but
I didn't say anything inaccurate.
E698f564ac90c4c248f1f678caafd624?d=identicon&s=25 Keith Gaughan (Guest)
on 2006-07-27 05:20
(Received via mailing list)
On Thu, Jul 27, 2006 at 11:38:34AM +0900, Chad Perrin wrote:

> > ...most of which is down to Windows itself, not Excel. Excel's
> > contribution to that lag isn't, I believe, all that great. So in this
> > regard, your complaint is more to do with GDI and so on than with Excel
> > itself.
>
> Excel doesn't run so well on Linux, so I'll just leave that lying where
> it is.

In fairness, if you're judging its performance based on running it in
Wine, that's not really a fair comparison. :-)

K.
E698f564ac90c4c248f1f678caafd624?d=identicon&s=25 Keith Gaughan (Guest)
on 2006-07-27 05:27
(Received via mailing list)
On Thu, Jul 27, 2006 at 11:28:20AM +0900, Chad Perrin wrote:

> > baz()           -> 1
> > baz()           -> 2
> > baz()           -> 3
>
> That's a bit clearer -- and it does look like a proper closure.  It also
> looks unnecessarily complex to implement.  Thanks for the example.

In practice, it's not really all that bad. In most of the places where
I'd end up using closures in, say, Ruby and JavaScript, I'd end up using
generators, list comprehensions, &c. in Python. Having to name the inner
function's a bit of a pain, but generally I don't end up assigning to
the variables in the outer scope anyway, so that's not such a big deal
either.

Different strokes, and all that.

K.
481b8eedcc884289756246e12d1869c1?d=identicon&s=25 Francis Cianfrocca (blackhedd)
on 2006-07-27 06:23
Ashley Moran wrote:
> I think the total data size is about 1.5GB, but the individual files
> are smaller, the largest being a few hundred MB.  The most rows in a
> file is ~15,000,000 I think.  The server I run it on has 2GB RAM (an
> Athlon 3500+ running FreeBSD/amd64, so the hardware is not really an
> issue)... it can get all the way through without swapping (just!)

The problem isn't the raw size of the dataset. It's the number of
objects you create and the amount of garbage that has to be cleaned up.
If you're clever about how you write, you can help Ruby by not creating
so much garbage. That can give a huge benefit.


>
> The processing is pretty trivial, and mainly involves incrementing
> some ID columns so we can merge datasets together, adding a text
> column to the start of every row, and eliminating a few duplicates.

Eliminating the dupes is the only scary thing I've seen here. What's the
absolute smallest piece of data that you need to look at in order to
distinguish a dupe? (If it's the whole line, then the answer is 16
bytes -- the length of the MD5 hash ;-)) That's the critical working set.
If you can't get the Ruby version fast enough, it's cheap and easy to
sort through 15,000,000 of them in C. Then one pass through the sorted
set finds your dupes. I've never found a consistently-fastest performer
among Ruby's several different ways of storing sorted sets.
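The MD5 trick can be sketched in a few lines of Ruby (illustrative only; Francis's actual suggestion was to do the heavy sorting in C). The working set is one 16-byte digest per distinct line, not the lines themselves:

```ruby
require 'digest/md5'

# Return the input lines with duplicates removed, tracking only
# 16-byte MD5 digests of lines already seen.
def unique_lines(lines)
  seen = {}
  result = []
  lines.each do |line|
    digest = Digest::MD5.digest(line)
    next if seen[digest]
    seen[digest] = true
    result << line
  end
  result
end

unique_lines(["a,1", "b,2", "a,1"])  # => ["a,1", "b,2"]
```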

Make sure that your inner loop doesn't allocate any new variables,
especially arrays -- declare them outside your inner loop and re-use
them with Array#clear.
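The reuse pattern looks like this in practice (a hypothetical inner loop, not Ashley's actual code): one scratch array is allocated once and cleared each iteration, instead of a fresh array per row becoming garbage.

```ruby
# Reuse a single scratch array across iterations rather than
# allocating a new one on every pass through the inner loop.
scratch = []
totals = []
data = [[1, 2, 3], [4, 5, 6]]
data.each do |row|
  scratch.clear
  row.each { |n| scratch << n * n }
  totals << scratch.inject(0) { |a, b| a + b }
end
totals  # => [14, 77]
```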

> doesn't improve things, there's always the option of going dual-core
> and forking to do independent files.

Obviously I haven't seen your code or your data, but if the Ruby app is
memory-bus-bound, then this approach may make your problem worse, not
better.

Good luck. I recently got a Ruby program that aggregates several LDAP
directory-pulls with about a million entries down from a few hours to a
few seconds, without having to drop into C. It can be done, and it's
kindof fun too.
Fd22ee3cfc7dac283ce8e451af324f7d?d=identicon&s=25 Chad Perrin (Guest)
on 2006-07-27 07:01
(Received via mailing list)
On Thu, Jul 27, 2006 at 12:19:24PM +0900, Keith Gaughan wrote:
> In fairness, if you're judging its performance based on running it in
> Wine, that's not really a fair comparison. :-)

I'm judging it based on running it on Windows.  My point is that
divorcing it from the only environment in which it runs (natively) is
less than strictly sporting of you, when trying to discuss its
performance characteristics (or lack thereof).
2ee1a7960cc761a6e92efb5000c0f2c9?d=identicon&s=25 William James (Guest)
on 2006-07-27 07:23
(Received via mailing list)
Kroeger, Simon (ext) wrote:
>
> > real    473m45.370s
> >
> $size = (ARGV.shift || 5).to_i
> $latins = []
> $latins.each do |latin|
>   $perms.each do |perm|
>     perm.each{|p| puts $out[latin[p]]}
>     puts
>   end
> end
> -----------------------------------------------------------
> (does someone has a nicer/even faster version?)

Here's a much slower version that has no 'require'.

Wd = ARGV.pop.to_i
$board = []

# Generate all possible valid rows.
Rows = (0...Wd**Wd).map{|n| n.to_s(Wd).rjust(Wd,'0')}.
  reject{|s| s=~/(.).*\1/}.map{|s| s.split(//).map{|n| n.to_i + 1} }

# True if no column among the first n+1 rows repeats a value.
def check( ary, n )
  ary[0,n+1].transpose.all?{|x| x.size == x.uniq.size }
end

# Depth-first search: try each candidate row in position row_num,
# recursing only when the partial grid is still column-valid.
def add_a_row( row_num )
  if Wd == row_num
    puts $board.map{|row| row.join}.join(':')
  else
    Rows.size.times { |i|
      $board[row_num] = Rows[i]
      if check( $board, row_num )
        add_a_row( row_num + 1 )
      end
    }
  end
end

add_a_row 0
E698f564ac90c4c248f1f678caafd624?d=identicon&s=25 Keith Gaughan (Guest)
on 2006-07-27 07:29
(Received via mailing list)
On Thu, Jul 27, 2006 at 01:59:37PM +0900, Chad Perrin wrote:
> >
> > In fairness, if you're judging its performance based on running it in
> > Wine, that's not really a fair comparison. :-)
>
> I'm judging it based on running it on Windows.  My point is that
> divorcing it from the only environment in which it runs (natively) is
> less than strictly sporting of you, when trying to discuss its
> performance characteristics (or lack thereof).

Wait... I did no such thing. All I said was that what interface
sluggishness you get from Excel can't be blamed on Excel. They're
performance characteristics that *can* be divorced from Excel (because
they're Windows' own performance characteristics, not Excel's). Argue
those points, and you're arguing about the wrong software.

But Wine is an emulator, and while it does a good job approaching the
speed of Windows, it doesn't hit it, nor can it hit it. You're not
comparing like with like. Now that's far from sporting.

Your argument is disingenuous. Consider Cygwin running on Windows
compared to FreeBSD running on the same machine. I can make this
comparison because the machine I'm currently using dual-boots such a
setup. I run many of the same applications under Cygwin as I do under
FreeBSD on the same box. Those same applications running under Cygwin
are noticeably slower than the native equivalents under FreeBSD. Do I
blame the software I'm running under Cygwin for being slow? No, because
I'm well aware that it zips along in its native environment. Do I blame
Cygwin? No, because it does an awful lot of work to trick the software
running under it into thinking it's running on a *nix. Do I blame
Windows? No, because I use some of that software -- gcc being an
example -- natively under Windows and it performs just as well as when
it's run natively under FreeBSD. Bringing Wine in is a red herring.
Software cannot be blamed for the environment it's executed in.

K.
Fd22ee3cfc7dac283ce8e451af324f7d?d=identicon&s=25 Chad Perrin (Guest)
on 2006-07-27 07:45
(Received via mailing list)
On Thu, Jul 27, 2006 at 02:26:37PM +0900, Keith Gaughan wrote:
> they're Windows' own performance characteristics, not Excel's). Argue
> those points, and you're arguing about the wrong software.

Design decisions that involve interfacing with interface software that
sucks are related to the software under discussion -- and not all of the
interface is entirely delegated to Windows, either.  No software can be
evaluated for its performance characteristics separate from its
environment except insofar as it runs without that environment.


>
> But Wine is an emulator, and while it does a good job approaching the
> speed of Windows, it doesn't hit it, nor can it hit it. You're not
> comparing like with like. Now that's far from sporting.

Actually, no, it's not an emulator.  It's a set of libraries (or a
single library -- I'm a little sketchy on the details) that provides the
same API as Windows software finds in a Windows environment.  An
emulator actually creates a faux/copy version of the environment it's
emulating.  Wine is to Windows as Linux is to Unix, while an actual
emulator is more like Cygwin compared with Unix: one is a differing
implementation and the other is an emulator.

. . . and, in fact, there are things that run faster via Wine on Linux
than natively on Windows.


[ snip ]
> under FreeBSD. Bringing Wine in is a red herring. Software cannot be
> blamed for the environment it's executed in.

I didn't bring it up.  You did.  I made a comment about Excel not
working in Linux as a bit of a joke, attempting to make the point that
saying Excel performance can be evaluated separately from its dependence
on Windows doesn't strike me as useful.
6d9bf78ca49a017e9e3e6b0357b6c59e?d=identicon&s=25 Peter Hickman (Guest)
on 2006-07-27 09:56
(Received via mailing list)
On my machine it took around 33 seconds, but I think that I can improve
it a little; besides, I have to test the results first.
A90204c955db033cd975f7bb0ec9600b?d=identicon&s=25 Ashley Moran (Guest)
on 2006-07-27 11:15
(Received via mailing list)
On Thursday 27 July 2006 05:23, Francis Cianfrocca wrote:
> especially arrays- declare them outside your inner loop and re-use them
> with Array#clear.

Nice MD5 trick!  I'll remember that.  Fortunately the files that need
duplicate elimination are really small, so I won't need to resort to
that.
But I'll remember it for future reference.


>
> Obviously I haven't seen your code or your data, but if the Ruby app is
> memory-bus-bound, then this approach may make your problem worse, not
> better.

Hadn't thought of that, good point...


> Good luck. I recently got a Ruby program that aggregates several LDAP
> directory-pulls with about a million entries down from a few hours to a
> few seconds, without having to drop into C. It can be done, and it's
> kindof fun too.

Next time I get a morning free I might apply some of the tweaks that
have been suggested.  It will be interesting to see how much I can
improve the performance.

Cheers
Ashley
A90204c955db033cd975f7bb0ec9600b?d=identicon&s=25 Ashley Moran (Guest)
on 2006-07-27 11:22
(Received via mailing list)
On Wednesday 26 July 2006 23:13, Chad Perrin wrote:
> > I recently rewrote an 830 line Java/
> > Hibernate web service client as 67 lines of Ruby, in about an hour.  
> > With that kind of productivity, performance can go to hell!
>
> With a 92% cut in code weight, I can certainly sympathize with that
> sentiment.  Wow.

Even the last remaining member of the Anti-Ruby Defence League in my
office admitted (reluctantly) that it was impressive!

Ashley
Cf6d0868b2b4c69bac3e6f265a32b6a7?d=identicon&s=25 Daniel Martin (Guest)
on 2006-07-27 17:07
(Received via mailing list)
harrisj@schizopolis.net writes:

> Are there
> good benchmarks for OO languages? Or dynamic languages? Are there good
> benchmarks that could actually measure the types of uses I need, where I'm
> building a web front end to a DB store? I don't know about you, but my job
> has never involved fractals.

A few weeks ago, in a thread on Rails performance, someone posted here
a benchmark that measured how fast a "hello world" web app could
respond under heavy load.  This doesn't measure the DB back-end piece,
of course, but it's a little closer to useful for you than Mandelbrot
calculations.

In fact, digging through Google desktop searches of my recently
visited web pages finds it here:

http://www.usenetbinaries.com/doc/Web_Platform_Ben...

Rails loses this contest big-time.  Perl CGI scripts even beat a Rails
FastCGI setup.  Rails FastCGI is about 15 times slower than plain ruby
FastCGI.

Also, it seems clear that at least for very simple web apps, PHP 4 to
PHP 5 is a distinct performance regression.
8979474815030ad4a5d59718d1905715?d=identicon&s=25 Isaac Gouy (Guest)
on 2006-07-27 22:48
(Received via mailing list)
Hal Fulton wrote:
> I won a bet once from a friend. We wrote comparable programs in
> Java and C++ (some arbitrary math in a loop running a bazillion
> times).
>
> With defaults on both compiles, the Java was actually *faster*
> than the C++. Even I didn't expect that. But as I said, this
> sort of thing is highly dependent on many different factors.
>
>
> Hal

Sometimes, when we look at different workloads, we can see the
performance crossover once the relatively slow startup is overcome,
code is JIT'd, and adaptive optimisation kicks in:

http://shootout.alioth.debian.org/gp4/fulldata.php...
87fe25bf0272d8ad886dda793bdcbbd9?d=identicon&s=25 Tim Bray (Guest)
on 2006-07-28 01:13
(Received via mailing list)
Sorry for coming late to the party.

On Jul 26, 2006, at 1:47 AM, Peter Hickman wrote:

> Whenever the question of performance comes up with scripting
> languages such as Ruby, Perl or Python there will be people whose
> response can be summarised as "Write it in C".

The conclusion is wrong in the general case.  Suppose that, instead
of computing permutations, your task had been to read ten million
lines of textual log files and track statistics about certain kinds
of events coded in there.  I bet a version coded in perl, making
straightforward uses of regexes and hashes, would have performance
that would be very hard to match in C or any other language.  Ruby
would be a little slower I bet just because Perl's regex engine is so
highly-tuned, although it's been claimed Oniguruma is faster.
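The log-crunching workload Tim describes is the classic regex-plus-hash idiom, which is just as natural in Ruby (the log format below is invented for illustration):

```ruby
# Tally event codes from log lines of the (hypothetical) form
# "2006-07-26 10:50:01 EVENT=login user=a".
counts = Hash.new(0)
log_lines = [
  "2006-07-26 10:50:01 EVENT=login user=a",
  "2006-07-26 10:50:02 EVENT=logout user=a",
  "2006-07-26 10:50:03 EVENT=login user=b"
]
log_lines.each do |line|
  counts[$1] += 1 if line =~ /EVENT=(\w+)/
end
counts["login"]   # => 2
counts["logout"]  # => 1
```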

So, first gripe: C is faster than Ruby *in certain problem domains*.
In others, it's not.

Second gripe.  The notion of doing a wholesale rewrite in C is almost
certainly wrong.  In fact, the notion of doing any kind of serious
hacking, without doing some measuring first, is almost always wrong.
The *right* way to build software that performs well is to write a
natural, idiomatic implementation, trying to avoid stupid design
errors but not worrying too much about performance.  If it's fast
enough, you're done.  If it's not fast enough, don't write another
line of code till you've used a profiler and understand what the
problem is.  If in fact this is the kind of problem where C is
going to do better, chances are you only have to replace 10% of your
code to get 90% of the available speedup.
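Measuring first is cheap in Ruby: scripts can be profiled with `ruby -rprofile`, and the standard Benchmark module handles targeted timing. A small sketch comparing two candidate implementations before rewriting anything:

```ruby
require 'benchmark'

# Time two ways of concatenating many strings before deciding
# either one is "too slow".
strings = (1..10_000).map { |i| i.to_s }

Benchmark.bm(10) do |bm|
  bm.report("inject:") { strings.inject("") { |acc, s| acc + s } }
  bm.report("join:")   { strings.join }
end
```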

And don't forget to budget downstream maintenance time for the
memory-allocation errors and libc dependencies and so on that cause C
programs to be subject to periodic downstream crashes.

  -Tim
4fea1ef11180adaaa299d503ca6010d0?d=identicon&s=25 John W. Kennedy (Guest)
on 2006-07-28 03:50
(Received via mailing list)
David Pollak wrote:
> There are some applications that will never perform as in Java (e.g.,
> stuff that's heavily oriented to bit manipulation.)  But for many
> classes of applications (e.g., spreadsheets) Java can perform as well
> as C.

Actually, it's notorious that a Sieve of Eratosthenes using the
respective Bitset classes is faster in Java than in C++.

I don't get it, either, but I've verified it myself.
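For reference, the benchmark in question is the classic Sieve of Eratosthenes; a straightforward Ruby rendering (with a plain boolean array standing in for the bitset) looks like this:

```ruby
# Sieve of Eratosthenes: return all primes below limit.
def primes_below(limit)
  composite = Array.new(limit, false)
  primes = []
  (2...limit).each do |n|
    next if composite[n]
    primes << n
    # Mark every multiple of n from n*n upward as composite.
    (n * n).step(limit - 1, n) { |m| composite[m] = true }
  end
  primes
end

primes_below(30)  # => [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```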
4fea1ef11180adaaa299d503ca6010d0?d=identicon&s=25 John W. Kennedy (Guest)
on 2006-07-28 04:06
(Received via mailing list)
Chad Perrin wrote:
> The canonical example for comparison, I suppose, is the Java VM vs. the
> Perl JIT compiler.  In Java, the source is compiled to bytecode and
> stored.  In Perl, the source remains in source form, and is stored as
> ASCII (or whatever).  When execution happens with Java, the VM actually
> interprets the bytecode.  Java bytecode is compiled for a virtual
> computer system (the "virtual machine"), which then runs the code as
> though it were native binary compiled for this virtual machine.  That
> virtual machine is, from the perspective of the OS, an interpreter,
> however.  Thus, Java is generally half-compiled and half-interpreted,
> which speeds up the interpretation process.

Huh? Small-system (PDA, etc.) Java implementations may use bytecode
interpreters, but the mainstream ones started out using JIT, and were
later upgraded to start execution by interpreting and then, if and when
they observe that a given segment is being repeatedly executed, to
compile it to native code.
4fea1ef11180adaaa299d503ca6010d0?d=identicon&s=25 John W. Kennedy (Guest)
on 2006-07-28 04:32
(Received via mailing list)
Chad Perrin wrote:
> Ada, on the other hand -- for circumstances in which it is most commonly
> employed (embedded systems, et cetera), it does indeed tend to kick C's
> behind a bit.  That may have more to do with compiler optimization than
> language spec, though.

Language specs mean a good deal, actually. C semantics (except in C99,
/if/ the restrict keyword is consistently used by the coder and
seriously implemented by the optimizer) lead to unnecessarily slow
object code in many common cases. Ada's in-language support of tasking
also makes it easier for the compiler to know whether a segment of code
will or will not be multithreaded, which allows the compiler to optimize
more aggressively in many instances.

These are two of the reasons that so many C compilers include an
"optimize beyond safe limits" switch.

Indeed, many aspects of C's design cause optimization problems.
0-terminated strings were dandy on 8-bit processors, but the S/370
actually had to add some new opcodes just to make strcpy and strcmp
tolerably fast.

Ada has another advantage in that the language design strongly
encourages designing global optimization into the compiler, whereas the
C tradition of make files tends to discourage it.
Fd22ee3cfc7dac283ce8e451af324f7d?d=identicon&s=25 Chad Perrin (Guest)
on 2006-07-28 07:18
(Received via mailing list)
On Fri, Jul 28, 2006 at 11:05:11AM +0900, John W. Kennedy wrote:
> >which speeds up the interpretation process.
>
> Huh? Small-system (PDA, etc.) Java implementations may use bytecode
> interpreters, but the mainstream ones started out using JIT, and were
> later upgraded to start execution by interpreting, and then, if and when
> it observes that a given segment is being repeatedly executed, compile
> it to native code.

Wait . . . what?  When some Java applet (for example) is sent over an
HTTP connection to your computer to be executed client-side, it is NOT
just source code.  Similarly, when you install a Java application, it
too is NOT simply copied onto the system in source code form.  It's
compiled to bytecode (or whatever the hell you want to call it) and
distributed thusly, for the JVM to run it.
Fd22ee3cfc7dac283ce8e451af324f7d?d=identicon&s=25 Chad Perrin (Guest)
on 2006-07-28 07:28
(Received via mailing list)
On Fri, Jul 28, 2006 at 11:30:05AM +0900, John W. Kennedy wrote:
>
> Language specs mean a good deal, actually.

I didn't say they didn't.  I said only that *in this case* it *may* be
that it's more a matter of compiler optimization than language spec.  I
don't know enough about Ada to be able to comment authoritatively on the
comparison, but I certainly do know that language design can have an
effect (via the requirements it imposes on the implementation).
F1d37642fdaa1662ff46e4c65731e9ab?d=identicon&s=25 Charles O Nutter (Guest)
on 2006-07-28 07:34
(Received via mailing list)
On 7/28/06, Chad Perrin <perrin@apotheon.com> wrote:
>
> Wait . . . what?  When some Java applet (for example) is sent over an
> HTTP connection to your computer to be executed cient-side, it is NOT
> just source code.  Similarly, when you install a Java application, it
> too is NOT simply copied onto the system in source code form.  It's
> compiled to bytecode (or whatever the hell you want to call it) and
> distributed thusly, for the JVM to run it.
>

Interpretation does not necessarily mean raw source code is being
processed.  Even interpreters parse raw source into a form they can
understand.  Interpretation in the Java VM comes in the form of
bytecode interpretation, so called because instead of the system CPU
running native operations, it's running another process that steps
through the bytecodes.  This is what's typically called "interpreted
mode" in the JVM.  However, every VM since Java 1.3 has taken the next
step at run time and compiled that bytecode into native processor
instructions, so that the interpreter is no longer involved for those
compiled pieces.

Bytecode is what's distributed, yes, but it's little more than
pre-parsed and lightly optimized source code.  You can convert it back
to source, if you like.  Your definition of "interpreted" is too
narrow.
Fd22ee3cfc7dac283ce8e451af324f7d?d=identicon&s=25 Chad Perrin (Guest)
on 2006-07-28 07:37
(Received via mailing list)
On Fri, Jul 28, 2006 at 02:32:20PM +0900, Charles O Nutter wrote:
> Interpretation does not necessarily mean raw source code is being processed.
> and lightly optimized source code. You can convert it back to source, if you
> like. Your definition of "interpreted" is too narrow.

How do you figure?  You just reiterated, in different words, everything
I said, then held it up as "proof" I'm "wrong".  I think you're
violently agreeing with me, or something, and don't realize it.
E698f564ac90c4c248f1f678caafd624?d=identicon&s=25 Keith Gaughan (Guest)
on 2006-07-28 07:54
(Received via mailing list)
On Fri, Jul 28, 2006 at 02:34:21PM +0900, Chad Perrin wrote:

> > >distributed thusly, for the JVM to run it.
> >
> > Bytecode is what's distributed, yes, but it's little more than pre-parsed
> > and lightly optimized source code. You can convert it back to source, if you
> > like. Your definition of "interpreted" is too narrow.
>
> How do you figure?  You just reiterated, in different words, everything
> I said, then held it up as "proof" I'm "wrong".  I think you're
> violently agreeing with me, or something, and don't realize it.

No, just that you left out the bit about JIT compilation into native
code.

K.
5a837592409354297424994e8d62f722?d=identicon&s=25 Ryan Davis (Guest)
on 2006-07-28 09:35
(Received via mailing list)
On Jul 26, 2006, at 2:23 AM, Pit Capitain wrote:

> Peter Hickman schrieb:
>> (Example of Perl and C Code)
>
> Peter, is there any chance you could test your program with Ruby
> Inline?
>
>   http://rubyforge.org/projects/rubyinline
>
> I'm on Windows, so I can't use Ruby Inline (+1 for MinGW btw :-)

Others have reported being able to use inline on windows... why can't
you?
Fd22ee3cfc7dac283ce8e451af324f7d?d=identicon&s=25 Chad Perrin (Guest)
on 2006-07-28 09:35
(Received via mailing list)
On Fri, Jul 28, 2006 at 02:53:09PM +0900, Keith Gaughan wrote:
> On Fri, Jul 28, 2006 at 02:34:21PM +0900, Chad Perrin wrote:
> >
> > How do you figure?  You just reiterated, in different words, everything
> > I said, then held it up as "proof" I'm "wrong".  I think you're
> > violently agreeing with me, or something, and don't realize it.
>
> No, just that you left out the bit about JIT compilation into native
> code.

Gee willickers, I'm sorry I didn't use the exact phrasing you wanted me
to.  Maybe next time, though, you won't claim the bits I said that
didn't actually have anything to do with your actual complaint were
wrong.

Y'know, screw it.  Be an ass if you like.  I'm done with this subthread.
6d9bf78ca49a017e9e3e6b0357b6c59e?d=identicon&s=25 Peter Hickman (Guest)
on 2006-07-28 10:38
(Received via mailing list)
Tim Bray wrote:
> So, first gripe: C is faster than Ruby *in certain problem domains*.
> In others, it's not.
>
The post was about people wanting better performance for their code.
Quite clearly, if the code you have written in Ruby (or whatever) runs
fast enough for you, then performance is a non-issue. If the
performance of your code is an issue, then in truth there is only so
much improvement that you can squeeze out of Ruby; if that is enough to
resolve your performance issues, then fine. If you want still more
performance, then you want to write it in C (or perhaps buy some new
hardware :) )
> Second gripe.  The notion of doing a wholesale rewrite in C is almost
> certainly wrong.
An earlier project of mine used GD from Ruby to calculate some colour
metrics from images and write them into a database. I rewrote the whole
thing in C, using the same GD and SQLite2 libraries as the Ruby version,
and the improvement was massive, despite the fact that the Ruby code was
not actually doing very much. Most of the time was spent in the GD
library, so I am not all that convinced that rewriting part of a project
in C will achieve quite the same improvement. And if you are going to
convert a significant chunk of code to C then you may as well go the
whole hog.
> In fact, the notion of doing any kind of serious hacking, without
> doing some measuring first, is almost always wrong.  The *right* way
> to build software that performs well is to write a natural, idiomatic
> implementation, trying to avoid stupid design errors but not worrying
> too much about performance.  If it's fast enough, you're done.
No problem here.
> If it's not fast enough, don't write another line of code till you've
> used used a profiler and understand what the problem is.  If in fact
> this is the kind of a problem where C is going to do better, chances
> are you only have to replace 10% of your code to get 90% of the
> available speedup.
Not been my experience to date but then perhaps I am not working on
problems that can be solved in that way.
93d566cc26b230c553c197c4cd8ac6e4?d=identicon&s=25 Pit Capitain (Guest)
on 2006-07-28 11:37
(Received via mailing list)
Ryan Davis schrieb:
> On Jul 26, 2006, at 2:23 AM, Pit Capitain wrote:
>>
>> I'm on Windows, so I can't use Ruby Inline (+1 for MinGW btw :-)
>
> Others have reported being able to use inline on windows... why can't you?

Ryan, googling for

   "ruby inline" windows

gave me no usable hint among the first 50 results besides using cygwin.
Do you have a link to the reports you mention?

Maybe I should have written that, given that I'm using the One Click
Installer, don't have the Windows compiler toolchain, and am not willing
to use cygwin, I can't use Ruby Inline. Is this better?

But, instead I might try to use Ruby Inline with MinGW, so thanks for
the question.

Regards,
Pit
F88f668d24d90731a6a3234fbfb12d1b?d=identicon&s=25 Csaba Henk (Guest)
on 2006-07-28 11:53
(Received via mailing list)
On 2006-07-26, Kristof Bastiaensen <kristof@vleeuwen.org> wrote:
> -- extend takes a list of columns, and extends each column with a
>         where x free
>
> latin_square n = latin_square_ n
>     where latin_square_ 0 = replicate n []  -- initalize columns to nil
>           latin_square_ m | m > 0 = extend (latin_square_ (m-1)) n
>
> square2str s = unlines $ map format_col s
>     where format_col col = unwords $ map show col
>
> main = mapIO_ (putStrLn . square2str) (findall (\s -> s =:= latin_square 5))
> ------------------------- end latin.curry -----------------------------

It's really nice and compact!
AFAIK Curry is Haskell boosted with logic programming.

I -- who, ATM, just watch these languages from a distance, and can't
tell by looking at the code -- wonder if you have used something here
that is specific to Curry, which would be harder/uglier to express in
Haskell?

And what does the Curry compiler look like? Is it just a hacked GHC? How
does Curry's performance relate to that of Haskell?

Regards,
Csaba
97550977337c9f0a0e1a9553e55bfaa0?d=identicon&s=25 Jan Svitok (Guest)
on 2006-07-28 12:14
(Received via mailing list)
Well, you need the compiler chain if you want to compile, that is what
inline does.

On windows, you have three options:

- MS - you can get by with their free compiler (VS Express or something)
- cygwin
- mingw

I have full VS, and inline worked for me when I started the program
with proper environment (=proper paths set), although I've tried only
the examples. And it seems VC6 and VC7 (VS2003) are better to use due
to the manifest stuff that VC8 (VS2005) creates.

I haven't tried cygwin or mingw.

J.
2ee1a7960cc761a6e92efb5000c0f2c9?d=identicon&s=25 William James (Guest)
on 2006-07-28 12:42
(Received via mailing list)
Kristof Bastiaensen wrote:

> -- extend takes a list of columns, and extends each column with a
>         where x free
>
> latin_square n = latin_square_ n
>     where latin_square_ 0 = replicate n []  -- initalize columns to nil
>           latin_square_ m | m > 0 = extend (latin_square_ (m-1)) n
>
> square2str s = unlines $ map format_col s
>     where format_col col = unwords $ map show col
>
> main = mapIO_ (putStrLn . square2str) (findall (\s -> s =:= latin_square 5))
> ------------------------- end latin.curry -----------------------------

I don't see where elems_diff is used after it is defined.
703fbc991fd63e0e1db54dca9ea31b53?d=identicon&s=25 Robert Dober (Guest)
on 2006-07-28 12:51
(Received via mailing list)
On 7/27/06, Isaac Gouy <igouy@yahoo.com> wrote:
> > JIT is the key to a lot of that. Performance depends greatly on
> > the compiler, the JVM, the algorithm, etc.
> >



> I won a bet once from a friend. We wrote comparable programs in
> > Java and C++ (some arbitrary math in a loop running a bazillion
> > times).
> >
> > With defaults on both compiles, the Java was actually *faster*
> > than the C++. Even I didn't expect that. But as I said, this
> > sort of thing is highly dependent on many different factors.
> >
> >



Yes, I believe that on some theoretical basis, under some circumstances,
JIT will outclass a static compiler, simply because it is able to
optimize using runtime information.
I would be surprised, however, to see consistently better performance
from JIT than from precompiled code in the near future - (ha, "near
future" can be defined to my needs, so I am never wrong ;) - let's say
0.01 ky.
Would you mind sharing these examples? Maybe offlist, I suggest, as we
are losing the red gem here.

Thx a lot in advance

Robert

>
--
Deux choses sont infinies : l'univers et la bêtise humaine ; en ce qui
concerne l'univers, je n'en ai pas acquis la certitude absolue.

- Albert Einstein
81cccab4619f8d8663e1e23b769f1515?d=identicon&s=25 Kristof Bastiaensen (Guest)
on 2006-07-28 15:06
(Received via mailing list)
On Fri, 28 Jul 2006 03:39:09 -0700, William James wrote:

>> -- same position
>>         | x =:= upto n &
>>
>> main = mapIO_ (putStrLn . square2str) (findall (\s -> s =:= latin_square 5))
>> ------------------------- end latin.curry -----------------------------
>
> I don't see where elems_diff is used after it is defined.

Heh, you are right!  I defined it, and then when I didn't need it I
forgot
to remove it.

Thanks for noting,
Kristof
3bb23e7770680ea44a2d79e6d10daaed?d=identicon&s=25 M. Edward (Ed) Borasky (Guest)
on 2006-07-28 15:30
(Received via mailing list)
Pit Capitain wrote:
> Maybe I should have written that giving that I'm using the One Click
> Installer, don't have the Windows compiler toolchain, and am not
> willing to use cygwin, I can't use Ruby Inline. Is this better?
Speaking of CygWin, a couple of people here have expressed what seems
like disdain for it. I am constrained to use a Windows desktop at my day
job, and CygWin has been an important factor in my retaining my sanity
about the fact. I don't use the server pieces of CygWin. My preference
(in open source tools) is first native Windows, second CygWin and third
(Gentoo) Linux. I was dual booted with Gentoo for a while until the
VMware Server beta started. That became a viable option so that's how I
exercise the third option now.

So what is the source of the reluctance to use CygWin in the Ruby
community?
81cccab4619f8d8663e1e23b769f1515?d=identicon&s=25 Kristof Bastiaensen (Guest)
on 2006-07-28 15:33
(Received via mailing list)
On Fri, 28 Jul 2006 09:45:01 +0000, Csaba Henk wrote:

>>
>>           (x `elem` col) =:= False = (x:col) : addnum cs (x:prev)
>> ------------------------- end latin.curry -----------------------------
>
> It's really nice and compact!
> AFAIK Curry is Haskell boosted with logic programming.

Yes, exactly!

>
> I -- who, ATM, just watches these languages from a distance, and can't
> tell it by looking at the code  -- wonder if have you used here
> something specific to Curry, which would be harder/uglier to express in
> Haskell?
>

Yes, the =:= operator unifies terms like in logic languages, and curry
makes it possible to write nondeterministic functions.  For example the
upto function I defined above can evaluate to any number from 1 upto n,
while in haskell it could have only one result. In the code that I wrote
above:
  upto n | n > 1 = n ? upto (n-1)

is the same as
  upto n | n > 1 = n
  upto n | n > 1 = upto (n-1)

Then there are search functions that make it possible to extract all
outcomes of a nondeterministic function in a lazy way (e.g. findall).

In Haskell the above would probably be written in a monad that expresses
nondeterminism, but I doubt it would be as clear as the Curry code.
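For readers more at home in Ruby than Haskell, the list-monad style of
nondeterminism Kristof mentions can be sketched with flat_map. This is a
loose analogy only, not Curry's actual evaluation model, and the names
here (upto, pairs) are just illustrative:

```ruby
# A nondeterministic value becomes an array of every result it could
# take, so Curry's choice operator (?) turns into array concatenation,
# and sequencing choices turns into flat_map (the list monad's bind).
def upto(n)
  (1..n).to_a  # upto can evaluate to any of 1..n
end

# Pick two distinct numbers from 1..3; select plays the role of a
# failed unification pruning a branch of the search.
pairs = upto(3).flat_map { |x|
  upto(3).select { |y| y != x }.map { |y| [x, y] }
}
# pairs now holds all 6 ordered pairs with distinct elements
```

Each level of nesting is one more nondeterministic choice, which is
essentially what the row-by-row Latin square search does.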

> And how the Curry compiler looks like? Is it just a hacked GHC? How
> Curry performance relates to that of Haskell?
>

As far as I know the Curry compiler I used (Munster CC) is written from
scratch, in Haskell.  I doubt it is as fast and optimized as the Haskell
compiler, since Haskell has a much larger userbase.

> Regards,
> Csaba

Regards,
Kristof
E698f564ac90c4c248f1f678caafd624?d=identicon&s=25 Keith Gaughan (Guest)
on 2006-07-28 22:30
(Received via mailing list)
On Fri, Jul 28, 2006 at 04:34:19PM +0900, Chad Perrin wrote:

> Gee willickers, I'm sorry I didn't use the exact phrasing you wanted me
> to.  Maybe next time, though, you won't claim the bits I said that
> didn't actually have anything to do with your actual complaint were
> wrong.
>
> Y'know, screw it.  Be an ass if you like.  I'm done with this subthread.

I was being polite until you made a passive-aggressive remark about me
being "less than sporting". But if you want to act like that, there's
not much I can do to stop you.
A90204c955db033cd975f7bb0ec9600b?d=identicon&s=25 Ashley Moran (Guest)
on 2006-07-28 22:36
(Received via mailing list)
On Thursday 27 July 2006 10:13, Ashley Moran wrote:
> > Good luck. I recently got a Ruby program that aggregates several LDAP
> > directory-pulls with about a million entries down from a few hours to a
> > few seconds, without having to drop into C. It can be done, and it's
> > kindof fun too.
>
> Next time I get a morning free I might apply some of the tweaks that have
> been suggested.  Might be interested to see how much  I can improve the
> performance.


I looked at the source of the script today, and I made these changes:

- use FasterCSV instead of CSV
- don't buffer every row in the datasets; send them straight to
  Zlib::GzipWriter as they are processed
- don't do hash lookups in the middle of a 15 million row loop; do them
  once in advance!
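The last two changes are worth a sketch. This is a minimal illustration
with made-up column names and a tiny in-memory dataset standing in for
the real 15-million-row CSV, not Ashley's actual script: the lookup is
resolved once before the loop, and each row is streamed straight into
Zlib::GzipWriter instead of being buffered.

```ruby
require 'zlib'
require 'tempfile'

COLUMN_INDEX = { 'name' => 0, 'amount' => 1 }  # hypothetical layout
name_col = COLUMN_INDEX['name']                # hoisted: one lookup, not one per row

rows = [['  alice ', '10'], ['bob  ', '20']]   # stand-in for the big CSV

out = Tempfile.new(['dump', '.gz'])
Zlib::GzipWriter.open(out.path) do |gz|
  rows.each do |row|                 # imagine CSV.foreach streaming here
    gz.write("#{row[name_col].strip}\n")  # written as produced, never buffered
  end
end
```

The same shape works with FasterCSV.foreach feeding the loop, so memory
stays flat no matter how many rows go through.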

Unfortunately I'm still stuck with a nasty "rows.each { |row| row.each {
|col|
col.strip! } }" type section, to fix the poor quality of the data, which
would take a lot of time going through all the fields to thin out.

Despite this, I've got the run time down from over 2.5 hours to about 50
minutes.  The smaller files are individually about 6x faster, but I'm
happy with 3x faster overall.  It means we can realistically run it in
the day if there are issues.

One curious thing is that while the real time was about 50 mins, the
user time was only about 30 mins (negligible sys time if I remember).
Not sure where the other 20 mins has gone?

Ashley
E698f564ac90c4c248f1f678caafd624?d=identicon&s=25 Keith Gaughan (Guest)
on 2006-07-28 22:40
(Received via mailing list)
On Thu, Jul 27, 2006 at 02:42:30PM +0900, Chad Perrin wrote:
> > performance characteristics that *can* be divorced from Excel (because
> > they're Window's own performance characteristic, not Excel's). Argue
> > those points, and you're arguing about the wrong software.
>
> Design decisions that involve interfacing with interface software that
> sucks is related to the software under discussion -- and not all of the
> interface is entirely delegated to Windows, either.  No software can be
> evaluated for its performance characteristics separate from its
> environment except insofar as it runs without that environment.

Here's all I'm saying: the environment is important, but it's a variable
that must be cancelled when talking about some piece of software that's
running on top of it. You can only make judgements about the speed of
something like Excel by comparing it to another spreadsheet with a
similar set of features running on Windows. Otherwise, you're only
making guesses as to where the sluggishness and bloat lie.

> > But Wine is an emulator, and while it does a good job approaching the
> > speed of Windows, it doesn't hit it, nor can it hit it. You're not
> > comparing like with like. Now that's far from sporting.
>
> Actually, no, it's not an emulator.

Yes, it is. It's a set of libraries and executables that emulate a
Windows environment.

> It's a set of libraries (or a
> single library -- I'm a little sketchy on the details) that provides the
> same API as Windows software finds in a Windows environment.  An
> emulator actually creates a faux/copy version of the environment it's
> emulating.

Which both Wine and Cygwin do. To quote the Wikipedia article on
emulators:

    A software emulator allows computer programs to run on a platform
    (computer architecture and/or operating system) other than the one
    for which they were originally written.

Linux compatibility on FreeBSD is a software emulator that fools Linux
executables into thinking they're running on Linux. Because of the
commonalities between FreeBSD and Linux, this emulation layer can be
thin.

> It is to Linux compared with Unix as an actual emulator is
> to Cygwin compared with Unix: one is a differing implementation and the
> other is an emulator.

?

> . . . and, in fact, there are things that run faster via Wine on Linux
> than natively on Windows.

Not surprising, really.

> [ snip ]
> > under FreeBSD. Bringing Wine in is a red herring. Software cannot be
> > blamed for the environment it's executed in.
>
> I didn't bring it up.  You did.  I made a comment about Excel not
> working in Linux as a bit of a joke, attempting to make the point that
> saying Excel performance can be evaluated separately from its dependence
> on Windows doesn't strike me as useful.

See above.
481b8eedcc884289756246e12d1869c1?d=identicon&s=25 Francis Cianfrocca (blackhedd)
on 2006-07-28 23:10
(Received via mailing list)
Again, I haven't seen your code or your data, but 50 minutes still seems
like a frightfully long time for such minimal processing on only 15
million rows. If I were you, I'd keep optimizing (but then I'm pretty
obsessive about optimizing). Exactly how much time does GzipWriter take,
for example?

Several people have correctly said that you stop optimizing when your
program is "fast enough" (which is a business decision, not a technical
one), but optimizing strictly inside of Ruby is a lot cheaper than
optimizing by dropping into C (which I do often enough), because you
don't incur the portability and other costs everyone has been talking
about.
Fd22ee3cfc7dac283ce8e451af324f7d?d=identicon&s=25 Chad Perrin (Guest)
on 2006-07-29 00:03
(Received via mailing list)
On Sat, Jul 29, 2006 at 05:26:26AM +0900, Keith Gaughan wrote:
> being "less than sporting". But if you want to act like that, there's
> not much I can do to stop you.

That wasn't a passive-aggressive remark, it was a joking comment about
the inequity of your comparison (intentional or otherwise).  You're
welcome to your misconceptions and bad attitudes, though.
4feed660d3728526797edeb4f0467384?d=identicon&s=25 Bill Kelly (Guest)
on 2006-07-29 00:12
(Received via mailing list)
From: "Chad Perrin" <perrin@apotheon.com>
>> I was being polite until you made a passive agressive remark about me
>> being "less than sporting". But if you want to act like that, there's
>> not much I can do to stop you.
>
> That wasn't a passive-aggressive remark, it was a joking comment about
> the inequity of your comparison (intentional or otherwise).  You're
> welcome to your misconceptions and bad attitudes, though.

GENTLEMEN! YOU CAN'T FIGHT IN HERE, THIS IS THE WAR ROOM !!!
E698f564ac90c4c248f1f678caafd624?d=identicon&s=25 Keith Gaughan (Guest)
on 2006-07-29 00:16
(Received via mailing list)
On Sat, Jul 29, 2006 at 06:59:41AM +0900, Chad Perrin wrote:
> > I was being polite until you made a passive agressive remark about me
> > being "less than sporting". But if you want to act like that, there's
> > not much I can do to stop you.
>
> That wasn't a passive-aggressive remark, it was a joking comment about
> the inequity of your comparison (intentional or otherwise).

It didn't come across as joking.

> You're welcome to your misconceptions and bad attitudes, though.

Ditto. I wasn't the one who wrote "Y'know, screw it. Be an ass if you
like." I hadn't even considered flipping the bozo bit until I read that.
F3b7109c91841c7106784d229418f5dd?d=identicon&s=25 Justin Collins (justincollins)
on 2006-07-29 00:16
(Received via mailing list)
Ashley Moran wrote:
<snip>
> One curious thing is that while the real time was about 50 mins, the user time
> was only about 30 mins (negligible sys time if I remember).  Not sure where
> the other 20 mins has gone?
>
> Ashley
>

Disk I/O? Maybe?

-Justin
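The gap Ashley noticed between real and user time is exactly what
Benchmark reports separately: wall-clock time spent blocked (on disk,
network, or a sleep) shows up in real time but burns no CPU, so it never
appears in user or sys time. A small demonstration, using sleep as a
stand-in for I/O wait:

```ruby
require 'benchmark'

# sleep stands in for time spent blocked on disk or network I/O:
# the wall clock advances but the process uses no CPU.
t = Benchmark.measure do
  sleep 0.2                       # "I/O wait"
  100_000.times { |i| i * i }     # a little actual CPU work
end

# t.real comes out roughly 0.2s larger than t.utime + t.stime
```

If the missing 20 minutes is disk I/O, iostat or vmstat run alongside
the script should show it.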
E698f564ac90c4c248f1f678caafd624?d=identicon&s=25 Keith Gaughan (Guest)
on 2006-07-29 00:19
(Received via mailing list)
On Sat, Jul 29, 2006 at 07:12:09AM +0900, Bill Kelly wrote:

> GENTLEMEN! YOU CAN'T FIGHT IN HERE, THIS IS THE WAR ROOM !!!

And I was just about to put my mexican wrestling mask on. Sigh. ;-)

K.
Fd22ee3cfc7dac283ce8e451af324f7d?d=identicon&s=25 Chad Perrin (Guest)
on 2006-07-29 00:32
(Received via mailing list)
On Sat, Jul 29, 2006 at 05:37:24AM +0900, Keith Gaughan wrote:
> running on top of it. You can only make judgements about the speed of
> something like Excel by comparing it to another spreadsheet with a
> similar set of features running on Windows. Otherwise, you're only
> making guesses as to where the sluggishness and bloat lie.

What's important is how two pieces of software run in the same
environment, not whether the environment is the reason a given
application is slow at some things.  That was my point: the GUI
performance of Excel is, indeed, relevant to a discussion of Excel
performance, despite the fact that significant chunks of Excel's GUI are
implemented by way of the environment.  Compare it with another
spreadsheet running in the same environment, and don't cancel some of
its slowness by blaming it on Windows.


> >
> > Actually, no, it's not an emulator.
>
> Yes, it is. It's a set of libraries and executables that emulate a
> Windows environment.

No, it's not.  Repeat after me:  "WINE Is Not an Emulator".  That's not
just an affectation.  It is a statement of fact about WINE.  That's why
they call it WINE.  A Windows emulator would be a "fake Windows"
running in Linux, like a VM: WINE is basically just an API that happens
to be as close to the Windows API (in all useful ways) as the WINE
developers can get it.  It does not pretend to be a Windows machine.  It
just provides compatibility for Windows programs on Linux.

Perhaps you aren't aware that WINE stands for WINE Is Not an Emulator,
or that they aren't lying when they say that.


>
> > It is to Linux compared with Unix as an actual emulator is
> > to Cygwin compared with Unix: one is a differing implementation and the
> > other is an emulator.
>
> ?

(where ~ means "roughly equivalent to")

Differing implementations:
  Wine ~ Windows
  Linux ~ Unix

Emulators:
  Emulator != Original
  Cygwin != Linux
4299e35bacef054df40583da2d51edea?d=identicon&s=25 James Gray (bbazzarrakk)
on 2006-07-29 00:33
(Received via mailing list)
On Jul 28, 2006, at 3:33 PM, Ashley Moran wrote:

> I looked at the source of the script today, and I made these changes:
>
> - use FasterCSV instead of CSV

Fair warning, I'm coming into this conversation late and I haven't
read all that came before this.  However, if you are using FasterCSV...

> Unfortunately I'm still stuck with a nasty "rows.each { |row|
> row.each { |col|
> col.strip! } }" type section, to fix the poor quality of the data,
> which
> would take a lot of time going through all the fields to thin out.

FasterCSV can convert fields as they are read.  I'm not sure if this
will be faster, but it may be worth a shot.  See the :converters
argument to FasterCSV::new:

http://fastercsv.rubyforge.org/
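FasterCSV later became Ruby's standard CSV library, so the converter
idea can be sketched with the stdlib CSV (under 1.8, substitute FCSV for
CSV). The sample data here is made up; the point is that a lambda
converter runs on every field as it is parsed, which removes the
rows.each { |row| row.each { |col| col.strip! } } pass entirely:

```ruby
require 'csv'

# A custom converter is called once per field during parsing; anything
# it returns replaces the raw field value.
strip_fields = lambda { |field| field.is_a?(String) ? field.strip : field }

rows = CSV.parse("  alice , 10 \nbob,20\n", :converters => strip_fields)
# rows => [["alice", "10"], ["bob", "20"]]
```

Whether this beats a separate strip! pass is worth benchmarking, but it
at least keeps the cleanup in one place.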

James Edward Gray II
Fd22ee3cfc7dac283ce8e451af324f7d?d=identicon&s=25 Chad Perrin (Guest)
on 2006-07-29 00:36
(Received via mailing list)
On Sat, Jul 29, 2006 at 07:13:11AM +0900, Keith Gaughan wrote:
> On Sat, Jul 29, 2006 at 06:59:41AM +0900, Chad Perrin wrote:
> >
> > That wasn't a passive-aggressive remark, it was a joking comment about
> > the inequity of your comparison (intentional or otherwise).
>
> It didn't come across as joking.

I'd be inclined to apologize for the misunderstanding if you hadn't
decided I was the devil incarnate over a joke taken the wrong way.


>
> > You're welcome to your misconceptions and bad attitudes, though.
>
> Ditto. I wan't the one who wrote "Y'know, screw it. Be an ass if you
> like." I hadn't even considered flipping the bozo bit until I read that.

Reread what you said in the preceding posts and tell me if you wouldn't
have the same reaction to someone flying off the damned handle at a
stupid joke.  I tried to inject levity because I could see the
conversation heading in a bad direction, and you were so intent on
seeing me in a bad light that it didn't occur to you to assume good
faith on my part.  Congratulations.
8d6f5daee16e380ce0ac00395b417fb6?d=identicon&s=25 Schüle Daniel (Guest)
on 2006-07-29 01:16
(Received via mailing list)
Peter Hickman schrieb:
> Whenever the question of performance comes up with scripting languages
> such as Ruby, Perl or Python there will be people whose response can be
> summarised as "Write it in C". I am one such person. Some people take
> offence at this and label us trolls or heretics of the true programming
> language (take your pick).
>
> I am assuming here that when people talk about performance they really
> mean speed. Some will disagree but this is what I am talking about.

write it in VHDL (or Verilog or SystemC), synthesize your piece of
hardware, plug it into PCI, and have fun with the most performant (in
terms of speed) and most efficient (in terms of energy consumption)
solution :)

[...]

> So what am I recommending here, write all your programs in C? No. Write
> all your programs in Perl? No. Write them in your favourite scripting
> language to refine the code and then translate it into C if the
> performance falls short of your requirements. Even if you intend to
> write it in C all along hacking the code in Perl first allows you to
> play with the algorithm without having to worry about memory allocation
> and other such C style house keeping. Good code is good code in any
> language.

I like C, but in this context C can be replaced with any compiled
language.

> If you really really want that performance boost then take the following
> advice very seriously - "Write it in C".

from my point of view C has the only advantage of running on nearly
every platform. As I said above, any compiled language would speed up
the code.

my 2 cents
Regards, Daniel
E698f564ac90c4c248f1f678caafd624?d=identicon&s=25 Keith Gaughan (Guest)
on 2006-07-29 01:48
(Received via mailing list)
Chad, do you mind if we take this off list? There's no sense in either
of us cluttering up the list with an offtopic discussion.

K.
Fd22ee3cfc7dac283ce8e451af324f7d?d=identicon&s=25 Chad Perrin (Guest)
on 2006-07-29 01:57
(Received via mailing list)
On Sat, Jul 29, 2006 at 08:47:44AM +0900, Keith Gaughan wrote:
> Chad, do you mind if we take this off list? There's no sense in either
> of us cluttering up the list with an offtopic discussion.

go for it
5a837592409354297424994e8d62f722?d=identicon&s=25 Ryan Davis (Guest)
on 2006-07-29 02:01
(Received via mailing list)
On Jul 28, 2006, at 2:34 AM, Pit Capitain wrote:

>
> gave me no usable hint among the first 50 results besides using
> cygwin. Do you have a link to the reports you mention?
>
> Maybe I should have written that giving that I'm using the One
> Click Installer, don't have the Windows compiler toolchain, and am
> not willing to use cygwin, I can't use Ruby Inline. Is this better?

It should work with the 1-click installer, but yeah... not without a
compiler. So there is no way to even expect it to work.

> But, instead I might try to use Ruby Inline with MinGW, so thanks
> for the question.

I've not used mingw, and have no idea if others have.
31ab75f7ddda241830659630746cdd3a?d=identicon&s=25 Austin Ziegler (austin)
on 2006-08-04 02:22
(Received via mailing list)
On 7/28/06, M. Edward (Ed) Borasky <znmeb@cesmail.net> wrote:
> So what is the source of the reluctance to use CygWin in the Ruby community?

It's crap and doesn't mesh well with Windows itself.

I have even recently (mostly) dumped it in favour of xming, because I
was *only* using cygwin for X services.

-austin
3bb23e7770680ea44a2d79e6d10daaed?d=identicon&s=25 M. Edward (Ed) Borasky (Guest)
on 2006-08-04 05:16
(Received via mailing list)
Austin Ziegler wrote:
> On 7/28/06, M. Edward (Ed) Borasky <znmeb@cesmail.net> wrote:
>> So what is the source of the reluctance to use CygWin in the Ruby
>> community?
>
> It's crap and doesn't mesh well with Windows itself.
>
> I have even recently (mostly) dumped it in favour of xming, because I
> was *only* using cygwin for X services.
>
> -austin
Ah ... I only use the client-side stuff, not the server "emulations". I
use the CygWin Perl extensively, plus the command line and the X server.
Once in a while, I need an open source piece of software that doesn't
have a native Windows build ... at that point my first attempt is to
compile and run it under CygWin. If that fails, I have a Gentoo Linux
VMware virtual machine I use.

I suppose I should try Xming ... as long as it has a usable command
line, I can get to ActiveState Perl.
C914fa463a2b1b067586c6432b12a824?d=identicon&s=25 =?iso-8859-1?Q?J=FCrgen?= Strobel (Guest)
on 2006-08-08 14:20
(Received via mailing list)
On Thu, Jul 27, 2006 at 09:26:45AM +0900, David Pollak wrote:
> >Is that heavily optimized Java vs. "normal" (untweaked) C?
> sometimes billions) of Dollars, Euros, etc. through their spreadsheets
> every day.  A 5 or 10 second advantage in calculating a spreadsheet
> could mean a significant profit for a trading firm.
>
> So, I am comparing apples to apples.  A Java program can be optimized
> to perform as well as a C program for *certain* tasks.

I/O bound tasks? Certain artificial benchmarks? Well yes, ruby can
perform as well as C on those too, but that's hardly the point
here. This thread's topic is not if a few Java programs can be made
as fast as C.

The point is, "write it in C" is valid general advice, but "write it in
Java" depends on a lot of factors, none of which the Java crowd mentions
beforehand but all of which they want taken into account afterwards when
comparing performance. You can't have it both ways.

**

Anyhow, I doubt the apples-to-apples and JIT argument. You can write
self-modifying C code to optimize at run time too. Now it is a bit
far-fetched to write a full JIT compiler in C just for your project. But,
if there is opportunity for dynamic optimization at run time which the
Java JIT compiler can take advantage of, there has to be an
opportunity for run time optimization in C too. I don't say this is
easy to spot or implement, just possible.

After all, Java VMs are still written in C, aren't they?

-Jürgen
36c10e2de5115d5729e12676727d0b71?d=identicon&s=25 unknown (Guest)
on 2006-08-21 08:56
(Received via mailing list)
Ok, I need to preface this by saying that I'm in no way either a C or
ruby guru, so I may have missed something here, but this is what I'm
seeing. The bottle-neck here is in the printing to screen.

I don't see how the OP got the code to run in 5 seconds while using any
stdout writing function (e.g., printf). The program should print like 4
megs of text to the screen (though I couldn't get the C that was posted
to compile--like I said, I'm no C superstar) -- there's no way that is
happening in 5 seconds.

Taking a very simple test case, which just prints a 10 byte char array
15000 times using puts:

--------

#include <stdio.h>
#include <string.h>  /* for strlen */
#define SIZE 15000
int main() {
  char str[11]   = "ABCDEFGHIJ";
  int i;
  for (i = 0; i < SIZE; ++i) {
    puts(str);
  }
  printf("Printed %lu bytes of data\n", (unsigned long)(SIZE * strlen(str)));
  return 0;
}

--------

Compiled with:

gcc -o test test.c

This yields:

time ./test

ABCDEFGHIJ
ABCDEFGHIJ
...
ABCDEFGHIJ
Printed 150000 bytes of data

real    0m25.621s
user    0m0.010s
sys     0m0.029s


Now, let's see how Ruby's stdout writing stacks up:

--------

#!/usr/bin/ruby -w
SIZE = 15000
str = "ABCDEFGHIJ"
1.upto(SIZE) {
  STDOUT.syswrite("#{str}\n")
}
STDOUT.syswrite("Printed #{SIZE*str.length} bytes of data\n")

--------

This yields:

ABCDEFGHIJ
ABCDEFGHIJ
...
ABCDEFGHIJ
Printed 150000 bytes of data

real    0m26.796s
user    0m0.202s
sys     0m0.049s

Pretty comparable there.


Of course, the bottle-neck was *supposedly* the maths and array access
and such, which is where C would excel (I'm not denying that ruby is
sometimes pretty slow, or that C is *much* faster all around, just bear
with me here).


So with one ruby implementation of the program mentioned on here:

--------

#!/usr/bin/ruby -w

Wd = (ARGV.shift || 5).to_i
$board = []

# Generate all possible valid rows.
Rows = (0...Wd ** Wd)\
       .map { |n| n.to_s(Wd)\
       .rjust(Wd,'0') }\
       .reject{ |s| s =~ /(.).*\1/ }\
       .map { |s| s.split(//)\
                  .map { |n| n.to_i + 1 }
       }

def check (ary, n)
  ary[0, n + 1].transpose.all? { |x| x.size == x.uniq.size }
end

def add_a_row (row_num)
  if (Wd == row_num)
    STDOUT.syswrite($board.map { |row| row.join }.join(':'))
  else
    Rows.size.times { |i|
      $board[row_num] = Rows[i]
      if (check($board, row_num))
        add_a_row(row_num + 1)
      end
    }
  end
end

add_a_row(0)

---------

This took like 48 minutes! Ouch! But if that syswrite (or puts in the
original version) is replaced with a file write:

--------

def add_a_row (row_num)
  if Wd == row_num
    $outfile << $board.map { |row| row.join }.join(':')
  else
    Rows.size.times { |i|
      $board[row_num] = Rows[i]
      if (check($board, row_num))
        add_a_row(row_num + 1)
      end
    }
  end
end

$outfile = File.open('latins.dump', 'wb')
add_a_row(0)
$outfile.close

--------

Now we're down to 16 minutes. Much better! But still...


Ok, so what about the implementation using the permutation and set
classes?

--------

#!/usr/bin/ruby -w

require('permutation')
require('set')

$size = (ARGV.shift || 5).to_i

$perms = Permutation.new($size).map { |p| p.value }
$out = $perms.map { |p| p.map { |v| v+1 }.join }
$filter = $perms.map { |p|
  s = SortedSet.new
  $perms.each_with_index { |o, i|
    o.each_with_index { |v, j| s.add(i) if p[j] == v }
  } && s.to_a
}

$latins = []
def search lines, possibs
  return $latins << lines if lines.size == $size
  possibs.each { |p|
    search lines + [p], (possibs -
                         $filter[p]).subtract(lines.last.to_i..p)
  }
end

search [], SortedSet[*(0...$perms.size)]

$latins.each { |latin|
  $perms.each { |perm|
    perm.each { |p| STDOUT.syswrite($out[latin[p]] + "\n") }
    STDOUT.syswrite("\n")
  }
}

--------

26 minutes...that's still gonna leave a mark. But the file IO version
of the same program?

--------

outfile = File.open('latins.dump', 'wb')
$latins.each { |latin|
  $perms.each { |perm|
    perm.each { |p| outfile << $out[latin[p]] << "\n" }
    outfile << "\n"
  }
}
outfile.close

--------

time ruby test.rb

real    0m17.227s
user    0m13.969s
sys     0m1.399s

17 seconds! WOOHOO! Yes, indeedy. That's more like it. Try it if you
don't believe me.



So the moral of the story is twofold:

1.) Don't assume the bottle-neck is in the interpreter and just run off
and start writing everything in C or Java or simplified Klingon or
whatever. Testing, coverage, and profiling are the key to discovering
the true cause of your woes. Here the bottle-neck was in writing 4+
megs to stdout -- and going to C won't help with that, contra the claims
of the OP.

2.) Don't write a crapload of text to stdout. It won't be as fast as
you'd like it to be no matter what language you use.
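For moral #1, Ruby's bundled Benchmark library is usually all the profiling this kind of question needs. A rough sketch (my own; it contrasts one write(2) call per line against Ruby's buffered IO, sending everything to the null device so the terminal can't interfere -- absolute numbers will vary by system):

```ruby
require 'benchmark'

# 5,000 lines of output, roughly the shape the solver emits.
lines = Array.new(5_000) { "12345:23451:34512:45123:51234" }

Benchmark.bm(10) do |bm|
  bm.report('syswrite') do
    File.open(File::NULL, 'wb') do |f|
      lines.each { |l| f.syswrite(l + "\n") }   # one system call per line
    end
  end
  bm.report('buffered') do
    File.open(File::NULL, 'wb') do |f|
      lines.each { |l| f << l << "\n" }         # batched by Ruby's IO buffer
    end
  end
end
```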


========

NB: All testing was done on:

Linux 2.6.17-gentoo-r5 #2 PREEMPT Thu Aug 10 14:21:37 CDT 2006 i686 AMD
Athlon(tm) XP 1600+ AuthenticAMD GNU/Linux

gcc version 3.4.6 (Gentoo 3.4.6-r2, ssp-3.4.6-1.0, pie-8.7.9)

GNU C Library development release version 2.4

ruby 1.8.5 (2006-08-18) [i686-linux]

========


Regards,
Jordan
Jürgen Strobel (Guest)
on 2006-08-23 02:10
(Received via mailing list)
On Mon, Aug 21, 2006 at 03:55:10PM +0900, MonkeeSage@gmail.com wrote:
> Ok, I need to preface this by saying that I'm in no way either a C or
> ruby guru, so I may have missed something here, but this is what I'm
> seeing. The bottle-neck here is in the printing to screen.
>
> I don't see how the OP got the code to run in 5 seconds while using any
> stdout writing function (e.g., printf). The program should print like 4
> megs of text to the screen (though I couldn't get the C that was posted
> to compile--like I said, I'm no C superstar) -- there's no way that is
> happening in 5 seconds.

First, stdout doesn't necessarily write to your screen, and /dev/null
is very, very fast. Even when writing to some kind of terminal emulator,
some of them are much faster with bulk output than others.

> [snipped: C and Ruby benchmark programs, each printing 150000 bytes
> of data]
>
> Pretty comparable there.
>
>
> Of course, the bottle-neck was *supposedly* the maths and array access
> and such, which is where C would excel (I'm not denying that ruby is
> sometimes pretty slow, or that C is *much* faster all around, just bare
> with me here).

Not so. Let's look at the *user* numbers, which represent CPU time used
by the programs themselves. The difference between 0.010s and 0.202s is
pretty large (2000%), but the values may still be too small for startup
time etc. to be disregarded.

For 99.999% of the *real* execution time your program was not even
running, but waiting, presumably for a slow output system to write your
strings.

All you did benchmark was that your terminal emulator (or whereever
your stdout goes to) is slow.

I suggest increasing SIZE, redirecting stdout to the bitbucket (or to a
file if you lack a decent terminal), and comparing the C and Ruby
versions again. The difference will be much more visible, even in real
execution times.
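The same user-vs-real split that time(1) reports can be seen from inside Ruby with Benchmark.measure; a small sketch (my own, writing to the null device so output cost stays out of the picture):

```ruby
require 'benchmark'

# utime is CPU time spent in user mode; real is wall-clock time.
tms = Benchmark.measure do
  File.open(File::NULL, 'wb') do |f|
    10_000.times { |i| f << i.to_s << "\n" }
  end
end

puts format('user: %.3fs  real: %.3fs', tms.utime, tms.real)
```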

Jürgen
Minkoo Seo (pool007)
on 2006-08-24 15:05
(Received via mailing list)
Well put. I totally agree with you.

I'd like to mention that you can use inline for this purpose.
Have a look at
http://on-ruby.blogspot.com/2006/07/rubyinline-mak...

In the article, Ruby and inlined-C versions of a prime-number check are
compared. As expected, the inlined C runs much faster than the Ruby-only
version.

I also have experience porting to C++. My application, which relied
heavily on matrix and log functions, took more than 2 days to finish.
After I ported it to C++, it took about 6 hours. Yes, it was an amazing
performance boost.

Sincerely,
Minkoo Seo
Jordan Callicoat (monkeesage)
on 2006-08-26 16:23
Jürgen Strobel wrote:
> On Mon, Aug 21, 2006 at 03:55:10PM +0900, MonkeeSage@gmail.com wrote:
>> Ok, I need to preface this by saying that I'm in no way either a C or
>> ruby guru, so I may have missed something here, but this is what I'm
>> seeing. The bottle-neck here is in the printing to screen.
>>
>> I don't see how the OP got the code to run in 5 seconds while using any
>> stdout writing function (e.g., printf). The program should print like 4
>> megs of text to the screen (though I couldn't get the C that was posted
>> to compile--like I said, I'm no C superstar) -- there's no way that is
>> happening in 5 seconds.
>
> First, stdout doesn't neccesarily write to your screen, and /dev/null
> is very very fast. Even if writing to some kind of terminal emulator,
> some of them are much faster with bulk output than others.

Ok, fair enough. I made the assumption that the testing was done on a
terminal emulator since the perl execution time was given as 12 minutes
and the faster version of the ruby algorithm writing to a file rather
than stdout took only 17 seconds, whereas it took 26 minutes writing to
stdout--I assumed that the perl version would be comparable with file
writing rather than writing to standard out. I used xterm.

> All you did benchmark was that your terminal emulator (or whereever
> your stdout goes to) is slow.

Yes! That's exactly what I was trying to do! I was trying to show that
writing to stdout (assuming a comparable context, like an xterm) is the
bottle-neck of the ruby version here. Writing to a file was about 90
times faster. Like I said, I wasn't denying that C is faster; my point
was this: the C version may only take 5 seconds to write to a *file*,
but ruby only takes 17 seconds; and for me, as I'm sure for others, 12
seconds is not worth the effort of writing it in C. So the 12 minutes to
5 seconds comparison was flawed -- 17 seconds to 5 seconds is more
accurate.

Regards,
Jordan
Jürgen Strobel (Guest)
on 2006-08-29 04:37
(Received via mailing list)
On Sat, Aug 26, 2006 at 11:23:02PM +0900, Jordan Callicoat wrote:
> Jürgen Strobel wrote:

> Yes! That's exactly what I was trying to do! I was trying to show that
> writing to stdout (assuming a comparable context, like an xterm) is the
> bottle-neck of the ruby version here. Writing to a file was about 90
> times faster. Like I said, I wasn't denying that C is faster; my point
> was this: the C version may only take 5 seconds to write to a *file*,
> but ruby only takes 17 seconds; and for me, as I'm sure for others, 12
> seconds is not worth the effort of writing it in C. So the 12 minutes to
> 5 seconds comparison was flawed -- 17 seconds to 5 seconds is more
> accurate.

Usually you don't benchmark with external arbitrary bottlenecks
enabled. The point is, *your* xterm may be of very different speed
than mine. Generic terminal emulators are not a comparable context at
all, not without benchmarking them themselves, which is not the point
of this exercise.

I should have emphasized *your terminal emulator*. We are comparing
Ruby vs. C, so it makes sense to assume a fast and uniform output
mechanism which doesn't get in our way. It is standard practice to
write to stdout, but to redirect it to /dev/null or a file when writing
larger quantities in benchmarks.

To get real numbers, your C version took 0.010s CPU time, and your
ruby one took 0.202s. A difference of roughly 2000% may well be
important to several applications. Sometimes it may be completely
irrelevant, for instance if CPU time is dwarfed by I/O.

-Jürgen
Jordan Callicoat (monkeesage)
on 2006-08-29 09:39
Jürgen Strobel wrote:
> On Sat, Aug 26, 2006 at 11:23:02PM +0900, Jordan Callicoat wrote:
>> Jürgen Strobel wrote:
>
>> Yes! That's exactly what I was trying to do! I was trying to show that
>> writing to stdout (assuming a comparable context, like an xterm) is the
>> bottle-neck of the ruby version here. Writing to a file was about 90
>> times faster. Like I said, I wasn't denying that C is faster; my point
>> was this: the C version may only take 5 seconds to write to a *file*,
>> but ruby only takes 17 seconds; and for me, as I'm sure for others, 12
>> seconds is not worth the effort of writing it in C. So the 12 minutes to
>> 5 seconds comparison was flawed -- 17 seconds to 5 seconds is more
>> accurate.
>
> Usually you don't benchmark with external arbitrary bottlenecks
> enabled. The point is, *your* xterm may be of very different speed
> than mine. Generic terminal emulators are not a comparable context at
> all, not without benchmarking them themselves, which is not the point
> of this exercise.
>
> I should have emphasized *your terminal emulator*. We are comparing
> ruby vs. C, so it makes sense to assume a fast and uniform output
> mechanism which doesn't get in our way. It is standard practice to
> write to stdout, but redirect it to /dev/null or files if writing
> larger quantities in benchmarks.
>
> To get real numbers, your C version took 0.010s CPU time, and your
> ruby one took 0.202s. A difference of roughly 2000% may well be
> important to several applications. Sometimes it may be completely
> irrelevant, for instance if CPU time is dwarfed by I/O.
>
> -Jürgen

I understand your point, but I think we may be talking past each other
here. My purpose was to show that the ruby (and presumably perl)
bottle-neck was due to other factors than the interpreter. Granted, ruby is
slower than C -- even a lot slower (2000%!) -- but the end result of
writing to a *file* is 17 seconds in ruby. That is light years removed
from 26 (or even 12) minutes (i.e., writing to stdout with no
redirection). So while 12 minutes versus 5 seconds may make you want to
write it in C; 17 seconds versus 5 seconds may make you think twice
about it. My question is, when is C a *necessary* evil, rather than just
an evil. ;)

Regards,
Jordan