Intensive computing: Ruby? Ruby/C? Pure C++?

Shot_SPiotr_S.S · January 17, 2007, 11:00am

Hello, ruby-talk. I have a question about choosing between Ruby, Ruby
with C extensions (InlineRuby, ruby2c, etc.) and pure C++. I’ll try to
keep it short and to the point; thanks in advance for any answers, and
of course I’m more than happy to clarify/extend the below if needed.

Background: I’m a PhD student at Warsaw University of Technology and
I need to write a project that’s going to be fairly intensive in both
computing and memory usage (symbolic functional decomposition of FSMs
for FPGA implementation); in particular, the project will implement
a lot of set operations and graph colouring. I like Ruby a lot, and
the project has to work first, but ultimately it has to be efficient.
I don’t know a lot about C/C++ other than the basic classes
I took years ago (i.e., I’ll have to learn C/C++ from scratch anyway).

The question: in my case, which approach seems to make most sense?

1. Pure Ruby. This will most probably not be an option; my gut
   feeling is that pure Ruby will end up being way too slow and
   memory-inefficient. Still, definitely the fun approach, and actually
   maybe the fastest prototyping.
2. Ruby with C extensions. Code in Ruby first, then optimise the slow
   parts one after another. This is tempting (mmm, Ruby…), but I’m
   not sure how viable; if I’m going to have to rewrite 90% in C anyway,
   I might as well start with C++ (again, only a gut feeling).
3. Pure C++. I have to learn C/C++ anyway almost certainly (for point
   2 above), so maybe trying to have fun and going with Ruby is not the
   best way this time, and I should simply install CDT, stock on C++
   tutorials and be a brave, if sad, person?

Note: My supervisor doesn’t know Ruby one bit and writes everything in
C++, so he would not be too happy with 1 and 2; still, it’s my PhD, and
he’s an open-minded person, so if I’m sure about either 1 or 2 being
viable I should be able to go with Ruby.

Thanks a lot for your time and in advance for any responses.

– Shot

Shot_SPiotr_S.S · January 17, 2007, 11:47am

Thanks a lot for your time and in advance for any responses.

I’d go with prototype in Ruby, probably.

You could jump into C++, and I think you may find the Boost
libraries useful for the work you’re doing (there is a graph library in there,
I’m sure), but I honestly think you’ll suffer if you’re not a C++ guru,
particularly if you’re using template-based libraries (others will
probably differ, but my feeling is that using C++ well is very hard, and I’m
paid to do it ;-).

After prototyping, a fourth way that I would strongly suggest in this
case is to consider a functional language if you need speed. That may
sound crazy, but OCaml [1] (specifically) performs comparably to (half as
fast as?) hand-tuned C on benchmarks [2] (lies, damned lies and
statistics?). It’s almost certainly very well suited to the kind of work
you’re talking about (symbolic manipulation), and your brain may enjoy
molding to the mindset if it’s already got Ruby nestled in there.

If the prototype is going to be a large lump of work, you might even
want to consider starting out with the functional language
(particularly if you can find example code for a similar looking
problem).

All the best!

Cheers,
Benjohn

[1] OCaml
[2] http://www.cs.ubc.ca/~murphyk/Software/Ocaml/why_ocaml.html

Shot_SPiotr_S.S · January 17, 2007, 11:58am

P.S. You may also find that there are languages or tools out there that
are very well focused to your domain, in which case I’d very strongly
advise you to make use of them if at all possible. Is your PhD about
writing code to solve this problem; or about exploring the complexity
and implications of algorithms; or even about actually making use of the
algorithm? It all depends on what your goal is, really.

Shot_SPiotr_S.S · January 17, 2007, 4:30pm

Shot (Piotr S.) wrote:

the project has to work first, but ultimately it has to be efficient
in the end. I don’t know a lot about C/C++ other than the basic classes
I took years ago (i.e., I’ll have to learn C/C++ from scratch anyway).

Are you allowed to use existing open source C/C++ libraries for the
low-level operations, or do you need to implement the whole thing
yourself? I’m sure there are open source libraries, or at least
“academic licensed” libraries that you, being a PhD student, would be
able to use. Unless your advisor wants you to learn how to implement all
this low-level stuff, I’d recommend a hybrid approach – find some
libraries that do the hard stuff and wrap them in Ruby using SWIG.

The question: in my case, which approach seems to make most sense?

Pure Ruby. This will most probably not be an option; my gut
feeling is that pure Ruby will end up being way too slow and
memory-inefficient. Still, definitely the fun approach, and actually
maybe the fastest prototyping.

Do you know Lisp or Scheme? For this type of application, prototyping in
one of them might well be faster than in Ruby. And quite a few Scheme
systems are pretty close to the bare metal / C code, so you’d be able to
do the optimization.

Ruby with C extensions. Code in Ruby first, then optimise the slow
parts one after another. This is tempting (mmm, Ruby…), but I’m
not sure how viable; if I’m going to have to rewrite 90% in C anyway,
I might as well start with C++ (again, only a gut feeling).

This is probably where you’ll end up. However, as a general rule,
perhaps 20 percent of the code will need to be in C or C++, not 90
percent. And as I noted above, if you’re allowed to use existing C/C++
libraries, you might not have to write anything except SWIG .i
(interface definition) files.

Pure C++. I have to learn C/C++ anyway almost certainly (for point
2 above), so maybe trying to have fun and going with Ruby is not the
best way this time, and I should simply install CDT, stock on C++
tutorials and be a brave, if sad, person?

It probably won’t come to this, but if it does, well, C++ programming is
a very marketable skill once you have your PhD.

–
M. Edward (Ed) Borasky, FBG, AB, PTA, PGS, MS, MNLP, NST, ACMC(P)
http://borasky-research.blogspot.com/

If God had meant for carrots to be eaten cooked, He would have given
rabbits fire.

Shot_SPiotr_S.S · January 17, 2007, 6:48pm

On Jan 17, 2007, at 2:00 AM, Shot (Piotr S.) wrote:

Background: I’m a PhD student at Warsaw University of Technology and
I need to write a project that’s going to be fairly intensive in both
computing and memory usage (symbolic functional decomposition of FSMs
for FPGA implementation); […]

Some work has been done in this area to some extent. Check out RHDL
from Phil T… I believe it is pure ruby only.

The question: in my case, which approach seems to make most sense?

Pure Ruby. This will most probably not be an option; my gut
feeling is that pure Ruby will end up being way too slow and
memory-inefficient. Still, definitely the fun approach, and actually
maybe the fastest prototyping.

Gut feelings are often wrong. Especially in this arena. You can also
take advantage of very easy parallelism available through systems
like rinda.

Ruby with C extensions. Code in Ruby first, then optimise the slow
parts one after another. This is tempting (mmm, Ruby…), but I’m
not sure how viable; if I’m going to have to rewrite 90% in C anyway,
I might as well start with C++ (again, only a gut feeling).

This is the approach I choose most of the time. Just Do It. If it is
slow, make the slow bits faster. It is exactly why RubyInline exists
in the first place. Just make sure you do good profiling (use shugo’s
rubyprof or my zenprofiler, soon to be repackaged separately. Do NOT
use the native one as anything more than a cursory glance, especially
if your code does a ton of small method calls).
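A minimal sketch of the "measure first, then fix the slow bits" idea, using only core Ruby. This is a crude stopwatch, not a profiler, and it is no substitute for the tools named above; the helper `timed` and the two `common_count_*` methods are hypothetical names made up for this illustration.

```ruby
require 'set'

# Run a block and return [result, elapsed_seconds], timed with the
# monotonic clock so wall-clock adjustments do not skew the figure.
def timed
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  [result, Process.clock_gettime(Process::CLOCK_MONOTONIC) - start]
end

# Naive version: Array#include? scans b for every element of a.
def common_count_array(a, b)
  a.count { |x| b.include?(x) }
end

# Same answer, but membership tests go through a hash-backed Set.
def common_count_set(a, b)
  lookup = b.to_set
  a.count { |x| lookup.include?(x) }
end

a = (1..2_000).to_a
b = (1_000..3_000).to_a

slow, slow_time = timed { common_count_array(a, b) }
fast, fast_time = timed { common_count_set(a, b) }

puts "array: #{slow} common elements in #{slow_time.round(4)} s"
puts "set:   #{fast} common elements in #{fast_time.round(4)} s"
```

Both versions return the same count; only the timings differ. Once a measurement like this points at a hot spot, that single method is the candidate for a rewrite in C.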

Pure C++. I have to learn C/C++ anyway almost certainly (for point
2 above), so maybe trying to have fun and going with Ruby is not the
best way this time, and I should simply install CDT, stock on C++
tutorials and be a brave, if sad, person?

Blech. Ptooey! I don’t even consider this an option anymore.

Shot_SPiotr_S.S · January 17, 2007, 7:05pm

Thanks a lot for taking the time to reply, Benjohn!

Benjohn:

I’d go with prototype in Ruby, probably.

And then…? I hoped a bit for the dream solution of “actually, dear
professors, it works - if slow - for the scope of this thesis, and
further speed improvements can be achieved by replacing more parts
with C/C++ code”.

You could jump into C++, and I think you may find the Boost
libraries useful for the work you’re doing (there is a graph library in
there, I’m sure)

Yeah, I heard good things about Boost, and also that wrapping one’s mind
around C++ templating is a requirement these days. Of which, the latter
scares me (and I seem to know my limits quite well)…

but I honestly think you’ll suffer if you’re not a c++ Guru

My thoughts exactly. The catch is that my supervisor has coded in C++ all his
life while having no idea about scripting languages, so he doesn’t
really seem to easily buy my ‘I’ll code both faster and be way more
happy with Ruby/OCaml, and the finished libraries will be more-or-less
universal anyway.’

After prototyping, a fourth way that I would strongly suggest in
this case, is to consider a functional language if you need speed.

The catch with going OCaml (where did today go? all these
easily-googlable tutorials and opinions were so nice to read…)
is that I’ll have to learn yet another language (and, more importantly,
design paradigm).

What I forgot to mention, my supervisor suggests I could re-use a lot of
his C++ base classes; I wonder how well (objective) C++ integrates with
Ruby/OCaml (if at all).

It’s almost certainly very well suited to the kind of work you’re
talking about (symbolic manipulation), and your brain may enjoy
molding to the mind set if it’s already got Ruby nestled in there.

I’ll give it an afternoon or two, thanks a lot for the suggestion! The
catch is that my supervisor’s brain would also have to mold accordingly,
and that might be the tricky part for a C++-only person (‘functional
programming? in 2007?! can’t you just code in C++ like all the normal
people?’).

P.S. You may also find that there are languages or tools out there
that are very well focused to your domain, in which case I’d very
strongly advise you to make use of them if at all possible.

Definitely. Part of the problem is that I’m not yet
sure which domains the final algorithm will cover…

Is your PhD about writing code to solve this problem; or about
exploring the complexity and implications of algorithms; or even
about actually making use of the algorithm?

It’s about (a) coming up with an algorithm (so far it seems it’ll
be mostly based on blanket algebra, so set-based operations) and
(b) implementing the algorithm in any way I see fit (although one
that’s graspable and reusable by others is expected, so I’m a bit
afraid of going with OCaml here; still, any non-C++ solution will be
a slap to the face, I guess).

– Shot

Shot_SPiotr_S.S · January 17, 2007, 7:45pm

Thanks a lot for your reply, Ed!

M. Edward (Ed) Borasky:

Are you allowed to use existing open source C/C++ libraries for the
low-level operations, or do you need to implement the whole thing
yourself?

I’m allowed to use anything I want. My supervisor created a couple
of C++ libraries that he thinks I might reuse (but I don’t have to).

I’d recommend a hybrid approach – find some libraries
that do the hard stuff and wrap them in Ruby using SWIG.

Thanks for pointing out SWIG (I’m still a total newbie in the whole
Ruby/OCaml <-> C/C++ interface possibilities, unfortunately; I should
read a lot more on this, but have to make some decisions this weekend…).

I’ll do all the background reading, of course, but a quick question:
assuming that I have some (objective) C++ classes already, how
easy would it be to use them in Ruby/OCaml? (So far I thought the
C integration was doable, but not necessarily C++, hence my a-bit-blind
shooting-in-the-fog questions.)

Do you know Lisp or Scheme?

Neither, unfortunately. Learning Lisp has been on my to-do-after-hours list
for a long time already, but it turns out having a full-time IT job next
to the PhD studies means the rare free time I have is best spent away
from the computer, unless I want to lose the rest of my social skills.

I’d love to finally have an excuse to learn Lisp, but I’m afraid I can
only choose between bringing my Ruby skills up a notch and learning
enough C/C++ to fill in the gaps, or committing myself full-time to C++.
(That is, until Benjohn pointed out OCaml; hmmm…)

Shot (Piotr S.) wrote:

Ruby with C extensions.

This is probably where you’ll end up. However, as a general rule,
perhaps 20 percent of the code will need to be in C or C++, not
90 percent.

Hmmm, that’s tempting. Ruby and me somehow click together nicely, in the
‘hey, after two months I still know what I thought writing these!’ way
that doesn’t seem to happen so much with other languages.

And as I noted above, if you’re allowed to use existing C/C++
libraries, you might not have to write anything except SWIG .i
(interface definition) files.

Tempting, again. It seems I’ll read up on SWIG a bit this weekend.

Pure C++

It probably won’t come to this, but if it does, well, C++
programming is a very marketable skill once you have your PhD.

Erm, ye-ah, but… My best job so far is the (current) CiviCRM
web development (PHP), and while working with Ruby on Rails instead
would definitely be an improvement, working with C++… well, wouldn’t,
to say the least.

(Or maybe I’m just too stupid for the whole mess of header files,
makefiles, compilations, Windows/Linux almost-portability and cryptic
runtime errors of everything I ever wrote in C/C++. Or, hopefully,
simply too spoiled by Ruby.)

– Shot

Shot_SPiotr_S.S · January 17, 2007, 8:02pm

On Thu, 18 Jan 2007, Shot (Piotr S.) wrote:

I’d recommend a hybrid approach – find some libraries
that do the hard stuff and wrap them in Ruby using SWIG.

Thanks for pointing out SWIG (I’m still a total newbie in the whole
Ruby/OCaml ↔ C/C++ interface possibilities, unfortunately; I should
read a lot more on this, but have to make some decisions this weekend…).

http://sciruby.codeforpeople.com/sr.cgi/RubyRamblings/Ramblings_0
http://sciruby.codeforpeople.com/sr.cgi/ProjectIdeas/RubyOCaml

Tempting, again. It seems I’ll read up on SWIG a bit this weekend.

don’t forget about ruby/dl. you can simply call c functions directly with
zero glue code that way.

-a

Shot_SPiotr_S.S · January 18, 2007, 7:10am

William J. wrote:

with C/C++ code".

easily-googlable tutorials and opinions were so nice to read…)
is that I’ll have to learn yet another language (and, more importantly,
design paradigm).

LuaJIT probably won’t be as fast as OCaml, but I think Lua
is easier for a Rubyist to learn.

Comparing Ruby to Lua and LuaJIT for intensively computing primes:

Ruby 245.763 seconds
Lua 10.685 seconds
LuaJIT 1.311 seconds

# the Ruby program

def prime(n)
  if n > 2 and n % 2 == 0
    return false
  end
  3.step( Math.sqrt(n).floor, 2){|i|
    if n % i == 0
      return false
    end
  }
  true
end

puts prime(2)
puts prime(4)
puts prime(7)
puts prime(25)
puts '----- the rest are primes'
time = Time.now
puts prime(1073676287)
puts prime(68718952447)
puts prime(87178291199)
puts prime(274877906899)
puts prime(549755813881)
puts prime(1099511627689)
puts prime(2199023255531)
puts prime(4398046511093)
puts prime(8796093022151)
puts prime(17592186044399)
puts prime(953467954114363)
puts Time.now - time

-- the Lua program
function prime(n)
  if n > 2 and n % 2 == 0 then
    return false
  end
  for i = 3, math.floor(math.sqrt(n)), 2 do
    if n % i == 0 then
      return false
    end
  end
  return true
end

print(prime(2))
print(prime(4))
print(prime(7))
print(prime(25))
print('----- the rest are primes')
time = os.clock()
print(prime(1073676287))
print(prime(68718952447))
print(prime(87178291199))
print(prime(274877906899))
print(prime(549755813881))
print(prime(1099511627689))
print(prime(2199023255531))
print(prime(4398046511093))
print(prime(8796093022151))
print(prime(17592186044399))
print(prime(953467954114363))
print(os.clock() - time)

Shot_SPiotr_S.S · January 18, 2007, 1:30pm

It sounds like you need to know C++, if only for political reasons.
This being so, buy a reasonable intro book now, and, most importantly,
Scott Meyers’s “Effective C++”. You’ll be doomed without it!

Shot_SPiotr_S.S · January 17, 2007, 8:06pm

Shot (Piotr S.) wrote:

My thoughts exactly. The catch is my supervisor codes in C++ all his
is that I’ll have to learn yet another language (and, more importantly,
design paradigm).

LuaJIT probably won’t be as fast as OCaml, but I think Lua
is easier for a Rubyist to learn.

Shot_SPiotr_S.S · January 18, 2007, 6:46pm

William J. wrote:

print(prime(274877906899))
print(prime(549755813881))
print(prime(1099511627689))
print(prime(2199023255531))
print(prime(4398046511093))
print(prime(8796093022151))
print(prime(17592186044399))
print(prime(953467954114363))
print(os.clock() - time)

A little less crude:

function report_primality( tbl )
  for _, n in ipairs( tbl ) do
    print( prime( n ) )
  end
end

report_primality( { 2, 4, 7, 25 } )

print( '----- the rest are primes' )
time = os.clock()

report_primality( {
  1073676287,
  68718952447,
  87178291199,
  274877906899,
  549755813881,
  1099511627689,
  2199023255531,
  4398046511093,
  8796093022151,
  17592186044399,
  953467954114363 } )

print(os.clock() - time)

Ruby 87.781 seconds
Lua 4.188 seconds
LuaJIT 0.718 seconds
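For comparison, the same table-driven driver might look like this in Ruby. This is only a sketch: the names `prime?` and `report_primality` are chosen here for illustration, `Integer.sqrt` (Ruby 2.5 and later) stands in for `Math.sqrt(n).floor`, the list is shortened to the first three benchmark numbers to keep the run brief, and, unlike the original, anything below 2 is reported as not prime.

```ruby
# Trial division over odd candidates up to the integer square root.
def prime?(n)
  return false if n < 2
  return n == 2 if n.even?
  3.step(Integer.sqrt(n), 2) { |i| return false if (n % i).zero? }
  true
end

# Print the primality of every number in the list, one per line.
def report_primality(numbers)
  numbers.each { |n| puts prime?(n) }
end

report_primality([2, 4, 7, 25])

puts '----- the rest are primes'
start = Process.clock_gettime(Process::CLOCK_MONOTONIC)

# Shortened list: only the first three numbers from the benchmark above.
report_primality([1073676287, 68718952447, 87178291199])

puts Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
```

Appending the remaining numbers to the second list reproduces the full benchmark run.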

Shot_SPiotr_S.S · January 18, 2007, 7:23pm

On Jan 17, 2007, at 10:10 PM, William J. wrote:

end
3.step( Math.sqrt(n).floor, 2){|i|
if n % i == 0
return false
end
}
true
end

I wrote mine a bit differently:

def prime(n)
  return false if n > 2 and n % 2 == 0
  max = sqrt(n)  # bare sqrt assumes "include Math"; otherwise use Math.sqrt(n)
  3.upto(max) do |i|
    return false if i % 2 != 0 and n % i == 0
  end
  return true
end

optimize :prime

now, optimize converts it to C which doesn’t have the benefits of big
numerics built in, but that doesn’t sound like a problem for the OP.
Here are my times:

pure ruby: real 0m1.139s user 0m1.110s sys 0m0.009s
opt ruby: real 0m0.259s user 0m0.201s sys 0m0.056s

You can stay in ruby-land and eat your cake too.

Shot_SPiotr_S.S · January 19, 2007, 4:30pm

Ryan D. wrote:

if n > 2 and n % 2 == 0
I wrote mine a bit different:
optimize :prime

now, optimize converts it to C which doesn’t have the benefits of big
numerics built in, but that doesn’t sound like a problem for the OP.
Here are my times:

pure ruby: real 0m1.139s user 0m1.110s sys 0m0.009s
opt ruby: real 0m0.259s user 0m0.201s sys 0m0.056s

You’re saying that pure Ruby took less than 2 seconds?
I don’t see how that is possible. On the faster of the two
computers at my disposal your version ran in 207 seconds.

Shot_SPiotr_S.S · January 19, 2007, 4:30pm

On 1/18/07, Ryan D. wrote:

the Ruby program

end
end

optimize :prime

Where does this call to optimize come from?

now, optimize converts it to C which doesn’t have the benefits of big
numerics built in, but that doesn’t sound like a problem for the OP.
Here are my times:

Am I the only one feeling like they’re missing out on something
interesting?

pure ruby: real 0m1.139s user 0m1.110s sys 0m0.009s

opt ruby: real 0m0.259s user 0m0.201s sys 0m0.056s

You can stay in ruby-land and eat your cake too.

Thanks,
Michael G.

Shot_SPiotr_S.S · January 19, 2007, 7:18pm

Thanks a lot for your reply, Ryan!

// Also, so as not to pollute the list with separate mails: thanks to
// everybody else who commented in this thread! I learned more in the
// past two days than I hoped to learn 'till the end of January!
// Unfortunately, it might mean I’ll come back with more questions…

Ryan D.:

On Jan 17, 2007, at 2:00 AM, Shot (Piotr S.) wrote:

symbolic functional decomposition of FSMs for FPGA implementation

Some work has been done in this area to some extent. Check
out RHDL from Phil T… I believe it is pure ruby only.

Thanks, again! I’m already eager to drill into his Ruby-interpretation
of an FSM, and it’s not even weekend here yet…

my gut feeling is that pure Ruby will end up
being way too slow and memory-inefficient.

Gut feelings are often wrong. Especially in this arena.

I hope this one is. Still, my supervisor’s C++ classes are quite
optimised, and it’s not rare for a decomposition to run for two days
straight; hopefully the benchmarks needed to prove the scientific value
of my thesis are of the less-time-consuming kind.

You can also take advantage of very easy
parallelism available through systems like rinda.

Thanks for pointing out Rinda, I’ll definitely take a closer look.
Unfortunately, decomposition is a highly iterative process, so I doubt
it’s at all parallelizable (some parts of a single iteration might be,
though).

Ruby with C extensions.

This is the approach I choose most of the time. Just Do It.

I talked with my supervisor, and he said that if going with Ruby means
I’ll show up with anything that actually works even just a bit sooner,
then he’s all for it.

Thanks a lot for pointers to rubyprof and zenprofiler!

– Shot, currently in the ‘Ruby or Lua? Ruby or Lua?’ mode…

Shot_SPiotr_S.S · January 19, 2007, 7:36pm

On Sat, 20 Jan 2007, Shot (Piotr S.) wrote:

Thanks for pointing out Rinda, I’ll definitely take a closer look.
Unfortunately, decomposition is a highly iterative process, so I doubt
it’s at all parallelizable (some parts of a single iteration might be,
though).

if you can make it parallel you may also want to check out ruby queue

Linux Clustering with Ruby Queue: Small Is Beautiful | Linux Journal
http://www.codeforpeople.com/lib/ruby/rq/

-a

Shot_SPiotr_S.S · April 23, 2007, 10:00pm

Shot,

So far the discussion has been very rational and focused on technical
merits. It’s also worth considering enlightened self-interest:

If you are planning on working in quantitative finance then you
should use C++ because that is the language of choice for
quantitative finance. If you expect to work in non-telecoms IT then
you should use Java because that will be more valuable.

If neither is an issue you should certainly use Ruby because it will be
more productive - the time you spend tuning Ruby is probably less
than the time you would spend waiting around for C++ builds to finish
and chasing down pointer bugs and mysterious crashes. And if the time
isn’t less it will at the very least be more fun!

-Peter

On Jan 19, 2007, at 1:17 PM, Shot (Piotr S.) wrote:

On Jan 17, 2007, at 2:00 AM, Shot (Piotr S.) wrote:

being way too slow and memory-inefficient.
parallelism available through systems like rinda.
I talked with my supervisor, and he said that if going with Ruby means
your foot, your choice" memory model. – jtv, LKML

Peter B.
917 445 5663

Shot_SPiotr_S.S · February 17, 2007, 1:12am

On Jan 18, 2007, at 4:24 PM, Michael G. wrote:

optimize :prime

Where does this call to optimize come from?

From my fingers? I think what you mean to ask is where is optimize.
It is provided by my package ZenHacks in a not terribly usable form.

Am I the only one feeling like they’re missing out on something
interesting?

Yes, everyone else went to rubyconf 2005 and saw it on their own.

Shot_SPiotr_S.S · September 25, 2007, 10:29pm

On Jan 18, 2007, at 11:10 AM, William J. wrote:

end

optimize :prime

You’re saying that pure Ruby took less than 2 seconds?
I don’t see how that is possible. On the faster of the two
computers at my disposal your version ran in 207 seconds.

Yes, on my computer it ran in 2 seconds, but I didn’t have the same
driver as you did because I couldn’t go into the bignum range. So I
did the biggest you had an extra 20 times instead.
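One possible reading of the gap between the two timings, sketched under an assumption the thread does not state: that the C-translated version was limited to the Fixnum range of a 32-bit Ruby build, where Fixnum tops out at 2**30 - 1. The constant and method names below are made up for this illustration. Under that assumption only the first benchmark number stays out of the bignum range, so a run that repeats it does far less work than the full list.

```ruby
# Assumption (not stated in the thread): a 32-bit Ruby build, where
# Fixnum covers -2**30 .. 2**30 - 1 and anything larger is a Bignum.
FIXNUM_MIN_32BIT = -(2**30)
FIXNUM_MAX_32BIT = 2**30 - 1 # 1_073_741_823

# The timed numbers from the benchmark earlier in the thread.
BENCHMARK_NUMBERS = [
  1073676287, 68718952447, 87178291199, 274877906899,
  549755813881, 1099511627689, 2199023255531, 4398046511093,
  8796093022151, 17592186044399, 953467954114363
].freeze

def fits_in_32bit_fixnum?(n)
  n.between?(FIXNUM_MIN_32BIT, FIXNUM_MAX_32BIT)
end

small, big = BENCHMARK_NUMBERS.partition { |n| fits_in_32bit_fixnum?(n) }

puts "fit in a Fixnum: #{small.inspect}"  # prints: fit in a Fixnum: [1073676287]
puts "need Bignum: #{big.size} of #{BENCHMARK_NUMBERS.size}"  # prints: need Bignum: 10 of 11
```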