Automatic code conversion from Ruby to C?

Dear all,

I am looking for a (semi-)automatic conversion of a Ruby script to C or C++, so
that readable source code for a human C/C++ programmer is produced.

The program should read in text from a file as a string, split and search it
using regular expressions, convert some numbers into floats and perform some
calculations on them (the latter part is done using some existing C code).

The text file to be read has this structure:


P1
P2 23
P4 27

P2
P3 1
P5 457
P6 3
P377 56

Thus, P1 is a starting point and the distance to point P2 is 23 etc.
There is a varying number of entries in each of the parts separated by dashes.
Is there some gem that would allow me to write a nice short Ruby script doing
the above-mentioned manipulations and translate it for me into readable C/C++,
such that I’d have to specify as few size and variable type declarations as
possible?
I have about a hundred thousand points and relative distances, so speed
could matter here …
and I’d have a hard time defending the use of a scripting language to the
people I am collaborating with here.

Thank you very much!

Best regards,

Axel

Axel E. wrote:

I am looking for a (semi-)automatic conversion of a Ruby script to C or
C++, so
that readable source code for a human C/C++ programmer is produced.

The program should read in text from a file as a string, split and
search it using regular expressions,
convert some numbers into floats and perform some calculations on them
(the latter part is done using some existing C code).

It sounds like you think that you can use Ruby as a shorthand way of
writing C. Unfortunately it doesn’t work that way.

Bear in mind that the biggest pain with C is to do with memory
allocation. Ruby has a whole run-time environment for creating and
manipulating Ruby objects. So this mechanically-generated C code would
just be calling Ruby libraries to create objects or dispatch methods.

So in that case, why not just write it in Ruby and run it?

The other part you say is that you want to interact with existing C
code. This may be easier than you think. A good starting point is
http://www.ruby-doc.org/docs/ProgrammingRuby/html/ext_ruby.html

Look also at the RubyInline gem, which lets you write C directly inside
your Ruby (yes!)

Is there some gem that would … translate it for
me into
readable C/C++, such that I’d have to specify as few as possible size
and variable
type declarations ?

If it did, it would just be

VALUE foo;
VALUE bar;
VALUE baz;

/* VALUE is a reference to a Ruby object, whether it be an Array, a
String, or an integer */

and I’d have a hard time defending the use of a scripting language with
the people
I am collaborating here.

Prototype it in Ruby. If it works fast enough, then you’ve saved
yourself a lot of work. Even if it’s too slow, you’ll have a more
concrete idea what you’re trying to do and how to achieve it.

Reading 100,000 lines in Ruby doesn’t take very long.
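
One way to check this on your own machine is to time a parse of synthetic data. The following is only a rough sketch: the data layout (blocks of one source point and five entries, each block closed by a line of dashes) and the helper names `synthetic_input` and `count_entries` are assumptions made for illustration, not the real data set.

```ruby
require 'benchmark'
require 'stringio'

# Build synthetic input: each block is one source point, five
# "point distance" lines and a closing line of dashes.
# Layout and sizes are assumptions made for this sketch.
def synthetic_input(blocks, per_block = 5)
  out = String.new
  blocks.times do |i|
    out << "P#{i}\n"
    per_block.times { |j| out << "P#{i + j + 1} #{(i * 7 + j) % 500}\n" }
    out << "-----\n"
  end
  out
end

# Walk the input line by line and count the distance entries.
def count_entries(io)
  src_point = nil
  entries = 0
  while (line = io.gets)
    line.chomp!
    if line =~ /^-/
      src_point = nil
    elsif src_point.nil?
      src_point = line
    elsif line =~ /^(\S+)\s*(\d+)$/
      entries += 1
    end
  end
  entries
end

data = synthetic_input(20_000)   # 20,000 blocks of 7 lines = 140,000 lines
entries = 0
elapsed = Benchmark.realtime { entries = count_entries(StringIO.new(data)) }
puts "parsed #{entries} entries in #{format('%.3f', elapsed)} s"
```

The timing depends on the machine and the Ruby version, so no figure is quoted here; the point is that the measurement takes a few lines to set up.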

Another option would be to write a Ruby script to “normalise” your input
into a format which is easier for your C program to read. For example,
it could output

point1 point2 distance

tuples. This would probably be sufficient:

src_point = nil
while line = $stdin.gets
  line.chomp!
  if line =~ /^-/
    # a line of dashes ends the current block
    src_point = nil
  elsif src_point.nil?
    # first line of a block names the source point
    src_point = line
  elsif line =~ /^(\S+)\s*(\d+)$/
    puts "#{src_point} #{$1} #{$2}"
  else
    STDERR.puts "Invalid line: #{line.inspect}"
  end
end

Output:

P1 P2 23
P1 P4 27
P2 P3 1
P2 P5 457
P2 P6 3
P2 P377 56

Your C program would still need to build a graph data structure from
this though. There are probably existing C libraries to work on data
sets like this, if you hunt hard enough.

Why don’t you just use Bison/Flex for that task?

As far as I know, most Ruby parser generators compile to Ruby (for
example DHAKA).
The resulting (compiled) state machine is not very readable.

And even if you write the parser by hand, I don’t have high hopes that
a translation to “readable” C code is easily possible.[1]

So if speed is an issue, don’t code it by hand but code it with a tool
that is right for the job. That’s neither C nor Ruby.
If speed is not that much of an issue, write your parser with one of the
available libraries and measure whether it fits your needs. That’s the
only way to find out.

Regards,
Florian G.

[1]: What’s that anyway? :wink:

Axel E. wrote:


Hi Axel,

My recommendation would be to ‘prototype’ in Ruby to establish the
correct logic of what you want to achieve and then recode in C/C++ if
it is not fast enough - that’s what we do for a number of things on
our side.

If you don’t use too many Ruby-isms, your Ruby code can look a lot like
the corresponding C code - of course, you’d have to rewrite the string
parsing in C/C++ to suit your needs.

Cheers
Mohit.

2008/10/27 Robert K.:

On 27.10.2008 20:44, Axel E. wrote:

I am looking for a (semi-)automatic conversion of a Ruby script to C or
C++, so
that readable source code for a human C/C++ programmer is produced.

I have about a hundred thousand points and relative distances, so speed
could matter here …
and I’d have a hard time defending the use of a scripting language with
the people
I am collaborating here.

What do you want to do with the data afterwards? Maybe you can just add the
missing code and benchmark (your writing time as well as runtime).

OTOH, maybe Ruby is just not the right tool for the job. IMHO there is
no point in forcing Ruby into this if C or C++ do fit the problem at
hand much better.

Kind regards

robert

On 27.10.2008 20:44, Axel E. wrote:

I am looking for a (semi-)automatic conversion of a Ruby script to C or C++, so
that readable source code for a human C/C++ programmer is produced.

Hm, automatic conversion tends to produce not so readable code.


Is there some gem that would allow me to write a nice short
Ruby script doing the above-mentioned manipulations and translate it for me into
readable C/C++, such that I’d have to specify as few as possible size and variable
type declarations ?

So you basically want a script like this converted:

graph = []
p = nil

ARGF.each do |line|
  line.chomp!
  case line
  when /^-+$/
    p = nil
  when /^P(\d+)$/
    p = $1.to_i
    graph[p] = {}
  when /^P(\d+)\s+(\d+)$/
    graph[p][$1.to_i] = $2.to_i
  else
    raise "Cannot parse: #{line.inspect}"
  end
end

This should be similarly easy with regular expressions in C/C++, or even
when parsing by hand.
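
As a rough check of this logic, here it is wrapped in a method and run on the sample data from the original post. This is a sketch under assumptions: the method name `parse_graph` is made up, a line of dashes is assumed as the block separator, and a Hash is used instead of an Array so that sparse point numbers such as P377 do not create unused slots.

```ruby
require 'stringio'

# Parse "P<n>" header lines and "P<n> <distance>" entry lines into
# a nested Hash: { source => { target => distance } }.
def parse_graph(io)
  graph = {}
  current = nil
  io.each_line do |line|
    line.chomp!
    case line
    when /^-+$/
      current = nil
    when /^P(\d+)$/
      current = $1.to_i
      graph[current] = {}
    when /^P(\d+)\s+(\d+)$/
      raise "Entry before any source point: #{line.inspect}" if current.nil?
      graph[current][$1.to_i] = $2.to_i
    else
      raise "Cannot parse: #{line.inspect}"
    end
  end
  graph
end

# Sample data; the dashed separator line is an assumption.
sample = <<~TEXT
  P1
  P2 23
  P4 27
  -----
  P2
  P3 1
  P5 457
  P6 3
  P377 56
TEXT

graph = parse_graph(StringIO.new(sample))
p graph[1]   # distances from P1
p graph[2]   # distances from P2
```

A C++ version would follow the same three cases, with the nested Hash replaced by whatever map type the existing C code expects.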

I have about a hundred thousand points and relative distances, so speed could matter here …
and I’d have a hard time defending the use of a scripting language with the people
I am collaborating here.

What do you want to do with the data afterwards? Maybe you can just add
the missing code and benchmark (your writing time as well as runtime).

Kind regards

robert

On 28.10.2008 21:00, Axel E. wrote:

I looked at RubyToC and at cplus2ruby, but it seems that I’d spend almost as much time writing
C from scratch as editing the translated code.

Indeed: the problem is simple enough to directly implement it in C* - no
need to push Ruby into this IMHO.

Kind regards

robert

Dear Florian, Brian, Robert and Mohit,

thank you all for your responses.
Actually, the code I’ll have to deliver has to be written in C/C++, since the
customers here want something they can feed into gcc and understand when
they read it.
I am not absolutely sure, but I don’t think they know Ruby very well, and
they probably aren’t willing to consider its use in this problem because of
the speed issue.

I was just wondering whether there’d be some tool that would translate
some of my Ruby
code into C/C++ to speed up the development process.
I looked at RubyToC and at cplus2ruby, but it seems that I’d spend
almost as much time writing C from scratch as editing the translated code.
(This might be due to the fact that I couldn’t find more documentation than
the examples provided, and there doesn’t seem to be a way of translating
Ruby’s Hashes to C or its IO routines etc… with these tools).

Please correct me if the above isn’t true …

Thank you very much again!

Best regards,

Axel

Hi Axel,

Axel E. wrote:

thank you all for your responses.
Actually, the code I’ll have to deliver has to be written in C/C++, since the
customers here want something they can feed into gcc and understand when they read it.
I am not absolutely sure, but I don’t think they know Ruby very well, and they probably
aren’t willing to consider its use in this problem because of the speed issue.

I find that sometimes customers don’t appreciate a new tool/ device -
they like to stay with what they know best… which is fine. My advice
to you would be to prototype in Ruby if there is an element of
uncertainty in the program you’re developing. Once the customer sees it
working, they may be convinced. On the other hand, once it’s already in
Ruby, translating to C/ C++ isn’t that far a step… even if you do it
manually. The only problem is that you need to maintain updates by hand
(or update only one or the other).

I was just wondering whether there’d be some tool that would translate some of my Ruby
code into C/C++ to speed up the development process.

I haven’t used anything that would help.

I looked at RubyToC and at cplus2ruby, but it seems that I’d spend almost as much time writing
C from scratch as editing the translated code. (This might be due to the fact that I couldn’t find more
documentation than the examples provided, and there doesn’t seem to be a way of translating Ruby’s
Hashes to C or its IO routines etc… with these tools).

If you’re using C++, hashes can be translated into STL Maps if you need
it… though anything C/C++ is going to be more strongly typed than the
duck typing in Ruby.

Cheers,
Mohit.
10/29/2008 | 10:47 AM.

2008/10/29 Mohit S.:

like to stay with what they know best… which is fine. My advice to you
would be to prototype in Ruby if there is an element of uncertainty in the
program you’re developing. Once the customer sees it working, they may be
convinced. On the other hand, once it’s already in Ruby, translating to C/
C++ isn’t that far a step… even if you do it manually. The only problem
is that you need to maintain updates by hand (or update only one or the
other).

Given the size of the problem (i.e. small) and the fact that existing
C library code has to be used I would directly do this in C or C++.
There is also the advantage of not having a hybrid solution which
makes certain things easier (building, distribution, code protection
etc.). IMHO implementing the parsing of such a simple language
(“P1…”) in Ruby is not worthwhile. It’s always about
tradeoffs, and forcing Ruby (or any other tool for that matter) into a
project where it does not fit will not bring any advantage (rather the
opposite).

If you’re using C++, hashes can be translated into STL Maps if you need
it… though anything C/C++ is going to be more strongly typed than the duck
typing in Ruby.

Note though that Ruby is as strongly typed as C++ - the
difference is between “static” and “dynamic” typing.
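
A small illustration of that distinction, as a sketch rather than a full treatment: Ruby will not silently coerce an Integer and a String (strong typing), but the mismatch is only detected when the line runs (dynamic typing). The variable names are arbitrary.

```ruby
# Adding a String to an Integer is rejected, but only at run time.
begin
  result = 1 + "2"
rescue TypeError => e
  result = "TypeError: #{e.message}"
end
puts result

# An explicit conversion is required to make the addition work.
sum = 1 + "2".to_i
puts sum
```

A C++ compiler would reject the corresponding mismatch before the program ever runs, which is the static half of the comparison.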

Kind regards

robert

Robert K. wrote:

Note though that Ruby is as strongly typed as C++ - the
difference is between “static” and “dynamic” typing.

Point taken! :slight_smile:

Cheers,
Mohit.
10/29/2008 | 4:42 PM.

Axel E. wrote:

I was just wondering whether there’d be some tool that would translate
some of my Ruby
code into C/C++ to speed up the development process.

If it did, at best it would generate something like this:

/*
str = "hello"
str.upcase!
*/
VALUE str = rb_str_new2("hello");
rb_funcall(str, rb_intern("upcase!"), 0);

That is, it would use entirely Ruby data structures, so the rest of your
C code would have to be written to use Ruby data structures too. So it’s
nothing like hand-written C.

See Programming Ruby: The Pragmatic Programmer's Guide

and there doesn’t seem to be a
way of translating Ruby’s Hashes to C

What’s a “C hash”? Perhaps you mean glib, or STL, or a zillion other
possible C/C++ libraries?

Ruby doesn’t use any of these, and Ruby’s hashes don’t map directly to
any of them (for example, coders can subclass Hash, or override methods
in it; and a Ruby Hash can contain mixed types for both keys and values)
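
To make those two points concrete, here is a minimal sketch. The class name `CaseInsensitiveHash` and the sample keys are invented for illustration.

```ruby
# Keys and values of different classes living in one Hash.
mixed = { 1 => "one", "two" => 2.0, :three => [3], [4, 5] => nil }
mixed.each do |key, value|
  puts "#{key.inspect} (#{key.class}) => #{value.inspect}"
end

# A Hash subclass that overrides lookup and assignment.
class CaseInsensitiveHash < Hash
  def [](key)
    super(key.to_s.downcase)
  end

  def []=(key, value)
    super(key.to_s.downcase, value)
  end
end

h = CaseInsensitiveHash.new
h["Distance"] = 23
puts h["DISTANCE"]
```

A statically typed map such as an STL map fixes one key type and one value type at compile time, so neither half of this example has a direct counterpart there.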

Even simple things like “Integer” don’t map directly to C. In Ruby,
Integers are unbounded size, with automatic conversions from Fixnum to
Bignum. So something simple like

a = 0
loop { a = a + 1; puts a }

is not the same as

long a = 0;
while(1) { a += 1; printf ("%ld\n", a); }
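
The difference shows up at the boundary of the machine word. A short sketch, assuming a platform where a C `long` is 64 bits wide:

```ruby
max_long = 2**63 - 1   # largest signed 64-bit value (assumes a 64-bit long)
a = max_long + 1       # no overflow: Ruby switches to a bigger representation
puts a
puts a.class           # Bignum on older Rubies, Integer on Ruby 2.4 and later
puts 2**100
```

In C, incrementing a signed `long` past its maximum is undefined behaviour, so a faithful translation of the Ruby loop would need an arbitrary-precision integer library.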

Axel E. wrote:

I am not looking for swig or an extension of Ruby by C, but a way of
producing
C/C++ code from a Ruby script containing some magical part that saves
me the trouble of declaring all the variables, assigning memory etc and
all
these things that make writing C/C++ so tedious in comparison with Ruby

That’s one thing that makes Ruby fundamentally different from C/C++. It
has its own big library of useful data structures, and its own typing
system.

Ruby’s highly dynamic typing and method dispatch makes it poorly suited
to automatic code analysis (e.g. type inferencing) which you could use
with, say, Haskell.

Real-life Ruby programs make heavy use of this. Look at ActiveRecord; it
adds all sorts of methods dynamically to a class, depending on the
structure of the database at the time the program is run. What would the
generated C code for this look like?
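
A toy version of that pattern, to show why it resists translation. This is not ActiveRecord's actual implementation; the class `Record`, the method `load_columns` and the column names are all hypothetical.

```ruby
# Hypothetical stand-in for a database-backed model: accessor methods
# are created at run time from a column list that, in a real ORM,
# would come from the database schema.
class Record
  def self.load_columns(columns)
    columns.each do |col|
      define_method(col) { @attributes[col] }
      define_method("#{col}=") { |value| @attributes[col] = value }
    end
  end

  def initialize
    @attributes = {}
  end
end

columns = %w[name distance]   # imagine this was read from a schema
Record.load_columns(columns)

r = Record.new
r.distance = 23
puts r.distance
puts Record.instance_methods(false).sort.inspect
```

The set of methods on `Record` is unknown until `load_columns` runs, so a translator working from the source text alone has nothing to turn into C function declarations.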

Talking about Sexps ignores the distinction between syntax and
semantics. Of course it’s easy to get to

s(:block,
  s(:lasgn, :str, s(:str, "hello")),
  s(:call, s(:lvar, :str), :upcase!, s(:arglist)))

But turning that into idiomatic C raises fundamental questions like
“what’s a String?” (Remember that Ruby strings are 8-bit clean, i.e. can
contain nulls, and are dynamically sized.)

Once you decide that strlen, strcat and friends are unsuitable for the
general case, then you have to make a new data structure for strings,
and implement it in C.
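
A brief sketch of the two string properties mentioned above (embedded nulls and dynamic size); the literal values are arbitrary.

```ruby
s = String.new("abc\0def")   # a NUL byte in the middle is ordinary data
puts s.bytesize              # counts every byte, including the NUL
s << "ghi" * 1000            # grows on demand, no manual allocation
puts s.bytesize
puts s.index("def")          # searching continues past the NUL
```

C's `strlen` would stop at the first NUL byte of the same data, which is why a translated program needs a length-carrying string structure instead of plain `char *`.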

Then you move onto Array where the problem gets worse: in many cases
Ruby arrays are heterogeneous collections, so your generated C code
would have to be able to hold objects of any Ruby type. Sooner or later
it becomes an array of VALUE pointers, which means you are using the
Ruby core library anyway.
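
For instance, a perfectly ordinary Ruby array might look like the one in this sketch (contents invented for illustration):

```ruby
items = [1, "two", 3.0, :four, nil, [5, 6], { seven: 7 }]
items.each { |item| puts "#{item.inspect} is a #{item.class}" }

# Each element answers only the methods of its own class, so a
# translator cannot pick a single C element type for the whole array.
lengths = items.select { |item| item.respond_to?(:length) }.map(&:length)
puts lengths.inspect
```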

*/
VALUE str = rb_str_new2("hello");
rb_funcall(str, rb_intern("upcase!"), 0);

That is, it would use entirely Ruby data structures, so the rest of your
C code would have to be written to use Ruby data structures too. So it’s
nothing like hand-written C.

Dear Brian,

well, the examples in the RubyToC gem as well as the Cplus2ruby gem
suggested that maybe C or C++ code could be produced from Ruby scripts
that are fed into them (for some subset of Ruby). But I have not been
able to translate my Ruby scripts just like that into C using these tools.
I am not looking for swig or an extension of Ruby by C, but a way of
producing C/C++ code from a Ruby script containing some magical part that
saves me the trouble of declaring all the variables, assigning memory etc.
and all these things that make writing C/C++ so tedious in comparison with
Ruby …

For the project I am working on right now, this is not so dramatic.
I just wanted to know if I had missed something that would spare me
all the troubles of C/C++ development :wink:

What’s a “C hash”? Perhaps you mean glib, or STL, or a zillion other
possible C/C++ libraries?
Ruby doesn’t use any of these, and Ruby’s hashes don’t map directly to
any of them (for example, coders can subclass Hash, or override methods
in it; and a Ruby Hash can contain mixed types for both keys and values)

Even simple things like “Integer” don’t map directly to C.

I am of course aware that there are conceptual differences between
languages …
Maybe a conversion tool between programming languages would then need
to be able to capture some of these by analyzing a lot of parallel code
from detailed guidebooks explaining how to program task X “idiomatically”
in both languages.

In automated translation between natural languages, results used to be
poor as long as individual words were translated. There is the famous joke
that a 1950s automated translation of “out of sight, out of mind” into
Russian and back gave “invisible idiot”.

On the other hand, it is possible to summarize texts automatically and
return their gist quite well using statistical methods, such as Latent
Semantic Analysis (there’s the classifier gem in Ruby implementing this for
English texts).
This will count word occurrences in each sentence or paragraph of a text
(after having removed the most frequent ones, which are likely to occur
in any text) and perform some operations on the resulting matrix, so that
sentences can be mapped to some object from linear algebra which allows one
to classify them and introduce distances between them.

If you now have a long text, say, a novel in English, and its (human-made)
translation into Russian, it should be possible to extract from this
training set some correspondence between word groups, words and expressions
in English and those in Russian, including several alternative ways to
express approximately the same idea in either language.

I am wondering whether it’s possible to do something similar between
programming languages and what could be the connection between them
(S-expressions?).
That’s of course more an academic question than a practical one, but if
someone had done it already, and I’d be able to write C in Ruby with an
automatic translator, I’d still like to know :wink:
From all the responses I got, this doesn’t seem to be the case … and
I’d truly expect that to be a huge undertaking as well.

Thank you to all who responded to my question!

Best regards,

Axel

On Thu, 30 Oct 2008, Axel E. wrote:

Axel E. wrote:

I am not looking for swig or an extension of Ruby by C, but a way of producing
C/C++ code from a Ruby script containing some magical part that saves
me the trouble of declaring all the variables, assigning memory etc and all
these things that make writing C/C++ so tedious in comparison with Ruby …

The attached is “dirty washing”: it can be used but it’s not really
fit for polite society, and, well, you get the gist. I’m not
maintaining it at the moment, and it’s rather tailored to what I
needed at the time (so chop out references to fsv_common.h for
example), but if you are wearing thick rubber gloves and a boiler
suit you might like to poke around in it to see if there’s stuff you
can use.

The concept behind it was to reduce declarations, by getting the function
definitions to produce the prototypes, and thereby the associated header
files, so I don’t have to use the “What I tell you three times is true”
approach of C. So it’s essentially a macro processor, but more Rubyesque,
and allows you to compute stuff in Ruby to insert directly. See things
like “Code Generation in Action” and the website
http://www.codegeneration.net/
for things a lot less horrid than this.
It is also based around the idea that a “class” in C works like FILE*:
it’s a pointer to a struct with associated functions that operate on that,
without people having to know how (except those who write such
functions).
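
The attachment itself is not reproduced here, so the following is not that tool. It is only a minimal sketch of the general idea of deriving prototypes from function definitions, under loud assumptions: definitions start in column 0, fit on one line with the opening brace on the same line, and `static` functions stay out of the header. The constant `DEFINITION`, the method `prototypes` and the sample C functions are all invented for illustration.

```ruby
# Matches a simple one-line C function header such as
#   double dist_sum(const double *d, size_t n) {
# Assumption: definitions start in column 0 with "{" on the same line.
DEFINITION = /^([A-Za-z_][\w\s\*]*?[\s\*])(\w+)\s*\(([^)]*)\)\s*\{/

# Return one prototype string per non-static function definition.
def prototypes(c_source)
  c_source.scan(DEFINITION).map do |ret, name, args|
    next if ret.strip.start_with?("static")   # private helpers stay out of the header
    "#{ret.strip} #{name}(#{args.strip});"
  end.compact
end

c_source = <<~C
  static int helper(int x) {
      return x * 2;
  }

  double dist_sum(const double *d, size_t n) {
      double s = 0.0;
      for (size_t i = 0; i < n; i++) s += d[i];
      return s;
  }
C

puts "#ifndef DIST_H", "#define DIST_H"
puts prototypes(c_source)
puts "#endif"
```

A regular expression is far too weak for real C (function pointers, macros, multi-line signatures), which is one reason dedicated code generators exist.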

Even simple things like “Integer” don’t map directly to C.

I am of course aware that there are conceptual differences between languages …
Maybe a conversion tool between programming languages would then need
to be able to capture some of these by analyzing a lot of parallel code from
detailed guidebooks explaining how to program task X “idiomatically” in both
languages.

The only thing that gets near that is the fabled “decompiler”, and I’ve
not seen one myself. I think this translator between computer languages
is going to be “Star Trek technology” for a while yet.

In automated translation between natural languages, results used to be poor
as long as individual words were translated – there is the famous joke that
a 1950s automated translation of “out of sight, out of mind” into Russian and back
gave “invisible idiot”.

I heard it was “invisible insane”. :slight_smile:

On the other hand, it is possible to summarize texts automatically and return their
gist quite well using statistical methods, such as Latent Semantic Analysis (there’s

I’ve played with free software for this recently, and it seems to be
extremely poor quality, usually missing the point entirely. I’d like to be
shown something that works well in this area.

I am wondering whether it’s possible to do something similar between programming languages
and what could be the connection between them (S-expressions?)

Human languages only have to be portable across humans, which are basically
the same platform, with the same types of IO, memory constraints and physical
architecture. Computer languages can vary widely, FORTH, Haskell, Cobol…

Hugh

On Mon, Oct 27, 2008 at 2:44 PM, Axel E. wrote:

I have about a hundred thousand points and relative distances, so speed
could matter here …
and I’d have a hard time defending the use of a scripting language with the
people
I am collaborating here.

Have you tried Ruby2Cextension? It converts Ruby to C code (a Ruby
extension). It handles a fairly large subset of Ruby.

I use this with my Grammar project to generate very fast parsers - my main
benchmark is on par with a pure C parser generator (Ragel). This is about
10X faster than the pure Ruby I generate, but this Ruby is already highly
optimized (already 50X faster than the popular Treetop parser).

I added a few plugins to Ruby2Cextension (in the svn repository) recently
that apply some optimizations (method lookup caching, inlining, direct
self-method calls, instance var lookup caching) that gave a 2X performance
increase. But some of these don’t work as expected if the Ruby code is too
dynamic (i.e. changing a method after it has been called in this compiled
Ruby code).
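
For reference, this is the kind of “too dynamic” code meant here, shown in plain interpreted Ruby. The class `Greeter` is made up, and the sketch only demonstrates the interpreter's behaviour; how a particular compiled extension reacts is not tested here.

```ruby
class Greeter
  def greet
    "hello"
  end
end

g = Greeter.new
first = g.greet

# Reopen the class and replace the method after it has been called.
class Greeter
  def greet
    "goodbye"
  end
end

second = g.greet
puts first, second
```

The interpreter picks up the new definition on the second call. Code that has cached the first lookup would have no reason to look again, which matches the caveat described above.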