Forum: Ruby C DSL anyone?

Brad Phelan (Guest)
on 2007-05-02 18:15
(Received via mailing list)
Just curious,

is anybody working on a C language DSL that could generate C code? I'm
kinda interested in how it may be possible to use RSpec to unit test C
code in a nice way.

My first attempt at a C DSL allows the following example.


require 'cexpression.rb'

code = CTools::CCode.new

code = CTools::CCode.open do

    _decl :a, :ubyte2
    _decl :b, :ubyte2
    _decl :c, :ubyte2

    _typedef :Point do
       _struct do
          _decl :x, :ubyte4
          _decl :y, :ubyte2
          _decl :z, :ubyte1
       end
    end

    _decl :point, :Point

    _for(a.assign(0), a < 1, a.inc(1)) do
       _while(b < 1) do
          _for(c.assign(0), c < 1, c.inc(1)) do
             _let b, b + 1
             _printf "%d %d %d", point.x, point.y, point.z
          end
       end
    end
end

puts code

# generates

ubyte2 a;
ubyte2 b;
ubyte2 c;
typedef struct {
    ubyte4 x;
    ubyte2 y;
    ubyte1 z;
}Point;
Point point;
for (a = 0; a < 1; a += 1){
    while (b < 1){
       for (c = 0; c < 1; c += 1){
          b = b + 1;
          printf("%d %d %d", point.x, point.y, point.z );
       }
    }
}


===========================
library cexpression.rb is quite simple
===========================

module CTools

    # Indent a block of C code

    def CTools.c_indent(text)
       # Pool of spaces; sliced below as indent_text[1..indent]
       indent_text = " " * 81
       indent = 0;
       cont = false;
       out_text = []
       text.each_line do |line|
          line.gsub!(/^\s*/,"")
          if line =~ /\{.*\}/
             line.gsub!(/^/,indent_text[1..indent])
          else
             if line =~ /\}/
                indent = indent - 3
             end
             line.gsub!(/^/,indent_text[1..indent])
             if line =~ /\{/
                indent = indent + 3
             end
             # Convert "/**/" into blank lines
             line.gsub!(/^\s*\/\*\*\//,'')
          end
          # Indent on unmatched round brackets
          indent = indent + (  line.count("(") - line.count(")") ) * 3
          # Indent on backslash continuation
          if cont
             if line !~ /\\$/
                indent = indent - 6
                cont = false
             end
          else
             if line =~ /\\$/
                indent = indent + 6
                cont = true
             end
          end
          out_text << line
       end
       out_text.join
    end

    class CExpr

       # Operator
       attr_reader :op

       # Arguments ( Sub expressions )
       attr_reader :args

       def initialize(*args)
          @args = args
       end

       def to_s
         "#{@op}(" + args.collect { |a| a.to_s }.join(', ') + ")"
       end

       ##### Operators and Calls ##########


       def assign(val)
          method_missing "=", val
       end

       def inc( val )
          method_missing "+=", val
       end

       def decr( val )
          method_missing "-=", val
       end

       def method_missing(meth_id, *args)
          case meth_id.to_s
          when '=', '+=', '-=', '>=', '<=', '<', '>', '+', '-', '*', '/'
             BinOp.new(meth_id, self, *args)
          when '[]'
             ArrayOp.new(self, *args)
          else
             BinOp.new(".", self, CExpr.new(meth_id, *args))
          end
       end

    end

    class BinOp < CExpr
       def initialize(op, *args)
          @op = op
          @args = args
          if args.length != 2
             # raise needs an exception class or object, not a Symbol
             raise ArgumentError, "BinOp takes exactly two arguments"
          end

       end

       def to_s
          case @op
          when '.'
            "(#{args[0]}.#{CTools.debracket(args[1].to_s)})"
          else
            "(#{args[0]} #{@op} #{args[1]})"
          end
       end
    end

    class ArrayOp < CExpr
       def initialize(op, *args)
          @op = op
          @args = args
       end

       def to_s
         "#{op}[" + args.join(', ') + "]"
       end
    end


    class CVar < CExpr
       attr :name
       attr :type

       def initialize(name, type)
          @name = name;
          @type = type;
       end

       def decl
          "#{type} #{name};"
       end

       def to_s
          name.to_s
       end

    end

    def self.debracket(str)
          (str.gsub(/^\(/,'')).gsub(/\)$/,'')
    end

    class BlankSlate
        instance_methods.each { |m|
           case m
           when /^__/
           when /instance_eval/
           else
              undef_method m
           end
        }
    end

    class CCode
       private

       def initialize
          @buffer = ""
       end

       def new
       end

       public

       def self.open &block
          code = CCode.new
          code.instance_eval &block
          code.to_s
       end

       def method_missing(meth, *args)
          @buffer << meth.to_s.gsub(/^_/,'') << "(" << args.collect{ |a|
             case a
             when String
                # Literal strings are output quoted
                '"' + a + '"'
             else
                # Other objects are considered names
                CTools.debracket(a.to_s)
             end
          }.join(', ') << " );\n"
       end


       def scope( lf=true)
          @buffer << "{\n"
          yield
          @buffer << "}"
          @buffer << "\n" if lf
       end

       def <<( e )

          s = CTools::debracket(e.to_s)

          @buffer << s << ";\n"
       end

       def _if(cond, &block)
          @buffer << "if (#{cond})"
          scope &block
       end

       def _else(&block)
          @buffer << "else"
          scope &block
       end

       def _let(a, b)
          # Terminate the assignment with ';' so the generated C is valid
          @buffer << "#{a} = #{CTools.debracket(b.to_s)};\n"
       end

       def _for(init, cond, inc, &block)
          init = CTools.debracket(init.to_s)
          cond = CTools.debracket(cond.to_s)
          inc  = CTools.debracket(inc.to_s)

          @buffer << "for (#{init}; #{cond}; #{inc})"
          scope &block
       end

       def _while(cond, &block)
          cond = CTools.debracket(cond.to_s)
          @buffer << "while (#{cond})"
          scope &block
       end

       def _typedef name, &block
          @buffer << "typedef "
          yield
          @buffer << name.to_s << ";\n"
       end

       def _struct (name="", &block)
          @buffer << "struct #{name}"
          # Evaluate the struct declarations in a new
          # scope
          @buffer << CTools::CCode.open do
             scope false do
                instance_eval &block
             end
          end
          @buffer << ";\n" if name != ""
       end

       # Declare a variable in scope and
       # add an instance method to retrieve
       # the symbol
       def _decl name, type="void *"
          var = CVar.new(name, type)
          @buffer << var.decl << "\n"
          self.class.send(:define_method, name){ var }
       end

       def to_s
          CTools.c_indent @buffer
       end
    end
end
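
The core trick in the library above is that Ruby operator calls on a
variable object return new expression objects instead of computing
anything. A stripped-down sketch of just that idea, assuming a single
hypothetical Expr class (not part of cexpression.rb) and explicit
operator definitions rather than method_missing:

```ruby
# Minimal sketch: capture Ruby operator calls and render them as C text.
# Expr is a hypothetical stand-in for CExpr/BinOp/CVar above.
class Expr
  def initialize(text)
    @text = text
  end

  def to_s
    @text
  end

  # Each binary operator returns a new Expr holding parenthesised C text.
  %w[+ - * / < > <= >=].each do |op|
    define_method(op) { |other| Expr.new("(#{self} #{op} #{other})") }
  end

  # C assignment cannot be spelled with Ruby's '=', so it gets a named method.
  def assign(value)
    Expr.new("#{self} = #{value}")
  end
end

a = Expr.new("a")
b = Expr.new("b")
cond = (a + 1) < (b * 2)
puts cond             # prints ((a + 1) < (b * 2))
puts a.assign(b + 1)  # prints a = (b + 1)
```

Defining the operators explicitly trades the flexibility of
method_missing (member access such as point.x) for easier debugging,
since a typo raises NoMethodError instead of silently becoming C text.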
unknown (Guest)
on 2007-05-02 19:00
(Received via mailing list)
On Thu, 3 May 2007, Brad Phelan wrote:

> Just curious,
>
> is anybody working on a C language DSL that could generate C code. I'm kinda
> interested in how it may be possible to use RSpec to unit test C
> code in a nice way.
>

check out RubyInline - it is exactly this

-a
Joel VanderWerf (Guest)
on 2007-05-02 20:26
(Received via mailing list)
Brad Phelan wrote:
> Just curious,
>
> is anybody working on a C language DSL that could generate C code. I'm
> kinda interested in how it may be possible to use RSpec to unit test C
> code in a nice way.

Here's one:

http://raa.ruby-lang.org/project/cgenerator/
http://rb-cgen.darwinports.com/

There's an example at:

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/...

IMO, cgen's shadow class mechanism is what differentiates cgen from
RubyInline. (Shadow classes give you an easy way of defining and
inheriting T_DATA structs as if they were just part of your ruby
classes, using the shadow_attr_accessor methods.) RubyInline is probably
more sophisticated in many ways (compiler options, storing intermediate
code securely, availability as a gem).

I've been using cgen since 2001 for boosting the performance of
numerical integration and simulations. I've also used it to wrap some
libraries that I didn't want to bother with swig for.
SonOfLilit (Guest)
on 2007-05-03 09:20
(Received via mailing list)
Am I the only one that thinks OP is looking for a library to assist in
generating ascii C code, like Markaby does for HTML, and not for
executing C code that you wrote as a string?

Brad, I don't think you'll find one, and in fact, I don't think you'll
need one. Why? C has so much syntax that you're better off generating
it with string manipulation than with a DSL.

HTML is so simple that there was more gain (succinctness,
automatability) than loss (new language to learn) in Markaby and its
neighbours. With C, I don't see such gains overcoming the loss.
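
For comparison, here is roughly what the string-manipulation approach
looks like for the Point struct from the first post. This is only a
sketch; c_struct is a hypothetical helper name, and the ubyte types are
taken from Brad's example:

```ruby
# Hypothetical helper: build a C struct typedef by plain string interpolation.
def c_struct(name, fields)
  members = fields.map { |field, type| "    #{type} #{field};" }.join("\n")
  # The squiggly heredoc strips the common leading indentation of the template.
  <<~C
    typedef struct {
    #{members}
    } #{name};
  C
end

puts c_struct("Point", x: "ubyte4", y: "ubyte2", z: "ubyte1")
```

The template stays readable as C, but nothing checks that the pieces
you interpolate form valid C.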

Aur
Harold Hausman (Guest)
on 2007-05-03 10:07
(Received via mailing list)
On 5/3/07, Brad Phelan <phelan@tttech.ttt> wrote:
> Just curious,
>
> is anybody working on a C language DSL that could generate C code. I'm
> kinda interested in how it may be possible to use RSpec to unit test C
> code in a nice way.
>

I just read an interview that contained some of your keywords there:
"... I'm actually working on a tool right now I call cuby. It's a
dialect of ruby that generates C code directly."

Here's the link:
http://on-ruby.blogspot.com/2006/12/rubinius-interview.html

It appears to be all jumbled up in a re-implementation of Ruby, but
might be interesting for you to check out.

hth,
-Harold
John Joyce (Guest)
on 2007-05-03 10:18
(Received via mailing list)
On May 3, 2007, at 4:18 PM, SonOfLilit wrote:

> neighbours. With C, I don't see such gains overcoming the loss.
>
> Aur
>
Indeed, the OP is trying to generate C rather than write C. Very
understandable. Certainly doable. It just means a lot less of the
flexibility that C usually has (for good and bad).
But actually it is very possible to generate corresponding looping
structures, structs, functions and more.
Need to generate declarations of many variables.
To get started, find a C file that follows the conventions you want
to have generated. Rome was not built in a day and no generator will
be either.
In theory all languages end up the same in ASM anyway.
The main problem is that it must be a DSL because some Ruby
mechanisms wouldn't translate so smoothly to C.
They'd be more like Objective-C.
That said, there's a lot of typing that could be saved.
Perhaps the most painful things would be dealing with pointers and
memory management.
Paul Brannan (cout)
on 2007-05-03 16:23
(Received via mailing list)
I wrote a tool a while back I called rubypp (pp as in preprocessor),
which lets you do something like this:

#include <iostream>

#ruby <<END
  puts '#define FOO BAR'
  '#define BAR BAZ'
END

#ruby def foo(x)              ; \
        x.to_s.chop           ; \
      end

extern "C" {
int foo(int a) {
  std::cout << "1" << std::endl;
}

int foo(double a) {
  std::cout << "2" << std::endl;
}

main() {
  foo(1);
  foo(1.0);
  std::cout << "#{foo(1.0)}" << std::endl;
}

which produces this output:

#include <iostream>

#define FOO BAR
#define BAR BAZ


extern "C" {
int foo(int a) {
  std::cout << "1" << std::endl;
}

int foo(double a) {
  std::cout << "2" << std::endl;
}

main() {
  foo(1);
  foo(1.0);
  std::cout << "1." << std::endl;
}

The syntax is a little odd, but it's surprisingly powerful.  I use it to
generate code for nodewrap.  You can find it at:

http://rubystuff.org/rubypp/rubypp.rb

Paul
Joel VanderWerf (Guest)
on 2007-05-03 18:03
(Received via mailing list)
Paul Brannan wrote:
> I wrote a tool a while back I called rubypp (pp as in preprocessor),
> which lets you do something like this:
...
> #ruby def foo(x)              ; \
>         x.to_s.chop           ; \
>       end

How is this def being used in the output?
Paul Brannan (cout)
on 2007-05-03 18:37
(Received via mailing list)
On Fri, May 04, 2007 at 01:03:20AM +0900, Joel VanderWerf wrote:
> Paul Brannan wrote:
> >I wrote a tool a while back I called rubypp (pp as in preprocessor),
> >which lets you do something like this:
> ...
> >#ruby def foo(x)              ; \
> >        x.to_s.chop           ; \
> >      end
>
> How is this def being used in the output?

I wish I had a more realistic example; I think then it would be clearer.

  std::cout << "#{foo(1.0)}" << std::endl;

  std::cout << "1." << std::endl;

Paul
Joel VanderWerf (Guest)
on 2007-05-03 21:20
(Received via mailing list)
Paul Brannan wrote:
> I wish I had a more realistic example; I think then it would be clearer.
>
>   std::cout << "#{foo(1.0)}" << std::endl;
>
>   std::cout << "1." << std::endl;

My bad. I just missed that. So the #ruby stuff is for defining utility
functions that can be used inside the C code templates?
Francis Cianfrocca (blackhedd)
on 2007-05-04 00:13
(Received via mailing list)
On 5/3/07, SonOfLilit <sonoflilit@gmail.com> wrote:
> automatability) than loss (new language to learn) in Markaby and its
> neighbours. With C, I don't see such gains overcoming the loss.



I've written compilers for high-level languages before that targeted C
rather than ASM. It's not a bad approach, if you understand how C is
optimized. It's possible to generate C that will compile to something
pretty fast. Although these days, memory-bus bandwidth is a much more
constrained resource than it ever was in the past (mostly because
everything else has gotten so much faster), and that adds a level of
complexity.

Is the point of this to get better performance? Not if you keep the
essential Ruby features (open classes and all the rest). Is the point
to save typing? Maybe, but I've always found that the vast majority of
the time spent in writing C goes not into typing but into either
planning or debugging. (The more planning you do, the less debugging,
and vice versa.)
Paul Brannan (cout)
on 2007-05-04 04:16
(Received via mailing list)
On Fri, May 04, 2007 at 04:18:48AM +0900, Joel VanderWerf wrote:
> My bad. I just missed that. So the #ruby stuff is for defining utility
> functions that can be used inside the C code templates?

You can use them that way, or if the code returns non-nil, the result
will be converted to a string and inserted in the output stream, or
anything sent to stdout will also be included in the preprocessed
output.
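
To make those three rules concrete, here is a toy preprocessor that
follows them. It is a sketch written for this explanation, not rubypp's
actual code: preprocess and run_ruby are hypothetical names, only the
"#ruby <<END" heredoc form is handled, and nested braces inside #{...}
are not supported.

```ruby
require "stringio"

# Evaluate a chunk of Ruby, returning whatever it printed to stdout,
# followed by its result when that result is non-nil.
def run_ruby(code, env)
  saved, $stdout = $stdout, StringIO.new
  result = env.eval(code)
  text = $stdout.string
  text << result.to_s << "\n" unless result.nil?
  text
ensure
  $stdout = saved
end

# Hypothetical mini-preprocessor (assumption: not how rubypp is implemented).
def preprocess(source)
  env = binding   # one shared binding, so definitions persist between blocks
  out = []
  lines = source.lines
  until lines.empty?
    line = lines.shift
    if line =~ /\A#ruby\s+<<(\w+)\s*\z/
      terminator = $1
      code = []
      code << lines.shift until lines.empty? || lines.first.strip == terminator
      lines.shift   # drop the terminator line itself
      out << run_ruby(code.join, env)
    else
      # Ordinary lines pass through with #{...} expressions expanded.
      out << line.gsub(/#\{(.*?)\}/) { env.eval($1).to_s }
    end
  end
  out.join
end

source = <<~'SRC'
  #ruby <<END
    puts '#define FOO BAR'
    '#define BAR BAZ'
  END
  int answer = #{6 * 7};
SRC

puts preprocess(source)
# prints:
#   #define FOO BAR
#   #define BAR BAZ
#   int answer = 42;
```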

Paul
Lauri Pesonen (Guest)
on 2007-05-05 17:25
(Received via mailing list)
Related to what the OP was asking for, I've been thinking of
implementing a DSL in Ruby for defining message types for a C-based
distributed framework. We're building a distributed system that
consists of tasks sending each other messages. The messages have types
and payloads.

As things are now, we have to define message types by hand. This
includes defining a payload type, which might be as simple as a single
int, a struct consisting of primitive types, or a hierarchical struct
containing dynamic sized variables. In addition to the payload type we
have to implement functions for marshalling/unmarshalling the payload
and to print the contents out as text.

Writing all this for simple payload types is relatively painless, but
boring. Writing all this for complex hierarchical payloads with
dynamic sized variables is painful, boring, and extremely error prone,
because it usually involves a lot of copy-pasting.

What I have at the moment is something like this (this is from memory,
because I don't have the code at hand):

messages "c-file-basename" do |msgs|
  msgs.define_message "message_type_name" do |m|
    m.add_member :uint32_t, "uint32_t_variable_name"
    m.add_pointer :uint8_t, "uint8_t_pointer_name"
  end

# define other message types that will be included in the same C-file.
# ...
end

This would result in something like the C-code at the end of the email.
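
A minimal sketch of how such a DSL could be wired up, covering only the
struct typedef (no marshalling or to_text functions). The class names
MessageDef and MessageFile and their internals are assumptions made for
illustration; only messages, define_message, add_member and add_pointer
come from the snippet above.

```ruby
# Hypothetical sketch of the DSL; class names and internals are assumptions.
class MessageDef
  def initialize(name)
    @name = name
    @fields = []   # [c_type, field_name] pairs in declaration order
  end

  def add_member(type, name)
    @fields << [type.to_s, name]
  end

  # A pointer member also gets a companion length field.
  def add_pointer(type, name)
    @fields << ["uint32_t", "#{name}_len"]
    @fields << ["#{type}*", name]
  end

  def to_c
    tag = @name.upcase
    body = @fields.map { |type, name| "  #{type} #{name};\n" }.join
    "typedef struct #{tag}_ {\n#{body}} #{tag};\n"
  end
end

class MessageFile
  def initialize
    @messages = []
  end

  def define_message(name)
    msg = MessageDef.new(name)
    yield msg
    @messages << msg
  end

  def to_c
    @messages.map(&:to_c).join("\n")
  end
end

# Returns the generated C text; a real tool would write it to "#{basename}.c".
def messages(basename)
  file = MessageFile.new
  yield file
  file.to_c
end

c_code = messages "c-file-basename" do |msgs|
  msgs.define_message "message_type_name" do |m|
    m.add_member :uint32_t, "uint32_t_variable_name"
    m.add_pointer :uint8_t, "uint8_t_pointer_name"
  end
end
puts c_code
```

Because each MessageDef keeps its fields as plain data, the marshall,
unmarshall and to_text generators could later walk the same list.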

The difficulties come with more complicated messages. What if I have a
struct used elsewhere in the system that should be a part of many
different message types? For example, we describe task ids with
structs and these are passed from task to task in message payloads
quite often. One solution is to add it as a basic type to the DSL.
That is, in the same way that the DSL understands what an uint32_t is
and how to marshall and print an uint32_t, I enhance it to understand
what the struct is. The definition of the struct would be in the
common header files of the system. The down side is that whenever
someone defines a new complex type, we must implement that type in the
DSL. Me being the only Rubyist in the project would mean that I'll end
up doing the enhancing ;-)
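
One way to keep that enhancing cheap is a small type registry, so that
teaching the DSL a new struct is a single call rather than a code
change. A sketch under assumptions: CType, TYPES, register_type,
member_decl and the TASK_ID struct name are all made up for
illustration.

```ruby
# Hypothetical registry: maps a DSL type name to what the generator needs.
CType = Struct.new(:c_name, :size_expr)

TYPES = {
  uint32_t: CType.new("uint32_t", "sizeof(uint32_t)"),
  uint8_t:  CType.new("uint8_t",  "sizeof(uint8_t)"),
}

# Registering a project-specific struct; its C definition is assumed to
# live in the system's common header files.
def register_type(name, c_name:, size_expr: "sizeof(#{c_name})")
  TYPES[name] = CType.new(c_name, size_expr)
end

register_type :task_id, c_name: "TASK_ID"

# Emit a struct member declaration, failing loudly on unknown types.
def member_decl(type, field)
  info = TYPES.fetch(type) { raise ArgumentError, "unknown DSL type: #{type}" }
  "#{info.c_name} #{field};"
end

puts member_decl(:task_id, "sender")   # prints TASK_ID sender;
```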

Another approach would be to enhance the DSL so that you can import
other files. I.e. we could define common structs in files that are
then imported to the message definitions. This to me seems overly
complicated in our case. I think it would make the DSL a lot more
complicated to implement and I'm not sure our use cases require the
functionality. We can always implement the more complex messages by
hand if the DSL is not able to handle them.

I'd appreciate any kind of input on this that the list members might
have. I've read as much about DSLs in Ruby as I could find on the web,
but none of them really covered what I'm trying to do.

-- Lauri

-- generated C-code --

/* I wrote this on the fly to this email, so it will most likely
contain errors */

typedef struct MESSAGE_TYPE_NAME_ {
  uint32_t uint32_t_variable_name;
  uint32_t uint8_t_pointer_name_len;
  uint8_t* uint8_t_pointer_name;
} MESSAGE_TYPE_NAME;

uint32_t message_type_name_marshall(void* msg, void* buf, uint32_t buf_len) {
  MESSAGE_TYPE_NAME* my_msg = (MESSAGE_TYPE_NAME*) msg;
  uint8_t* ptr = buf;

  if (buf_len < (sizeof(uint32_t) + sizeof(uint8_t) *
      my_msg->uint8_t_pointer_name_len)) {
    return 0;
  }

  memcpy(ptr, &my_msg->uint32_t_variable_name, sizeof(uint32_t));
  ptr += sizeof(uint32_t);

  /* Copy to ptr, not buf, so the payload follows the first field */
  memcpy(ptr, my_msg->uint8_t_pointer_name,
         my_msg->uint8_t_pointer_name_len);
  ptr += my_msg->uint8_t_pointer_name_len;

  return (uint32_t)(ptr - (uint8_t*) buf);
}

/* And similar unmarshall and to_text functions */

MSG_TYPE_DEF message_type_name_type = {
  message_type_name_marshall,
  message_type_name_unmarshall,
  message_type_name_to_text
};
Brian Candler (Guest)
on 2007-05-06 16:18
(Received via mailing list)
On Sun, May 06, 2007 at 12:23:38AM +0900, Lauri Pesonen wrote:
> have to implement functions for marshalling/unmarshalling the payload
> messages "c-file-basename" do |msgs|
>
> The difficulties come with more complicated messages. What if I have a
> struct used elsewhere in the system that should be a part of many
> different message types?

The thought which struck me when reading this was: "ASN.1"

OK, it's horrible, but it does pretty much exactly what you ask, and has
its own (standard) DSL for describing the message formats. So if you
could find a good C library which reads ASN.1 and outputs code to parse
messages in BER/DER format, maybe that would be an alternative solution.

The standards documentation is comprehensive, if not easy to read:
http://www.itu.int/ITU-T/studygroups/com17/languag...
http://www.itu.int/ITU-T/studygroups/com10/languag...

And of course there are probably books and other resources.

There might be Ruby libraries for handling ASN.1 directly. The only one
I know of is the one built into openssl, which is low-level but
functional. I used it in ruby-ldapserver, which you can find on
rubyforge.org.
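
To give a feel for that low-level interface, here is a small round trip
through OpenSSL::ASN1 from Ruby's openssl library. The message layout
(one integer followed by one byte string) is only an assumed example,
not anything from the system discussed above.

```ruby
require "openssl"

# Encode a hypothetical message (an integer plus a byte string) as DER.
message = OpenSSL::ASN1::Sequence.new([
  OpenSSL::ASN1::Integer.new(42),
  OpenSSL::ASN1::OctetString.new("payload"),
])
der = message.to_der

# Decode it again; ASN.1 Integer values come back as OpenSSL::BN.
decoded = OpenSSL::ASN1.decode(der)
number, bytes = decoded.value
puts number.value.to_i   # prints 42
puts bytes.value         # prints payload
```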

Regards,

Brian.
Lauri Pesonen (Guest)
on 2007-05-07 14:24
(Received via mailing list)
On 06/05/07, Brian Candler <B.Candler@pobox.com> wrote:

> > The difficulties come with more complicated messages. What if I have a
> > struct used elsewhere in the system that should be a part of many
> > different message types?
>
> The thought which struck me when reading this was: "ASN.1"
>
> OK, it's horrible, but it does pretty much exactly what you ask, and has its
> own (standard) DSL for describing the message formats. So if you could find
> a good C library which reads ASN.1 and outputs code to parse messages in
> BER/DER format, maybe that would be an alternative solution.

You do have a point. I agree that ASN.1 would be able to handle all
possible message payloads.

I've never used ASN.1 personally and my impression of it is that it
is, as you say, horrible. I also feel that it is overkill in this
case. I mean that I don't need a general solution to my problem that
is capable of describing all possible message types. I'm trying to
make it less painful for the other developers in the team to create
new message types and as far as I can tell most of the message types
we'll have will be very simple, flat structures. We'll occasionally
come across more complicated messages, but those can be implemented
by hand if need be.

In many cases we do not need the marshalling and to_text functions:
to_text is used only for logging, and marshalling is only necessary if
the message crosses process boundaries. Nevertheless, it would be
nice to have these functions for all message types, because i) it is
nice to get human-readable log messages, and ii) having marshalling
functions available allows us to move components from one process to
another almost transparently without having to worry about breaking
the messaging.

> The standards documentation is comprehensive, if not easy to read:
> http://www.itu.int/ITU-T/studygroups/com17/languag...
> http://www.itu.int/ITU-T/studygroups/com10/languag...

Thanks for all the links. I'm hoping that I can avoid using ASN.1. On
the other hand we're doing SNMP as well, so I'll probably have to get
my hands dirty at some point.

I'll chug along with my approach and I'll report back if I can make it
work.