Rocaml: Ruby extensions in Objective Caml

tulkas · June 22, 2007, 1:37pm

rocaml allows you to write Ruby extensions in Objective Caml.

I never seem to manage to release things when I should, so here’s a
pre-release announcement to let you know about this so you can play with
it before the actual release, which could take longer than necessary.

http://eigenclass.org/repos/rocaml/head/

Young as it is, rocaml is very usable and the generated extensions are
reliable, since they enforce type safety and handle exceptions both
in Ruby and OCaml (OCaml exceptions are passed to Ruby).

Developing Ruby extensions with rocaml is easier and more convenient than
writing a plain old C extension since rocaml performs Ruby<->OCaml
conversions for a wide range of types, including abstract types and arrays,
tuples, variants and records of values of any supported type (e.g. arrays
of arrays of variants of tuples of …).

Making an extension with rocaml involves two steps:

* implementing the desired functionality in Objective Caml, and registering
  the functions to be exported
  (using Callback.register : string -> 'a -> unit)
* creating the extconf.rb file (just modify the sample extconf.rb
  distributed with rocaml) defining the interface of your Objective Caml
  code.

** At no point is there any need to write a single line of C code when using rocaml. **

The mandatory trivial example

Let’s create an extension with a ‘fib’ function.
Here’s the OCaml code:

    let rec fib n = if n < 2 then 1 else fib (n-1) + fib (n-2)

    let _ = Callback.register "Fib.fib" fib

Here’s the interface declaration in your extconf.rb:

    Interface.generate("fib") do
      def_module("Fib") do
        fun "fib", INT => INT
      end
    end

That’s it. Running extconf.rb will generate all the required wrappers and
make will link them against your ml code, creating a normal Ruby extension
that can be used simply with

    require 'fib'
    p Fib.fib 10
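
Since the registered function is ordinary OCaml, it can be checked on its own before building the extension. A minimal sketch, assuming nothing beyond the standard library; the `check` binding is only for illustration and is not part of rocaml:

```ocaml
(* Same definition as above: note that fib 0 = fib 1 = 1. *)
let rec fib n = if n < 2 then 1 else fib (n - 1) + fib (n - 2)

(* Registering also works in a plain OCaml program: the closure is only
   stored under the given name until C code asks for it. *)
let () = Callback.register "Fib.fib" fib

(* Hypothetical sanity check on a few inputs. *)
let check = List.map fib [0; 1; 2; 10]   (* [1; 1; 2; 89] *)
```

With this definition, `p Fib.fib 10` should print 89 from Ruby as well.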

Set of strings using a RB tree

Here’s a simple set based on an RB tree, specialized for strings (see
examples/tree for how to create several classes from a single polymorphic
structure). The (unoptimized) RB tree takes only ~30 lines of code, but
lookup is 3X faster than with RBTree, which takes >3000 lines overall and
over 250 lines for the equivalent functionality, without counting the
manually written wrappers for the underlying C data structure.

This shows how rocaml handles complex types, including variant and
recursive types.

Given this interface definition:

    Interface.generate("tree") do
      string_tree_t = sym_variant("string_tree_t") do |t|
        constant :Empty
        non_constant :Node, TUPLE(t, type, t)
      end

      def_class("StringRBSet") do |c|
        t = c.abstract_type
        fun "empty", UNIT => t, :aliased_as => "new"
        fun "make", string_tree_t => t

        method "add", [t, STRING] => t
        method "mem", [t, STRING] => BOOL, :aliased_as => "include?"
        method "dump", t => string_tree_t
        method "iter", t => t, :aliased_as => "each", :yield => [STRING, UNIT]
      end
    end

You can use the generated extension as follows (you can find the OCaml
code below):

    require 'tree'
    s = StringRBSet.new
    s2 = s.add "foo"   # the RB set is a functional, i.e. persistent
                       # data structure

    # see how rocaml handles conversions for recursive variant types
    p s.add("foo").dump
    p s.add("foo").add("bar").dump

The above will print
    [:Node, [:B, :Empty, "foo", :Empty]]
    [:Node, [:B, [:Node, [:R, :Empty, "bar", :Empty]], "foo", :Empty]]

showing you the structure of the RB tree.
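
To relate that output to the OCaml side, here is a rough sketch that renders the same variant type as a string in the shape Ruby printed above. The `show` and `show_color` functions are my own illustration, not rocaml's conversion code, and the two trees are built by hand:

```ocaml
type color = R | B
type 'a t = Empty | Node of color * 'a t * 'a * 'a t

let show_color = function R -> ":R" | B -> ":B"

(* Constant constructors become symbols; Node becomes a two-element
   array holding the symbol and the tuple of its arguments. *)
let rec show = function
  | Empty -> ":Empty"
  | Node (c, l, x, r) ->
      Printf.sprintf "[:Node, [%s, %s, %S, %s]]"
        (show_color c) (show l) x (show r)

(* The trees obtained after adding "foo", then "bar". *)
let one = Node (B, Empty, "foo", Empty)
let two = Node (B, Node (R, Empty, "bar", Empty), "foo", Empty)

let () =
  print_endline (show one);
  print_endline (show two)
```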

That’s it for now, enjoy.
Further updates on eigenclass.org.

PS:
For the sake of completeness, here’s the OCaml code. You can find the
full example in examples/tree.

    exception Found

    module RBSet =
    struct
      type color = R | B
      type 'a t = Empty | Node of color * 'a t * 'a * 'a t

      let empty = Empty

      (* smaller elements live in the left subtree, matching [add] below *)
      let rec mem x = function
          Empty -> false
        | Node(_, l, y, r) ->
            if x < y then mem x l else if x > y then mem x r else true

      let balance = function
          B, Node(R, Node(R, a, x, b), y, c), z, d
        | B, Node(R, a, x, Node(R, b, y, c)), z, d
        | B, a, x, Node(R, Node(R, b, y, c), z, d)
        | B, a, x, Node(R, b, y, Node(R, c, z, d)) ->
            Node(R, Node(B, a, x, b), y, Node(B, c, z, d))
        | (c, a, x, b) -> Node (c, a, x, b)

      let add x t =
        let rec ins = function
            Empty -> Node(R, Empty, x, Empty)
          | Node(color, a, y, b) ->
              if x < y then balance (color, ins a, y, b)
              else if x > y then balance (color, a, y, ins b)
              else raise Found
        in try match ins t with
            Node (_, a, y, b) -> Node(B, a, y, b)
          | Empty -> assert false (* ins always returns Node _ *)
        with Found -> t

      let rec iter f = function
          Empty -> ()
        | Node(_, l, x, r) -> iter f l; f x; iter f r
    end

    external intset_yield : int -> unit = "IntRBSet_iter_yield"
    external stringset_yield : string -> unit = "StringRBSet_iter_yield"

    let identity x = x

    open Callback
    let _ =
      let def_set t =
        let r name f = register (t ^ "RBSet" ^ "." ^ name) f in
          r "empty" (fun () -> RBSet.empty);
          r "add" (fun t x -> RBSet.add x t);
          r "mem" (fun t x -> RBSet.mem x t);
          r "dump" identity;
          r "make" identity;
      in
        List.iter def_set ["Int"; "String"];
        register "IntRBSet.iter" (RBSet.iter intset_yield);
        register "StringRBSet.iter" (RBSet.iter stringset_yield);

tulkas · June 22, 2007, 2:32pm

Hi Mauricio,

a quick thought that I’m pretty sure you’ve already had: I notice that
you define the interface in Ruby, and also define what should be
exported in OCaml. Couldn’t you, say, just define it in OCaml (where the
type signatures will be known completely, I guess?), and have the Ruby
interface generated from this?

I’ve been meaning to learn OCaml; if I do, I’ll definitely give this a
look.

Thanks,
Benjohn

tulkas · June 26, 2007, 10:12am

On 6/22/07, Mauricio F. [email protected] wrote:

rocaml allows you to write Ruby extensions in Objective Caml.

exciting stuff! looking forward to playing with this.

martin

tulkas · June 23, 2007, 6:05pm

On Fri, Jun 22, 2007 at 09:31:12PM +0900, [email protected] wrote:

Hi Mauricio,

a quick thought that I’m pretty sure you’ve already had: I notice that
you define the interface in Ruby, and also define what should be
exported in OCaml. Couldn’t you, say, just define it in OCaml (where the
type signatures will be known completely, I guess?), and have the Ruby
interface generated from this?

Even though the types are known by the compiler, human intervention is
needed at some point because:

* we have to define what needs to be exported
* the naming and parameter passing conventions might differ (e.g. the
  data structure often being given after the element to operate with in
  functional data structures)
* polymorphic functions have too broad a type, and the concrete
  type(s) you want must be specified. For instance, in the RB tree
  example, two classes are generated for sets of strings and ints. Their
  instance methods correspond to the same polymorphic OCaml functions, but
  providing the desired concrete types allows the wrapper generated by
  rocaml to perform the necessary type checking and Ruby->OCaml
  conversions. The alternative would be wrapping Ruby values in OCaml with
  an abstract universal type, but introducing dynamic typing defeats the
  purpose of writing an extension in OCaml to some extent (it’ll be slower
  and the type system will not help you that much).
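
The last point can be sketched in plain OCaml. This is only an illustration: the registered names are hypothetical, and the type annotations serve as documentation here, since Callback.register accepts any value and rocaml takes the concrete types from extconf.rb.

```ocaml
(* One polymorphic function: 'a -> 'a list -> bool *)
let mem x l = List.mem x l

(* Two monomorphic views of it, one per exported class. *)
let string_mem : string -> string list -> bool = mem
let int_mem : int -> int list -> bool = mem

let () =
  (* "StringList.mem" and "IntList.mem" are made-up names. *)
  Callback.register "StringList.mem" string_mem;
  Callback.register "IntList.mem" int_mem
```

Both registrations share the same code; only the types promised to the wrapper differ.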

That said, it would be possible to encode all that information in a .ml
file, instead of splitting it into an OCaml part (which functions are to be
exported) and another in Ruby, in extconf.rb (how that functionality is
accessible from Ruby, and the method signatures). It’s a bit harder to
implement though, as building the extension would involve an extra stage
to compile the file holding that information and extract it in order to
generate the wrapper. Going the other way around, specifying it all in
extconf.rb and generating the .ml code that registers the functions from
it would be very easy to implement, but would force one to use named
functions(1).

In the meantime, I don’t find the need to register the functions in OCaml
too onerous, as it’s at most one line per method, and the extra degree of
freedom in the OCaml -> Ruby mapping is quite convenient (I can do e.g.
parameter reordering with an anonymous function).
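
As a small sketch of that reordering, using the standard library's Set instead of the RB tree (the registered names and the `*_receiver_first` bindings are hypothetical):

```ocaml
module S = Set.Make (String)

(* S.add and S.mem take the element first and the set last; a Ruby
   method call passes the receiver (the set) first, so swap them. *)
let add_receiver_first = fun set x -> S.add x set
let mem_receiver_first = fun set x -> S.mem x set

let () =
  Callback.register "StringSet.add" add_receiver_first;
  Callback.register "StringSet.mem" mem_receiver_first
```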

(1) another benefit from that would be the possibility to check that the
specified types are included in those from the .cmi (a file generated by
OCaml with interface information)

tulkas · June 26, 2007, 11:59am

Mauricio:
snip

In the meantime, I don’t find the need to register the functions in OCaml
too onerous, as it’s at most one line per method, and the extra degree of
freedom in the OCaml -> Ruby mapping is quite convenient (I can do e.g.
parameter reordering with an anonymous function).

snip

Thanks for the reasoned reply. I’ve got OCaml installed, and I’ve
started reading the manual. It feels somewhat like it is to functional
programming as C is to imperative programming at the moment, which is
interesting - but could be a useless thought. I don’t like the syntax
a lot at the moment, but that’s just lack of experience. I’m very
intrigued by the OpenGL bindings.

tulkas · June 26, 2007, 6:32pm

On 6/22/07, Mauricio F. [email protected] wrote:

in Ruby and OCaml (OCaml exceptions are passed to Ruby).

Nice. I’ve been learning OCaml for the last few months. I’ll
definitely give this a try.

Phil

tulkas · June 29, 2007, 8:03pm

On Jun 22, 5:35 am, Mauricio F. [email protected] wrote:

rocaml allows you to write Ruby extensions in Objective Caml

Why go half way? How about a Ruby interpreter implemented in OCaml?

ORuby now!

Dan

tulkas · June 27, 2007, 12:01pm

On Tue, Jun 26, 2007 at 06:58:03PM +0900, [email protected] wrote:

started reading the manual. It feels somewhat like it is to functional
programming as C is to imperative programming at the moment, which is
interesting - but could be a useless thought

If you mean by this that performance is easy to predict, you’re very right.

Allow me to expand a bit on this. One of the distinct advantages of OCaml
is that the compiler doesn’t perform any deep magic (for instance, it
doesn’t do loop-invariant code motion, IIRC). How can this be a good
thing? It means that it’s very easy to get from e.g. guessing why so much
time is spent in the GC to giving the finishing touches to your code,
because you can predict the effect of your modifications quite easily (and
OCaml also has nice profiling tools for both native code and bytecode).
OCaml’s excellent performance is a testament to the effectiveness of a
good basic compilation strategy. So you can reword the initial statement
in a less enigmatic way: Objective Caml doesn’t need deep magic in the
compiler to yield good performance.

This is especially true when you compare OCaml to languages with a lazy
evaluation discipline like Haskell. I don’t have any sizeable experience
with the latter, but I’ve often heard about the difficulty of predicting
the performance of code involving lazy evaluation, even for seasoned
programmers, like a single line change mysteriously turning some O(n) code
into O(exp(n)) or the other way around. This caught my attention when I
read it in Okasaki’s book[1], which I can’t recommend enough:

“Historically, the most common technique for analyzing lazy programs has
been pretending that they are actually strict”.

Okasaki introduces a basic framework to perform such analyses, but the
fact remains that eager evaluation is easier to understand in that regard.
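
The difference can be seen in miniature with OCaml's own Lazy module. A small sketch (the counter and all names are purely illustrative): with eager evaluation the cost is paid where the expression is written, whereas a suspension pays it at the first Lazy.force, wherever that happens to be.

```ocaml
(* Count how many times the "expensive" computation really runs. *)
let calls = ref 0

let expensive () =
  incr calls;
  42

let eager = expensive ()            (* evaluated right here *)
let delayed = lazy (expensive ())   (* nothing is evaluated yet *)
let after_definition = !calls       (* 1: only the eager call ran *)

let first = Lazy.force delayed      (* the suspended work happens now *)
let second = Lazy.force delayed     (* memoized: no second evaluation *)
let after_forcing = !calls          (* 2 *)
```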

I don’t like the syntax a lot at the moment, but that’s just lack of
experience. I’m very intrigued by the OpenGL bindings.

There are a few problems with the original syntax, but nothing that cannot
be solved with a couple parentheses here and there (for instance, with
nested pattern matching expressions). There is an alternative syntax
(“revised syntax”) which addresses these “ambiguities” by adding some
punctuation and differentiating constructions that are arguably too hard
to tell apart in the original syntax, but it doesn’t seem to be widely
used (I prefer the original one myself, but it’s probably because it’s the
first one I was exposed to). camlp4 can take code in one syntax and
rewrite it using the other, so you don’t have to decide upfront what you
will be using in the end.
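
To make the nested pattern matching case concrete, here is a small sketch in the original syntax (the `classify` function is just an example of mine). Without the parentheses, the final `| None -> 0` case would be parsed as part of the inner match, leaving the outer one incomplete.

```ocaml
let classify a b =
  match a with
  | Some x ->
      (* The parentheses end the inner match here. *)
      (match b with
       | Some y -> x + y
       | None -> x)
  | None -> 0
```

Writing `begin match ... end` instead of the parentheses works the same way.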

[1]
Purely Functional Data Structures by Chris Okasaki, Cambridge University
Press, 1998.

You can find the PhD thesis on which that book was based at
http://www.cs.cmu.edu/~rwh/theses/okasaki.pdf

tulkas · June 30, 2007, 8:52am

On 6/29/07, Daniel B. [email protected] wrote:

On Jun 22, 5:35 am, Mauricio F. [email protected] wrote:

rocamlallows you to write Ruby extensions in Objective Caml

Why go half way? How about a Ruby interpreter implemented in OCaml?

ORuby now!

I’ve been thinking along similar lines.
Here’s a Google tech talk about compiling Python to OCaml:
http://video.google.com/videoplay?docid=-2077755378178864152&q=python+ocaml

It should be possible to do something similar with Ruby.
…yet another Ruby implementation.

Phil