Brite - A Ruby compiler for the .NET platform


#1

Hi folks!

I discovered Ruby a few months ago and liked it within minutes.
Then I got involved in a project using ASP.NET. At that point I had a
dream :wink: Ruby that works on .NET (once you know Ruby, neither VB nor C#
fits!).
I found some material, but nothing usable or 100% .NET, so I took
some time and decided to create a Ruby compiler.

Here it is… well, at least the first version which is early alpha.

Get it, try it, read it, and tell me what you think. Is this project
worth continuing?

You can read some info about it and also download it here:
http://mortimer.devcave.net/projects/brite

Cheers,

Pascal H.


#2

Pascal H. wrote:

I discovered Ruby a few months ago and liked it within minutes.
Then I got involved in a project using ASP.NET. At that point I had a
dream :wink: Ruby that works on .NET (once you know Ruby, neither VB
nor C# fits!). I found some material, but nothing usable or 100% .NET,
so I took some time and decided to create a Ruby compiler. […] Get it,
try it, read it, and tell me what you think. Is this project worth
continuing?

This looks very promising and I hope it will continue to be developed.
I’d really like to see something like this working very well. Perhaps
even in a boot-strapped, self-hosting way which would be possible with
.NET. And I think your approach is going to be self-hosted. This is the
way I have dreamed of doing it in the past.

Regarding the implementation of the standard library: I’ve done part of
it in JavaScript [1] already and another very small part in Ruby [2].

I’ve tried to make the implementation as self-based and simple as
possible.

I hope it is of some use to you.

I’m not sure what state your parser is currently in, but you might be
able to cheat heavily here by either accepting only YARV byte code
as input or by letting Ripper or ruby -y do the hard work for you. This
would mean calling an external binary for things like eval() at
first, but for an initial version that would be good enough, I think.
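
To make the Ripper suggestion concrete, here is a minimal sketch of what it hands back for a one-line program. The sample source string is made up for illustration; how a compiler would consume this output is left open.

```ruby
require 'ripper'

# Hypothetical one-line program to parse.
source = 'a = 1 + 2'

# Ripper.sexp returns the parse tree as nested arrays (an S-expression),
# or nil when the source does not parse.
tree = Ripper.sexp(source)

# Ripper.lex returns the token stream; each entry starts with
# [position, token type, token text].
tokens = Ripper.lex(source).map { |_pos, type, text| [type, text] }

p tree.first     # the root node type
p tokens.first   # the first token as [type, text]
```

A compiler front end could walk `tree` instead of implementing its own Ruby grammar, at the cost of depending on an existing Ruby installation.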

Continuations are very hard to do, but it might be possible to do them
for the Ruby side with a heavy transformation of method calling code.
I’ve thought about this in the past [3], but I think
continuations could still be added very, very late when other things
already work.
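
One well-known form of such a transformation is continuation-passing style (CPS). The sketch below is only an illustration of that general idea in plain Ruby, with made-up method names; it is not necessarily the approach taken in [3].

```ruby
# In CPS every method takes an extra argument k (the continuation):
# a callable that receives the result instead of a normal return.
def square_cps(x, k)
  k.call(x * x)
end

def add_cps(a, b, k)
  k.call(a + b)
end

# Direct style would be: square(a) + square(b)
def sum_of_squares_cps(a, b, k)
  square_cps(a, lambda { |a2|
    square_cps(b, lambda { |b2|
      add_cps(a2, b2, k)
    })
  })
end

results = []
final = lambda { |value| results << value }

sum_of_squares_cps(3, 4, final)

# "The rest of the computation" is now an ordinary object, so it can be
# stored and invoked again later. That is the capability callcc exposes.
final.call(100)

p results
```

The price is that every call site has to be rewritten this way, which is why it counts as a heavy transformation.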

Regarding .NET interoperability: Using Ruby objects from .NET is the
hardest to do. Think about things like method_missing() and you will
realize that rubyObj.method() will not be good enough. You probably
would need to do rubyObj.send(“method”) in all .NET code.
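
A small sketch of why static binding is not enough. The class and method names here are invented for the example.

```ruby
class Ghost
  # No method named "greet" is ever defined; the call is intercepted
  # at run time instead.
  def method_missing(name, *args)
    if name == :greet
      "hello, #{args.first}"
    else
      super
    end
  end

  def respond_to_missing?(name, include_private = false)
    name == :greet || super
  end
end

ghost = Ghost.new

# Nothing for a statically bound caller to find...
p Ghost.instance_methods(false).include?(:greet)

# ...yet dynamic dispatch by name works.
p ghost.send(:greet, "dotnet")
```

A .NET caller looking for a `greet` member on the compiled class would find nothing, so some name-based entry point like `send` seems unavoidable.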

The other way around is more important at first because it would allow
you to implement the standard library in Ruby while using .NET
infrastructure. (Very handy for things like Regular Expressions, Date
stuff, IO and probably yet more.)

It would be very wonderful if the run time could automatically translate
Ruby’s blocks to .NET’s delegates and so on. But this is a convenience
feature as well. Not strictly necessary at first.

A method definition can also use “pass remaining param as an array”.
What should we do with this? Is there vararg in CLS?

From what I understand C# and so on implement varargs by accepting an
array as the last argument. You can forward arguments that way as well.
So it’s pretty much syntax. The same could be done for Ruby. You could
annotate the methods as having varargs to make the distinction clearer.
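
For reference, this is what the Ruby side looks like. The method names are made up; the point is that the rest parameter already arrives as an array and that the method's own metadata says it is variadic, which is the information a compiler would need for such an annotation.

```ruby
# *rest collects any remaining positional arguments into an Array.
def log_all(prefix, *rest)
  rest.map { |item| "#{prefix}: #{item}" }
end

# Forwarding: the splat at the call site expands the array back
# into individual arguments.
def log_warnings(*messages)
  log_all("warn", *messages)
end

p log_all("info", "a", "b")
p log_warnings("disk full")

# Reflection exposes which parameter is the rest parameter.
p method(:log_all).parameters
p method(:log_all).arity   # negative arity signals optional or rest arguments
```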

Regarding blocks: Mono’s JScript.NET solves that kind of thing (how to
pass information for closures and so on) by passing in a single engine
argument. It can be used to find out information for context and so on.
I imagine this would also work well for Ruby. A delegate would mostly
work, but you can’t handle this case:

a = "foo"
foo() do
  eval("a")
end

Because you don’t see the access of the variable a in the block at
compile time. So you need some kind of mechanism to access parent scopes
at run time. This is what the engine would do.
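
Here is a runnable version of that case, plus a look at the run-time object Ruby itself uses for this (a Binding). The `foo` helper is just a stand-in that yields to its block.

```ruby
# Stand-in for any method that takes a block.
def foo
  yield
end

a = "foo"

# The name "a" appears only inside a string, so a compiler cannot see
# that the block reads the outer variable.
result = foo do
  eval("a")
end
p result

# Every block carries a Binding: a run-time handle on the scope it was
# created in. This is roughly the role the engine argument would play.
blk = proc { }
p blk.binding.local_variables.include?(:a)
p blk.binding.local_variable_get(:a)
```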

I’d ignore the whole String thing for now. Ruby’s strings are going to
change quite a bit very soon. For now I think it would be enough to do
the simplest thing: Unicode strings.

Hope all of this is of some help to you. Good luck!

[1] http://flgr.dyndns.org/ruby.js/
[2] http://flgr.dyndns.org/code/method-dict.rb
[3] http://flgr.dyndns.org/callcc-cil.rb


#3

Would this allow Ruby to be compiled to CIL?


#4

Most people are already aware of this, but just for the newcomers,
this group is also working towards similar goals:

http://plas.fit.qut.edu.au/rubynet/

Although I, personally, am cheering for Florian, since the result of
his efforts will be truly open source, as opposed to the QUT effort
that will be (to quote their home page):

‘under a relatively liberal open source license that largely allows it
to be freely used by anyone in any way and for any purpose.’

I’m a little wary of the ‘relatively’, ‘largely’ and ‘used by’ phrases
(as opposed to ‘modified by’). We’ll have to see.

Dan


#5

Just to head off any confusion…

I don’t mean to imply that the QUT license shouldn’t have any
restrictions, since most OS licenses do (for good reason). My only
concern is that when the license is announced, it may turn out not to
be very open at all.

So I’m basically just hoping that they’re wise about which license
they use, since so far they haven’t committed to anything specific
yet.

Dan


#6

Kris wrote:

Would this allow Ruby to be compiled to CIL?

It would be the ultimate goal of the project as far as I understand. Of
course you would still need a run time as well.

Oh, and don’t expect 100% feature completeness too soon. There are a lot
of edge cases that are hard to get right.

But it might happen and I hope it eventually will happen.


#7

I guess a Rails app would be more difficult to compile…
Is CIL reversible back to code like Java bytecode is? And is it suited
to keeping a decryption key from prying eyes, like you would with a
native binary?

Florian Groß wrote:

Kris wrote:

Would this allow Ruby to be compiled to CIL?

It would be the ultimate goal of the project as far as I understand. Of
course you would still need a run time as well.

Oh, and don’t expect 100% feature completeness too soon. There are a lot
of edge cases that are hard to get right.

But it might happen and I hope it eventually will happen.


#9

On 5/21/06, Kris wrote:

Would this allow Ruby to be compiled to CIL?

I think a better tack might be to make a JIT compiler into CIL. This
would allow things like Ruby’s eval to continue to function.
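
A tiny illustration of the problem eval poses for a purely ahead-of-time compiler. The helper name is invented; the source string could just as well come from a file or from user input.

```ruby
# The code to run is assembled while the program is running, so it does
# not exist yet when an ahead-of-time compiler processes this file.
def build_adder_source(n)
  "lambda { |x| x + #{n} }"
end

source = build_adder_source(5)
add_five = eval(source)

p source
p add_five.call(10)
```

Something has to turn that string into executable code at run time, whether that is a JIT, an embedded compiler, or an interpreter shipped alongside the compiled code.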


#10

Kris wrote:

I guess a Rails app would be more difficult to compile…
Is CIL reversible back to code like Java bytecode is? And is it suited
to keeping a decryption key from prying eyes, like you would with a
native binary?

A Rails app wouldn’t be any more difficult than any other Ruby app. Once
you’ve ported all the non-Ruby libraries Rails uses (database bindings etc.)
to .NET, it would be just as easy as any other app. It’s just Ruby code.

I believe there is a .NET disassembler, so you could inspect the
assemblies.


#11

Kris wrote:

I guess a Rails app would be more difficult to compile…

Once you have support for most of the language it ought to be possible.

Is CIL reversible back to code like Java bytecode is?

Yes, as is any other byte code format.

And is it suited
to keeping a decryption key from prying eyes, like you would with a
native binary?

I don’t understand how a decryption key would be kept from prying eyes
by a native binary. Wouldn’t you still be able to extract it?


#12

I found a one-page HTML document detailing how to decompile CIL back to
source like you can with Java bytecode… It’s just too easy to do!
Getting a key out of a native binary can be made difficult, though not
impossible. With bytecode it’s just all too easy.
Really, we need encrypted Ruby code, but there seems to be resistance to
the idea in the Ruby community…

Florian Groß wrote:

Kris wrote:

I guess a Rails app would be more difficult to compile…

Once you have support for most of the language it ought to be possible.

Is CIL reversible back to code like Java bytecode is?

Yes, as is any other byte code format.

And is it suited
to keeping a decryption key from prying eyes, like you would with a
native binary?

I don’t understand how a decryption key would be kept from prying eyes
by a native binary. Wouldn’t you still be able to extract it?


#13

On 5/22/06, Kris L. wrote:

I found a one-page HTML document detailing how to decompile CIL back to
source like you can with Java bytecode… It’s just too easy to do!
Getting a key out of a native binary can be made difficult, though not
impossible. With bytecode it’s just all too easy.
Really, we need encrypted Ruby code, but there seems to be resistance to
the idea in the Ruby community…

In some sense I suppose encrypting code is against the spirit of Ruby,
but from my perspective the problem is more of practicality. The basic
issue is that if the computer can decrypt the code to run it, then
someone else can as well. You will have to embed the key somewhere in
the binary, and it would be trivial to run the interpreter in a
debugger to see where the key was hidden.
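
A toy sketch of the point. XOR stands in for a real cipher here and the key is invented; what matters is where the key lives, not how strong the cipher is.

```ruby
# Hypothetical key. It has to ship inside the program so that the
# loader can decrypt the code before running it.
KEY = "s3cret".bytes

# Toy XOR "cipher"; applying it twice with the same key restores the input.
def xor_bytes(data, key)
  data.bytes.each_with_index.map { |b, i| b ^ key[i % key.size] }.pack("C*")
end

# What the vendor distributes: scrambled source plus a loader that knows KEY.
plain_source = 'puts "licensed feature"'
shipped_blob = xor_bytes(plain_source, KEY)

# What the loader must do at startup, and what anyone holding the same
# program can do too, since the key and the routine are both in their hands.
recovered = xor_bytes(shipped_blob, KEY).force_encoding("UTF-8")

p shipped_blob == plain_source
p recovered == plain_source
```

Swapping in a stronger cipher changes nothing about this structure: the decryption step and its key still sit on the user's machine.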

This is the same problem big media is running against with their
incessant and obsessive fight against supposed “copyright
infringement” with DRM technologies. At some point the data has to be
unencrypted for humans to see, read or hear it. Unless of course we
all have special chips embedded in our brains that allow us to only
see or hear content specifically licensed to us. That is the next
logical step. Be prepared for the introduction to Congress of the
Omnibus Verifying Entertainment Revenue with Licensing Or Restriction
of Data Act (aka the OVERLORD Act) sometime in 2008. Your chip awaits.
</end_copyright_tirade>

Ryan


#14

Pascal H. wrote:

Hi folks!

I discovered Ruby a few months ago and liked it within minutes.
Then I got involved in a project using ASP.NET. At that point I had a
dream :wink: Ruby that works on .NET (once you know Ruby, neither VB nor C#
fits!).
I found some material, but nothing usable or 100% .NET, so I took
some time and decided to create a Ruby compiler.

Here it is… well, at least the first version which is early alpha.

Get it, try it, read it, and tell me what you think. Is this project
worth continuing?

You can read some info about it and also download it here:
http://mortimer.devcave.net/projects/brite

Cheers,

Pascal H.

Have you seen IronPython yet?

It’s Python with .NET and it’s being sponsored by Microsoft.

http://www.ironpython.com/

Here is a video of it.
http://msdn.microsoft.com/msdntv/episode.aspx?xml=episodes/en/20051110PythonJH/manifest.xml

Pretty nice!!


#15

On 5/22/06, Kris L. wrote:

Encryption would offer quite a bit of protection. You could hide a key
well: not impossible to find, but enough to make it easier to write the
app from scratch than to go to the trouble of stealing source code.
:slight_smile:

I feel stolen code is a legal problem, not a technical one. For one
thing, as I have argued, it is a very hard if not impossible problem
to solve with a language like Ruby (and most other languages.) But the
same aspects of Ruby that make your code available to steal also make
it easy to see who has stolen your code. At that point you can put
some legal hurt on the parties that have stolen the code.

Whereas if you build some system for encrypting the code that is
actually easily broken, people can steal your code and you may never
suspect it, since you will think your code is safe.

In addition, with our current world being full of open source projects
for just about every conceivable thing, the “value” of source code has
been greatly reduced. In fact if anything there needs to be more “code
stealing” in the form of reuse because everyone always seems to want
to reinvent the wheel (at least in open source software.)

If you really feel your code is valuable then you need to make it so
that the code is never on the client machine, which means some kind of
online service, like a web-site or SOAP API.

Ryan


#16

If it’s against the spirit of Ruby then that makes it less commercially
usable, since code can’t be distributed in a closed way. I do hear a lot
of resistance to encrypted Ruby because people are just self-hosting apps,
which is fine, but…

Encryption would offer quite a bit of protection. You could hide a key
well: not impossible to find, but enough to make it easier to write the
app from scratch than to go to the trouble of stealing source code.
:slight_smile:

Ryan L. wrote:

On 5/22/06, Kris L. wrote:

I found a one-page HTML document detailing how to decompile CIL back to
source like you can with Java bytecode… It’s just too easy to do!
Getting a key out of a native binary can be made difficult, though not
impossible. With bytecode it’s just all too easy.
Really, we need encrypted Ruby code, but there seems to be resistance to
the idea in the Ruby community…

In some sense I suppose encrypting code is against the spirit of Ruby,
but from my perspective the problem is more of practicality. The basic
issue is that if the computer can decrypt the code to run it, then
someone else can as well. You will have to embed the key somewhere in
the binary, and it would be trivial to run the interpreter in a
debugger to see where the key was hidden.

This is the same problem big media is running against with their
incessant and obsessive fight against supposed “copyright
infringement” with DRM technologies. At some point the data has to be
unencrypted for humans to see, read or hear it. Unless of course we
all have special chips embedded in our brains that allow us to only
see or hear content specifically licensed to us. That is the next
logical step. Be prepared for the introduction to Congress of the
Omnibus Verifying Entertainment Revenue with Licensing Or Restriction
of Data Act (aka the OVERLORD Act) sometime in 2008. Your chip awaits.
</end_copyright_tirade>

Ryan


#17

On Tue, 23 May 2006 03:04:23 +0900, “Ryan L.” wrote:

Whereas if you build some system for encrypting the code that is
actually easily broken, people can steal your code and you may never
suspect it, since you will think your code is safe.

Not to mention that they could use the same obfuscation techniques to
hide the fact that they misappropriated code in the first place.

-mental


#18

On Tue, 23 May 2006 03:24:03 +0900, Kris L. wrote:

It’s not just having code stolen but the fact that it can be modified so
that it dumps its data to a file or the screen… If you’re dealing with
sensitive data, that’s a problem.

If the data is sensitive, why are you putting it on an unsecured
machine?

Obfuscation might stop the honest user who’s just curious, but if you’re
dealing with a malicious user (especially if they have a financial
incentive) then sending the data to a machine they control is simply a
free gift.

-mental


#19

Kris L. wrote:

I found a one-page HTML document detailing how to decompile CIL back to
source like you can with Java bytecode… It’s just too easy to do!
Getting a key out of a native binary can be made difficult, though not
impossible. With bytecode it’s just all too easy.
Really, we need encrypted Ruby code, but there seems to be resistance to
the idea in the Ruby community…

No, I think you’re missing the point.

Bring the issue up on sci.crypt and see what they say.

Hal


#20

Kris L. wrote:

It’s not just having code stolen but the fact that it can be modified so
that it dumps its data to a file or the screen… If you’re dealing with
sensitive data, that’s a problem.

Ok. What we have here is 3 separate problems:

  • Data security. You don’t want unauthorized people to be able to get
    hold of the data.
  • Code security. You don’t want unauthorized people to be able to get
    hold of the source (or a usable representation of it).
  • Code validity. You don’t want untrusted code running at all.

The first is addressed by classical computer security measures. The last
requires trusted hardware. The second is technically impossible but
legally enforceable. There may be mileage in putting enough of a
barrier up to wave the DMCA in people’s faces if they cross it (in
addition to whatever contractual/copyright/licencing system you put
around your code), but that doesn’t deal with the problem of finding out
when they have.

I feel I should also point out that despite how easy it is to reverse
CIL and Java bytecode, there is a thriving commercial market in software
dealing with sensitive data for both those platforms.