Ruby Forum Ruby-core > sandbox API

Posted by _why (Guest)
on 25.04.2008 00:13
(Received via mailing list)
Hi, everybody.

In the #ruby-core design meeting, during the discussion about MVM,
there was some mention of the sandbox API.  I thought it would be
worthwhile to write up an RCR.  I mean: although there has been
some talk about the sandbox extension for Ruby 1.8 on this list,
there hasn't been any talk about the API itself.

Considering that $SAFE has fallen out of use and there is a renewed
interest in managing many namespaces/environments on a single VM,
I figured hey.

ABSTRACT

Ruby of yore has only had one interpreter environment.  The sandbox
API gives that central environment a means of creating other
in-process environments for executing code, be it restricted
sandboxes for running unsafe code or fully-featured sandboxes to
offer a clean namespace.

PROS & CONS

The benefits of this particular API:

  * Rather simple (yeah?)
  * Basic (albeit unstable) extensions exist for Ruby 1.8 and JRuby.
  * Patterned after other successful sandboxes (such as Firefox's
    XPCNativeWrapper[1] and Io's Sandbox[2])
  * Generic enough to work in other Ruby impls.

The drawbacks are:

  * Not fully proven on Ruby 1.8.
  * My extension does rely on Thread.kill! to stop a Sandbox,
    which is taboo.  (Same problem timeout.rb has.)
  * Haven't worked out how tainting could play out.
  * Could be more closely coupled with threading to offer concurrent
    interps in separate threads.

THE API

All classes and methods are enclosed in the Sandbox module.

The primary classes are Sandbox::Full and Sandbox::Safe.
Sandbox::Safe is descended from Sandbox::Full.

Methods for these two classes are:

  * self.new(opts = {})
    Returns a newly created sandbox.
    Available options: :init, :ref

  * eval(str, opts = {}) => obj
    Evaluates +str+ as Ruby code inside the sandbox
    and returns the result.
    Available options: :timeout

  * load(io, opts = {}) => nil
    At heart, just an alias for: eval(IO.read(io), opts)

  * ref(klass) => nil
    Adds a boxed reference to +klass+ in the sandbox.
    (Ex.: @box.ref(YAML) would create a YAML class in the
    sandbox which is derived from Sandbox::BoxedClass, a
    proxy to the YAML class on the outside.)

  * require(str)
    Requires a file into the Sandbox, using the $LOAD_PATH and
    file permissions of the current sandbox.

The Sandbox module itself has a few methods:

  * Sandbox.safe(opts = {})
    An alias for Sandbox::Safe.new(opts)

  * Sandbox.new(opts = {})
    An alias for Sandbox::Full.new(opts)

  * Sandbox.current
    Returns an object representing the current Sandbox.

  * Sandbox.screen(obj) => true or Sandbox::ScreenException
    Traverses an object and its related symbols to be sure
    it is entirely composed of objects from the current
    sandbox.  Purely for testing.

As for the `opts` hash in the above methods, here's a brief
description of those:

  * init: The portions of Ruby core to initialize.
      :load - $:, $-I, $LOAD_PATH, $", $LOADED_FEATURES,
              load, require, autoload, autoload?
      :io - IOError, EOFError, IO, FileTest, File, Dir,
            File::Constants, test, File::Stat
      :env - syscall, open, printf, print, putc, puts,
             gets, readline, getc, select, readlines,
             p, display, STDIN, STDOUT, STDERR
      :real - abort, at_exit, caller, exit, trace_var,
              untrace_var, set_trace_func, warn, ThreadError,
              Thread, Continuation, ThreadGroup, trap,
              exec, fork, exit!, system, `, sleep, Process,
              Process::Status, Process::Sys, GC,
              ObjectSpace, hash, __id__, object_id
      :all - the whole enchilada
      (Sandbox::Full assumes :init => :all and Sandbox::Safe
      assumes :init => nil.)

  * ref: Classes to create boxed references for.
    (Ex.: :ref => [RedCloth, BlueCloth])

  * timeout: Maximum number of seconds an eval may run in the
    sandbox before it is stopped.
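
A rough sketch of how a :timeout option might be honored, leaning on
timeout.rb as mentioned in the drawbacks.  The toy_eval helper is
hypothetical and does no sandboxing at all; it only shows the shape of
the option handling.

```ruby
require 'timeout'

# Hypothetical helper, NOT part of the proposed API and NOT a sandbox:
# it evaluates +str+ at the top level and gives up after opts[:timeout]
# seconds.  Timeout.timeout(nil) runs the block with no time limit.
def toy_eval(str, opts = {})
  Timeout.timeout(opts[:timeout]) { eval(str, TOPLEVEL_BINDING) }
end

puts toy_eval("1 + 1", :timeout => 1)

begin
  toy_eval("sleep 5", :timeout => 0.1)
rescue Timeout::Error
  puts "timed out"
end
```

A real implementation would also have to tear down whatever the
evaluated code left behind, which is where Thread.kill! comes in.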

BOXED CLASSES

Inside each Sandbox, a BoxedClass constant is defined.  This class
has two methods: method_missing and const_missing.

So, let's say you're running a web app in the sandbox.  And you
want it to speak to Mongrel in the main interp.  Imagine a
MongrelConnector class that acts as medium between the two.

  -- master.rb --
  require 'mongrel'

  class MongrelConnector
    def self.each
      str = yield ""
      # send str to mongrel
    end
  end

  box = Sandbox.safe
  box.load 'web.rb'
  box.ref MongrelConnector
  box.eval 'start'

  -- web.rb --
  def start
    MongrelConnector.each do |cgi|
      cgi << "hallo!"
    end
  end

Inside the sandbox (where web.rb is running), the MongrelConnector
class is a BoxedClass.  When `each` is called, method_missing
switches sandboxes and runs the method on the class outside the box.
When method_missing gets an answer back, it switches back inside the
sandbox and returns an answer.
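
The forwarding described above can be imitated in plain Ruby.  This is
only a sketch: ToyBoxedClass is a made-up stand-in for
Sandbox::BoxedClass, and there is no actual switching between
interpreter environments, just a comment where it would happen.

```ruby
# Hypothetical stand-in for Sandbox::BoxedClass; not the extension's code.
class ToyBoxedClass
  def initialize(target)
    @target = target # the real class living "outside" the box
  end

  # Forward any unknown method to the outside class, block included.
  def method_missing(name, *args, &block)
    return super unless @target.respond_to?(name)
    # A real sandbox would switch to the outer environment here ...
    result = @target.public_send(name, *args, &block)
    # ... and switch back inside before handing over the answer.
    result
  end

  def respond_to_missing?(name, include_private = false)
    @target.respond_to?(name, include_private) || super
  end
end

class MongrelConnector
  def self.each
    str = yield String.new
    str # stand-in for "send str to mongrel"
  end
end

boxed = ToyBoxedClass.new(MongrelConnector)
reply = boxed.each { |cgi| cgi << "hallo!" }
puts reply # prints "hallo!"
```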

For primitive data, such as numbers and strings and floats which
have no instance variables, the data is marshalled.  For other
objects, a Sandbox::Ref is received.  Both inside and outside the
sandbox, a Sandbox::Ref points to data not inside the current
sandbox.  This ref also has a method_missing, which works just like
BoxedClass' method_missing.

It is not allowed to pass a Sandbox::Ref for an object whose class
is not referred to in the receiving sandbox.  So, if, for some
reason, a method call tries to return an IO object to a sandbox and
no IO class is defined (and properly ref'd), a
Sandbox::TransferException is thrown.
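
The transfer rules above (marshal primitives, hand out a ref for
objects whose class is ref'd, refuse everything else) could be sketched
like this.  All names here (transfer, ToyRef, ToyTransferError,
PRIMITIVES) are invented for illustration, and the list of primitive
classes is an assumption.

```ruby
# Hypothetical names throughout; only the rule itself comes from the post.
class ToyTransferError < StandardError; end

# Stand-in for Sandbox::Ref: holds an object that lives in another sandbox.
ToyRef = Struct.new(:target)

# Assumed set of "primitive" classes that are copied rather than ref'd.
PRIMITIVES = [Integer, Float, String, Symbol,
              NilClass, TrueClass, FalseClass].freeze

# Decide how +obj+ crosses into a sandbox whose ref'd classes are +refd+.
def transfer(obj, refd)
  if PRIMITIVES.any? { |k| obj.instance_of?(k) } && obj.instance_variables.empty?
    Marshal.load(Marshal.dump(obj)) # copied by value
  elsif refd.any? { |k| obj.is_a?(k) }
    ToyRef.new(obj)                 # passed by reference
  else
    raise ToyTransferError, "#{obj.class} is not ref'd in the receiving sandbox"
  end
end

copy = transfer("hallo", [])     # a marshalled copy of the string
ref  = transfer([1, 2], [Array]) # a ToyRef wrapping the array

begin
  transfer($stdout, [])          # IO is not ref'd, so this is refused
rescue ToyTransferError => e
  puts e.message
end
```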

THE PRELUDE

Beyond the API, it is also required that the Sandbox run versions of
common methods which are not exploitable.  For example, the freaky
freaky sandbox has a lib/sandbox/prelude.rb which includes a pure
Ruby version of the `**` method, since very large exponents can lock
the interpreter up in C.
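
To give a feel for what such a prelude method might look like, here is
a sketch of a pure Ruby power function that refuses to grow past a
fixed size.  This is not the code from lib/sandbox/prelude.rb; safe_pow,
ToyPowerTooLarge and the 4096-bit limit are all assumptions.

```ruby
# Hypothetical prelude-style guard; the limit and the names are made up.
MAX_RESULT_BITS = 4096

class ToyPowerTooLarge < StandardError; end

# Exponentiation by squaring for non-negative Integer exponents, giving
# up as soon as an intermediate value outgrows MAX_RESULT_BITS.
def safe_pow(base, exp)
  unless base.is_a?(Integer) && exp.is_a?(Integer) && exp >= 0
    raise ArgumentError, "integer base and non-negative integer exponent only"
  end
  result = 1
  while exp > 0
    result *= base if exp.odd?
    exp >>= 1
    base *= base if exp > 0
    if result.bit_length > MAX_RESULT_BITS || base.bit_length > MAX_RESULT_BITS
      raise ToyPowerTooLarge, "result would exceed #{MAX_RESULT_BITS} bits"
    end
  end
  result
end

puts safe_pow(2, 10) # prints 1024

begin
  safe_pow(2, 10**9)
rescue ToyPowerTooLarge => e
  puts e.message
end
```

Because the check runs in Ruby on every step, the loop stays short and
interruptible instead of disappearing into one long C call.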

AND DONE

That's it for now.  I'm not an extreme zealot of this API, so I'd be
glad to alter it or scrap it.  But it has evolved through trial and
error, based on xp points awarded during Try Ruby and the
sandboxed wiki[3].

Thank you for your generous attentions.

[1] http://developer.mozilla.org/en/docs/XPCNativeWrapper
[2] http://iolanguage.com/scm/git/checkout/Io/docs/IoReference.html
[3] http://redhanded.hobix.com/inspect/howToLetAnyoneElseFinishYourWiki.html
Posted by Mental Guy (mental)
on 25.04.2008 00:30
(Received via mailing list)
On Fri, 25 Apr 2008 07:12:39 +0900, _why <why@ruby-lang.org> wrote:
>   * My extension does rely on Thread.kill! to stop a Sandbox,
>     which is taboo.  (Same problem timeout.rb has.)

I don't think this is a big problem in practice -- even though
Thread.kill! is "more evil" than even Thread.kill, we're destroying
any evidence of malfeasance when we tear down the Sandbox VM.

Anyway, using Thread.kill! is an implementation detail which
shouldn't reflect badly on the proposed API itself.

-mental
Posted by Tadashi Saito (Guest)
on 28.04.2008 03:06
(Received via mailing list)
Hi,

On Fri, 25 Apr 2008 07:12:39 +0900
_why <why@ruby-lang.org> wrote:

>   * eval(str, opts = {}) => obj

I think eval(string) is <del>evil or</del> too ugly and takes more time
especially in 1.9.  It should take a block instead.

  * eval(opt = {}, &block) => obj

like:

   box.eval {start}
Posted by Nathan Weizenbaum (nex3)
on 28.04.2008 03:34
(Received via mailing list)
I was under the impression that (part of) the purpose of the sandbox was
to run untrusted Ruby code within the context of a larger Ruby
application. I'd imagine that a large portion of the time, this code
enters the application as a string - for example, Try Ruby presumably
accepts strings from the web interface and passes them through this
method to get the result.

I think you're right that if you know what code you're going to be
evaluating when you write the call to eval, passing it as a block would
be preferable. I think the best way to balance this would be to allow
both, like instance_eval.

I imagine if eval only took a block, we'd see a lot of code like

box.eval { eval(str) }
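
A sketch of the "allow both, like instance_eval" idea.  ToyBox is a
hypothetical class that offers no isolation whatsoever; it only shows
the calling convention, and it leaves out the opts hash for brevity.

```ruby
# Hypothetical toy; it does NOT sandbox anything.
class ToyBox
  def initialize
    @scope = Object.new # a fresh object to act as self for evaluated code
  end

  # Accepts a string or a block, but not both, the way instance_eval does.
  def eval(str = nil, &block)
    if str && block
      raise ArgumentError, "pass a string or a block, not both"
    elsif str
      @scope.instance_eval(str)
    elsif block
      @scope.instance_eval(&block)
    else
      raise ArgumentError, "nothing to evaluate"
    end
  end
end

box = ToyBox.new
box.eval("def start; 'hallo!'; end")
puts box.eval("start")  # string form
puts box.eval { start } # block form
```
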
Posted by _why (Guest)
on 30.04.2008 07:34
(Received via mailing list)
On Mon, Apr 28, 2008 at 10:06:05AM +0900, Tadashi Saito wrote:
> I think eval(string) is <del>evil or</del> too ugly and takes more time
> especially in 1.9.  It should take a block instead.

The sandbox takes the "i" out of eval.  A block would be nice, too,
except that I haven't figured out how to change the block's scope so
that it can't reference anything unsafe from its original habitat.
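
The scoping problem is easy to reproduce in plain Ruby.  In this
sketch (the names secret, scope and clean_binding are made up),
instance_eval changes self for a block but the block can still read
locals from where it was written, while a string evaluated against a
fresh binding cannot.

```ruby
secret = "outside the box"

scope = Object.new

# instance_eval swaps self, yet the block still closes over `secret`.
leaked = scope.instance_eval { secret }
puts leaked # prints "outside the box"

# A method body starts with its own empty set of local variables.
def clean_binding
  binding
end

# The string form, evaluated against that binding, sees no `secret`.
hidden = eval("defined?(secret)", clean_binding)
puts hidden.inspect # prints nil
```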

And Nathan makes some good points too.  But thank you for caring
about this little idea!

_why
Posted by Mental Guy (mental)
on 30.04.2008 21:28
(Received via mailing list)
On Wed, 30 Apr 2008 14:32:22 +0900, _why <why@ruby-lang.org> wrote:
> On Mon, Apr 28, 2008 at 10:06:05AM +0900, Tadashi Saito wrote:
>> I think eval(string) is <del>evil or</del> too ugly and takes more time
>> especially in 1.9.  It should take a block instead.
> 
> The sandbox takes the "i" out of eval.  A block would be nice, too,
> except that I haven't figured out how to change the block's scope so
> that it can't reference anything unsafe from its original habitat.

I think it would be necessary to extract the block's parse tree and
use that to construct a new block within the context of the sandbox.
That would of course require that implementors keep the block parse
tree around...

The main downside with a string eval in this case is that the string
must be parsed each time.  Accordingly, avoiding that parsing overhead
would be the main benefit of using a block.  Beyond that, I'm not sure
I see much advantage.  Proxied method calls on wrapped objects are
realistically going to be the main communication method with sandboxes,
so that case needs to be optimized more than eval does.

-mental
Posted by Rocky Bernstein (Guest)
on 30.04.2008 22:01
(Received via mailing list)
On Wed, 30 Apr 2008 14:32:22 +0900, _why <why@ruby-lang.org> wrote:

>   On Mon, Apr 28, 2008 at 10:06:05AM +0900, Tadashi Saito wrote:
> > I think eval(string) is <del>evil or</del> too ugly and takes more time
> > especially in 1.9.  It should take a block instead.
>
>   The sandbox takes the "i" out of eval.  A block would be nice, too,
>   except that I haven't figured out how to change the block's scope so
>   that it can't reference anything unsafe from its original habitat


At the risk of promoting the idea of giving someone enough rope to hang
him/herself, in Ruby 1.8 there's Kernel#binding_n
<http://bashdb.sourceforge.net/ruby-debug.html#SEC78> from
ruby_debug.