Code Review: NewInterpreterAndRubyCF

grujildo · May 1, 2009, 11:05pm

Fixes bugs in the new interpreter and implements more features to
support IronRuby.

```
 Control flow
```

a) Branches
One of the most complicated cases is:

{
f(1, 2, try { g(3, 4, try { goto L } finally { … }, 6) } finally {
… }, 7, 8)
L: …
}

The goto expression here jumps to label L while having 4 items on
evaluation stack. The jump needs to execute both finally blocks, the
first one on stack level 4 the second one on stack level 2. So, it needs
to jump the first finally block, pop 2 items from the stack, run second
finally block and pop another 2 items from the stack and set instruction
pointer to label L.

A goto expression can jump out of any node. It just pops the arguments
that the nodes in between it and the target node pushed on the
evaluation stack. The goto expression can jump into a node that
evaluates arguments only if it carries a value and jumps right after the
first argument (the carried value will be used as the first argument).
Goto can jump into an arbitrary child of a BlockExpression since the
block doesn’t accumulate values on evaluation stack as its child
expressions are evaluated.

Stack depth of every instruction can be calculated in a single pass as
instructions are generated. The following invariants always holds:

StackDepth(instr[n]) = sum{i=0…n-1}(instr[i].ProducedStack -
instr[i].ConsumedStack). [1]

This wasn’t true for previous implementation of stack depth calculation
(previously unconditional branches set stack depth to -1 and exception
handling tweaked stack depth explicitly so that it didn’t correspond to
the generated instructions).

ProducedStack and ConsumedStack properties are obvious for most
instructions. Branching instructions (BranchFalse, BranchTrue, Branch,
Goto, and Throw) are interesting though.
BranchFalse/True and Branch are simple branches that are used internally
by the interpreter to implement branching in logical operations, switch
etc. They can only jump within a block/expression to a target with the
same stack depth.

Branch instruction can carry a value over to the target label (flag
“HasValue” is true). It can also have a “result” (flag “HasResult” is
true). Branch has a result if used in a place that expects a non-void
expression. For example, in a conditional expression: “ ?
branch : ”. It doesn’t have a result if it is used in a block: {
…, branch, …}. We set ProducedStack == HasResult ? 1 : 0 and
ConsumedStack == HasValue ? 1 : 0.

Goto instruction is a subclass of Branch instruction that allows jumping
across blocks to the targets of a different stack depth than it is
itself. It also handles finally blocks. Each try-finally expression
compilation creates a list of gotos that jump out of the try body or
catch handlers. Goto expression compilation adds the produced goto
instruction into the current list. The list is stored in a stack so that
nested try-expressions are handled correctly. Compilation of finally
clause enumerates and marks all gotos that were added to the current
list so that they know all finally clauses to execute on jump.

Similarly, gotos jumping from exception handlers are tracked and marked
by another flag. This flag used for re-aborting a thread (see below) on
a jump from a catch handler.

b) Exception Handling

Throw instruction’s ProducedStack == HasResult ? 1 : 0, where HasResult
has the same meaning as for Branch instruction. Throw consumes the
exception being thrown thus ConsumedStack == 1.
A catch exception handler injects the caught exception into the
evaluation stack before executing the catch block. We need a special
no-op instruction EnterExceptionHandler to preserve the invariant [1].
Any GotoExpressions jumping from try or catch blocks are implemented
using GotoInstruction.
There are no jumps allowed from finally block -
FinallyFlowControlExpression reducible node rewrites the tree if there
are any such jumps.

To handle ThreadAbort exception correctly we also need
LeaveExceptionHandler instruction injected at the end of each catch
handler. It checks if the current thread has been aborted and if so
re-aborts the thread. Otherwise it jumps to the finally clause. The
current thread also needs to be re-aborted if code jumped out of the
first catch block that caught TA using goto instruction.

```
 Closure initialization
```

Local variables in a closure need to be initialized. A new StongBox
needs to be allocated each time the execution enters the variable’s
scope. The shelveset adds InitializeLocalInstruction specialized for
reference types (no-op), immutable value types (primitive numeric types,
date time, etc.), StrongBox of immutable value type, potentially mutable
value types and their StrongBoxed version. A variable of an immutable
struct type could be initialized by a singleton of its boxed default
value. We don’t need to allocate a new object each time. A mutable
struct needs to be initialized every time via Activator.CreateInstance.

 Instruction.ToString and Instructions.DebugView

Wraps instruction list into “Instructions” class that derives from
List and provides a custom DebugView. The debug view shows
stack depths of all instructions.
Adds InstructionName virtual property on Instruction that could be
overridden by subclasses to display instruction names.
Adds LocalAccessInstruction that all local variable handling
instructions derive from. The instruction holds on a name of the local
variable if compiled in DEBUG mode. It displays the name in ToString
along with the index of the local.

```
 Equality comparison
```

Implements equality comparison for all primitive numeric types. A simple
micro-benchmark shows that EqualityComparer.Equals() is 1.5 to
2-times slower than operator “==”:

public override bool Eq(object a, object b) {
return (ushort)a == (ushort)b;
}

public override bool Eq(object a, object b) {
return eq.Equals((ushort)a, (ushort)b);
}

for (int i = 0; i < N; i++) { instr.Eq(x, y); }

So I chose to add an instruction per primitive type.

```
 Misc
```

Renamed StackFrame to InterpretedFrame to avoid conflicts with
System.Diagnostics.StackFrame class.
Removes -D -O -X:SaveAssemblies options from Run0 so that it runs with
adaptive compilation on.
Replaces “Instruction GetInstruction” with “void AddInstructions” in
IInstructionProvider interface - nodes can/do translate to multiple
instructions.

Ruby changes

     Some control flow refactoring. Added reducible nodes that are

also instruction providers and handle stack tracing in both interpreted
and compiled modes.

     Each Ruby frame now builds stack trace for any exception that

goes thru it (unless the trace has already been built for that
exception), so there is no need to explicitly do that anymore.

     Adds unit tests for interpreter - these should get moved to

DLR eventually.

     Control flow refactoring and fixes in jumps from eval.

```
     Fixes bug 
```

http://ironruby.codeplex.com/WorkItem/View.aspx?WorkItemId=572 and other
eval issues related to control flow.

Tomas