Using binding + set_trace_func to capture execution state

aris · July 3, 2012, 3:11am

Hi guys, I’m interested in building a program that will display the
call graph trace of a program and at the same time will allow you to
inspect the variables in your program after the execution has ended
(as opposed to using a debugger to halt execution and inspect the state).

Initially I thought I could use set_trace_func and store the binding in
a table and just retrieve that binding later on to evaluate the
resulting expression of my target variable, but it’s not behaving as I
expected. In the code provided below, I thought running it would output
“1 3 5”, but I get “5 5 5” instead.

What am I missing here? Is there another way to go about what I’m trying
to do? Thanks!

binding_table = {}

def hello
  x = 1 # line no 4
  x = 3 # line no 5
  x = 5 # line no 6
end

set_trace_func proc { |event, file, line, id, binding, classname|
  if event == "line"
    binding_table[line] = binding
  end
}

hello

set_trace_func nil

puts eval("x", binding_table[4])
puts eval("x", binding_table[5])
puts eval("x", binding_table[6])

reginald_t · July 3, 2012, 6:00am

Reginald T. wrote on 03.07.2012 05:11:

[...]

puts eval("x", binding_table[4])
puts eval("x", binding_table[5])
puts eval("x", binding_table[6])

Oh, it’s very simple. The ‘binding’ object denotes the variable scope.
The scope doesn’t change in this case; the values do.

You can think of a Binding this way (long story short, it’s implemented
very roughly like that in Rubinius):

class Binding
  attr_accessor :locals

  def initialize
    @locals = {}
  end
end

… and of variable access this way:

def func
  binding.locals[:x] = 1 # x = 1
  p binding.locals[:x]   # p x
  binding.locals[:x] = 2 # x = 2
end

Now the answer should be obvious.
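
To make the model above concrete, here is a minimal sketch in plain Ruby with no tracing involved. The method name capture_bindings is made up for illustration. It grabs a Binding after each assignment and only evaluates x once the method has returned. Since every captured Binding refers to the same scope, all three lookups see the final value.

```ruby
# Capture a Binding after each assignment, then inspect them afterwards.
def capture_bindings
  captured = []
  x = 1
  captured << binding
  x = 3
  captured << binding
  x = 5
  captured << binding
  captured
end

bindings = capture_bindings

# Each Binding object is distinct, but all of them point at the same
# set of local variables, so they all report the last value of x.
values = bindings.map { |b| b.eval("x") }
p values  # => [5, 5, 5]
```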

To achieve your goal you’d need to copy the state of the variables at each
tracefunc invocation. Even worse, as the objects themselves may change,
you will need to do a deep copy each time (otherwise you’ll run into
exactly the same problem with object instance variables). I would suggest
marshalling the objects and writing them to something like tmpfs, then
using a special tool to navigate the captured information.

This is going to consume a lot of memory. I repeat: a lot. Like tens
of gigabytes for a complex thing like… Sinatra. And probably terabytes
for Rails.
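
As a rough sketch of the difference between keeping references and keeping deep copies, the following uses TracePoint (the successor to set_trace_func in Ruby 2.0 and later, swapped in here on the assumption that a newer Ruby is available) and a Marshal round-trip as the deep copy. The method build_list and the arrays shallow and deep are hypothetical names. Note that a line event fires before its line runs, so each snapshot shows the state left by the previous line.

```ruby
def build_list
  list = []
  list << 1
  list << 3
  list << 5
end

shallow = []  # stores references to the live objects
deep    = []  # stores independent copies made via Marshal

trace = TracePoint.new(:line) do |tp|
  # Ignore line events outside the method we are interested in.
  next unless tp.method_id == :build_list
  value = tp.binding.local_variable_get(:list)
  shallow << value
  deep << Marshal.load(Marshal.dump(value))
end

trace.enable { build_list }

p shallow  # => [nil, [1, 3, 5], [1, 3, 5], [1, 3, 5]]
p deep     # => [nil, [], [1], [1, 3]]
```

The shallow log is rewritten by every later mutation, which is the "changing history" problem. The deep log keeps the intermediate states, at the price of serializing the object on every line event.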

reginald_t · July 3, 2012, 8:04am

Hmm, you’re right, it would not be efficient at all to store all those
variables especially in a large program.

I guess another possible approach is to rerun the program every time I
want to inspect a variable. Since I know which file and line the
variable is on, I can use that as a condition in the set_trace_func
handler to decide when to output that variable.

set_trace_func proc { |event, file, line, id, binding, classname|
  if event == "line" && file == "myfile.rb" && line == 4
    puts eval("@myvar", binding)
  end
}
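
A minimal sketch of that rerun-and-probe idea, written with TracePoint rather than set_trace_func (an assumption that Ruby 2.3 or later is available). To keep it self-contained, the "program" is a three-line string evaluated under the made-up file name myfile.rb; the names source, probe, seen and target, as well as the probed line number, are hypothetical.

```ruby
# Stand-in for the program under inspection; evaluated as "myfile.rb".
source = <<~RUBY
  @myvar = 10
  @myvar += 5
  @myvar *= 2
RUBY

seen = []

# Only react to the one file/line combination we care about.
probe = TracePoint.new(:line) do |tp|
  if tp.path == "myfile.rb" && tp.lineno == 3
    seen << tp.self.instance_variable_get(:@myvar)
  end
end

target = Object.new
probe.enable { target.instance_eval(source, "myfile.rb", 1) }

p seen
```

Because the line event fires before line 3 runs, the recorded value should be the one left behind by line 2, not the result of line 3.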

reginald_t · July 3, 2012, 9:02am

On Tue, Jul 3, 2012 at 8:05 AM, Reginald T. wrote:

This is going to consume a lot of memory. I repeat: a lot. Like tens
of gigabytes for a complex thing like… Sinatra. And probably terabytes
for Rails.

Hmm, you’re right, it would not be efficient at all to store all those
variables especially in a large program.

Especially since you would have to store the complete graph of objects
reachable from every scope because the change might be in an object
referenced through some intermediate objects.

I guess another possible approach is to rerun the program every time I
want to inspect a variable. Since I know which file and line the
variable is on, I can use that as a condition in the set_trace_func
handler to decide when to output that variable.

What about debug output? You can make that pretty efficient:

if ENV["DEBUG"]
  def debug; $stderr.puts(yield) end
else
  def debug; end
end

debug { sleep 1; "Complex operation" }

Kind regards

robert

reginald_t · July 3, 2012, 7:13pm

Hi Robert, I’m not sure I understand what you mean. Isn’t that the same
as just inserting a puts statement in the code or putting logger debug
statements around the code? What I wish to do is to inspect the
variables at any line of execution without modifying the original
program.

reginald_t · July 3, 2012, 1:23pm

Have you tried somehow using continuations?

– Matma R.

reginald_t · July 3, 2012, 7:26pm

On Tue, Jul 3, 2012 at 7:13 PM, Reginald T. wrote:

What about debug output? You can make that pretty efficient:

if ENV["DEBUG"]
  def debug; $stderr.puts(yield) end
else
  def debug; end
end

debug { sleep 1; "Complex operation" }

Hi Robert, I’m not sure I understand what you mean. Isn’t that the same
as just inserting a puts statement in the code or putting logger debug
statements around the code?

Sort of, just a tad smarter.

What I wish to do is to inspect the
variables at any line of execution without modifying the original
program

Oh, that requirement wasn’t given as far as I can see. That’s
something different of course.

Why would the debugger not work for you? If you know the line (as you
said earlier) you can use that information to set a breakpoint and
inspect data when you’ve hit the breakpoint.

Can you explain what goal you are trying to achieve?

Cheers

robert

reginald_t · July 4, 2012, 12:21am

Why would the debugger not work for you? If you know the line (as you
said earlier) you can use that information to set a breakpoint and
inspect data when you’ve hit the breakpoint.

Can you explain what goal you are trying to achieve?

Sure. I want to create a tool that allows me to quickly become familiar
with 3rd party libraries or any existing legacy code base that I’ll be
maintaining.

ctags is pretty helpful in terms of looking at the complete method call
chain being executed, but it is not helpful in getting a big picture of
how my data is being transformed from one line to another.

Debuggers, on the other hand, would give me the flexibility of
inspecting variables, but I’ve experienced too much repetitive typing of
‘s’ for “step” and ‘n’ for “next”, and I find that experience really
frustrating. Moreover, since the Ruby debugger only allows you to move
forward, I would have to restart the debugging session if I want to
trace things backwards.

So I guess what I want is to combine the power of ‘ctags’ and
‘ruby-debug’, and create a tool that will output the call graph of a Ruby
program or method, wherein not only would I be able to see the source
code of each method call, I would also be able to inspect the variables
backwards or forwards, at any line of the execution.

I believe this will give me a big picture of a codebase and let me see
how one section of the code relates to another.

reginald_t · July 4, 2012, 1:22am

Bartosz Dziewoński wrote on 03.07.2012 15:23:

Have you tried somehow using continuations?

Class: Continuation (Ruby 1.9.3)

– Matma R.

A continuation is basically a copy of the current call stack, somewhat
akin to fork() but with cooperative multitasking. How would it help
here?

You cannot introspect continuations.

reginald_t · July 4, 2012, 2:53am

Hi Matma, I tried using the continuation approach here (“Attempt to use
continuation to inspect previous execution state”, on GitHub) but with
no luck. The output is still “5 5 5”. I guess that’s because I’m still
relying on binding to inspect the variable.

reginald_t · July 4, 2012, 10:49am

On Wed, Jul 4, 2012 at 9:57 AM, Bartosz Dziewoński wrote:

}

a = 5
stuff = 'asd'
a += 8

set_trace_func nil
pp $values_at_time

That still suffers from the issue of changes in referenced objects,
which will modify the stored state. Even adding to an Array will
change history. As I said earlier you would need to store a complete
object graph, something like

$values_at_time = {}

set_trace_func proc { |event, file, line, id, binding, classname|
  if event == "line"
    # Serialize a copy of every local variable visible at this line.
    locals = binding.eval("local_variables").inject({}) { |h, v|
      h[v] = binding.eval(v.to_s); h
    }
    $values_at_time[line] = Marshal.dump(locals)
  end
}

Kind regards

robert

reginald_t · July 6, 2012, 6:00am

Nice timing! For kicks, I just wrote a gem that emulates Java’s
behavior of printing stack traces when you hit ctrl-backslash. (The
idea is to debug programs that are stuck in an infinite loop or
something somewhere.) I tried to also print local and instance
variables but hit a wall; this may help with that.

It’s not really ready for prime time but you can check it out:

A

On Wed, Jul 4, 2012 at 1:49 AM, Robert K.

reginald_t · July 4, 2012, 9:57am

You can use the local_variables method to get a list of local
variables in current scope. So maybe something like this:

require 'pp'

$values_at_time = {}

set_trace_func proc { |event, file, line, id, binding, classname|
  if event == "line"
    # Record [name, value] pairs for every local variable at this line.
    $values_at_time[line] =
      binding.eval('local_variables.map { |v| [v, eval(v.to_s)] }')
  end
}

a = 5
stuff = 'asd'
a += 8

set_trace_func nil
pp $values_at_time

– Matma R.

reginald_t · September 11, 2012, 10:39am

On Mon, Sep 10, 2012 at 9:55 PM, Jonathan T. wrote:

That’s as far as I’ve gotten. It’s very rough and doesn’t handle a lot
of parses yet, but honestly, my biggest unknown now is how to present
the information to the user in a way he/she can interact with it.

You could throw out a prompt. Maybe IRB can help you with that. But
then again, why not use the debugger? Note that the OP’s goal was
different: he wanted to present the state after the execution. For
that, simply dumping state during execution might be sufficient. But
what you are trying to do sounds more like debugging (interaction
after the program has finished does not make much sense, I guess).

Kind regards

robert

reginald_t · September 10, 2012, 9:55pm

Reginald,

Have you made any progress with this?

I’ve been wanting to do a similar thing, so I’m curious how far you’ve
gotten. I have a rough prototype that uses set_trace_func to get the
local_variables and their values. Instead of storing the binding, I’m
just outputting the variables I care about within my set_trace_func
proc. I’m just using #inspect now, but I suppose any view would do.

To alleviate the problem of too much data, I’m simply whitelisting the
exact methods that I care about exploring. I’m not worrying too much
about this right now.

Also, I’ve found the class, method, and line number that set_trace_func
passes to the proc to be useful. I’ve found that the line number matches
the line number in Ripper’s s-expression output. I don’t know if this is
the case in general, but it’s worked in my proof of concept so far.
Combined with the binding and local_variables, I can then display the
source being executed with values substituted for variables.

That’s as far as I’ve gotten. It’s very rough and doesn’t handle a lot
of parses yet, but honestly, my biggest unknown now is how to present
the information to the user in a way he/she can interact with it.

I’d love to hear any thoughts or ideas of a better way to accomplish
this!

Jon

reginald_t · September 11, 2012, 10:22pm

I’m trying to avoid stepping through the code line by line. I’d like to
get a big-picture view if possible. Also, similar to the OP, I don’t
want to modify the code being inspected either.

Trying to boil it down to my main “feature” I’m trying to implement, I
think it’s removing all intermediate steps between writing code and
seeing its output. But not just its output – also how it got there.
Also, I personally find it tedious to navigate with the keyboard and
type the variable names I want to inspect in the exact spot I want to
inspect them. So I want to be able to avoid that. I’d much rather have
some kind of overlay on top of my source that I can just glance at or
maybe mouse-over.

With current tools, my co-workers and I end up using things like
autotest or guard to detect when a file is saved and then re-run the
tests. The test output is displayed in a terminal, optionally reported
to Growl. When it gets slow loading all those gem dependencies, we use
spork.

This is all fine. I can edit files. I can point my Gemfile to a dir with
:path and add debugger/pry breakpoints or prints, or even edit the
installed gem if needed. I can inspect any value. I just feel it’s
inconvenient. For one, I have to be sure to remove all the breakpoints
b/c they should never make it to production, but there are other issues.
So I thought I’d try to experiment with something else.

This may not be for everyone, but I feel that my ideal development
workflow needs continual evaluation and to reflect the source code back
to me with test values filled in. It doesn’t need to capture all the
data of all the execution flow. As someone pointed out, that would be a
lot of data, and not very useful.

Also, a test (unit, functional, etc.) will print out a bad value that is
asserted. However, I specifically avoid assertions on intermediate
values which are implementation-dependent, because they make tests
brittle. But these intermediate values are often very useful in fixing
the final result. So I’d like to be able to easily see the
intermediate values without having to manually add and remove inspects
or step through the debugger every time.

Anyway, I know it helps to understand a person’s context and motivation
when helping them. So that’s mine. It’s more of a hunch of what I think
might be useful. And I’m trying to make it to find out for sure.

I obviously don’t want to re-implement the Ruby interpreter, and I’d
like to avoid patching it. So when I found set_trace_func, it seemed to
be a good match. However, it calls your proc with file and line number.
This may be fine for a traditional debugger that simply echoes back your
source as a string. But I want to actually parse the code so that I can
do interesting things with it. set_trace_func’s “line” event doesn’t map
perfectly to the parsed AST. The most recent case I’ve run into is an
if-else-end statement that spans multiple lines but is used as an
expression to assign to a variable.

1: result = if some_true_val
2:   'yes'
3: else
4:   'no'
5: end
6: puts result

I think set_trace_func is giving me one “line” event for line 1, then
line 2, then line 6. However, according to the actual AST, the
assignment is on line 1, and necessarily must occur after executing
line 2. This is what I mean by saying that they don’t map perfectly. For
my experiment, this mapping is what the majority of the code deals with.
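
One way to check which lines actually produce a "line" event is to record them. The sketch below does that with TracePoint instead of set_trace_func (an assumption that Ruby 2.3 or later is available), evaluating the snippet under the made-up file name snippet.rb so that the reported numbers line up with the listing above. The names snippet, lines and recorder are hypothetical, and the exact set of events may differ between Ruby versions, so no output is claimed here.

```ruby
# The snippet from the post, kept as a string so its line numbers are
# independent of where this example itself lives.
snippet = <<~RUBY
  result = if some_true_val
    'yes'
  else
    'no'
  end
  puts result
RUBY

some_true_val = true
lines = []

# Collect the line number of every "line" event raised inside the snippet.
recorder = TracePoint.new(:line) do |tp|
  lines << tp.lineno if tp.path == "snippet.rb"
end

recorder.enable { eval(snippet, binding, "snippet.rb", 1) }

p lines
```

Comparing the recorded list against the line numbers in the parsed AST shows where the two disagree, for example that the else branch on line 4 never appears when the condition is true.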

As for the actual UI, I’ve considered using MacRuby. But I’ve found that
MacRuby doesn’t support set_trace_func. So maybe this is the wrong route
to head down.

Anyway, if the OP isn’t listening, I don’t really have a specific
question for the mailing list. …Except maybe this:

Is anyone working on a “Light Table” for Ruby?

reginald_t · September 12, 2012, 11:01am

Thank you for the elaborate answer. I need to muse a bit about this.
Just one quick thought:

On Tue, Sep 11, 2012 at 10:22 PM, Jonathan T. wrote:

Also, a test (unit, functional, etc.) will print out a bad value that is
asserted. However, I specifically avoid assertions on intermediate
values which are implementation-dependent, because they make tests
brittle. But these intermediate values are often very useful in fixing
the final result. So I’d like to be able to easily see the
intermediate values without having to manually add and remove inspects
or step through the debugger every time.

At least that could be fixed with a customized version of assert
methods which would also pp a specific instance (or self as default)
when the assertion fails so you can see all the internal state.

Second thought: you could use set_trace_func and custom assert methods
in order to record method calls and state and only output it when the
assert fails.
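
A rough sketch of that second idea, under several assumptions: it uses TracePoint in place of set_trace_func, and assert_with_trace is a hypothetical helper that belongs to no test framework. It records method calls and return values while the asserted block runs, and prints the record only when the block's result is falsy.

```ruby
require "stringio"

# Hypothetical helper: runs the block under a TracePoint, logs Ruby
# method calls and returns, and prints the log only on failure.
def assert_with_trace(message = "assertion failed", out: $stderr)
  log = []
  trace = TracePoint.new(:call, :return) do |tp|
    entry = "#{tp.event} #{tp.defined_class}##{tp.method_id}"
    entry += " => #{tp.return_value.inspect}" if tp.event == :return
    log << entry
  end

  result = trace.enable { yield }

  unless result
    out.puts message
    log.each { |entry| out.puts "  #{entry}" }
  end
  result
end

# Example code under test.
def double(n)
  n * 2
end

failing = StringIO.new
assert_with_trace("expected double(2) to be 5", out: failing) { double(2) == 5 }
puts failing.string

passing = StringIO.new
assert_with_trace(out: passing) { double(2) == 4 }
```

The passing assertion writes nothing, so the intermediate values only show up when they are needed to explain a failure.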

Kind regards

robert

reginald_t · September 12, 2012, 11:12am

On Sep 11, 2012, at 13:22, Jonathan T. wrote:

Also, a test (unit, functional, etc.) will print out a bad value that is
asserted. However, I specifically avoid assertions on intermediate
values which are implementation-dependent, because they make tests
brittle. But these intermediate values are often very useful in fixing
the final result. So I’d like to be able to easily see the
intermediate values without having to manually add and remove inspects
or step through the debugger every time.

You make me want to add “should”-level assertions in minitest.