Literate Ruby (#102)

We definitely have to do more quizzes where the nature of the problem
encourages submitters to summarize their own solution! Multiple
submitters did just that, so I recommend taking the time to read through
the submission emails if you haven’t already.

Before we get to the solutions, let me make sure everyone knows about
the feature similar to this quiz already baked into Ruby. You can often
use the -x switch to execute code buried inside of normal content, like
an email message. Here’s an example:

$ ruby -x fake_email.txt
This is for running Ruby code inside other text!

The code is assumed to start at the Shebang line and end at __END__.
$ cat fake_email.txt
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim
ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut
aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit
in voluptate velit esse cillum dolore eu fugiat nulla pariatur.
Excepteur sint occaecat cupidatat non proident, sunt in culpa qui
officia deserunt mollit anim id est laborum.

#!/usr/bin/env ruby -w

puts <<END_OUTPUT
This is for running Ruby code inside other text!

The code is assumed to start at the Shebang line and end at __END__.
END_OUTPUT

__END__

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim
ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut
aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit
in voluptate velit esse cillum dolore eu fugiat nulla pariatur.
Excepteur sint occaecat cupidatat non proident, sunt in culpa qui
officia deserunt mollit anim id est laborum.
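
As a rough mental model of what -x does with a file like this, here is a
small sketch in plain Ruby. It is only an approximation for
illustration: the helper name extract_embedded_script is made up, and
the real switch is implemented inside the interpreter, not like this.

```ruby
require "stringio"

# Approximate model of `ruby -x` (hypothetical helper, not Ruby's own code):
# skip everything before the first "#!" line that mentions "ruby",
# then keep every line up to (but not including) __END__.
def extract_embedded_script(io)
  script = []
  in_script = false
  io.each_line do |line|
    if !in_script
      in_script = true if line.start_with?("#!") && line.include?("ruby")
    elsif line.chomp == "__END__"
      break
    else
      script << line
    end
  end
  script.join
end

message = StringIO.new(<<~MAIL)
  Some chatter before the code.

  #!/usr/bin/env ruby -w
  puts "Hello from inside the message!"
  __END__
  Some chatter after the code.
MAIL

puts extract_embedded_script(message)
```

The sketch makes the limitation easy to see: there is exactly one
contiguous region of code, and everything around it is ignored.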

Of course, the -x switch is not perfect. It does not handle documents
that slowly build up code as they discuss it. For that, we will need to
go to the solutions.

The parser for this quiz isn’t overly complex to write. Most people used
a couple of regular expressions to locate the code. Here is one such
parser from Cameron Pope:

class LRB
  def parse(io)
    current_state = :in_text
    io.each_line do |line|
      if current_state == :in_text
        case line
        when /^>\s?(.*)/ then yield :code, $1 + "\n" if block_given?
        when /\\begin\{.*\}\s*.*/ then current_state = :in_code
        else yield :text, line if block_given?
        end
      else
        case line
          when /\\end\{.*\}\s*.*/ then current_state = :in_text
          else yield :code, line if block_given?
        end
      end
    end
  end
end # class LRB

This parser walks the passed IO object line by line. Each line of
content is yielded to the provided block along with a type identifier.
The parser begins by assuming the content it is reading is :text, and it
yields lines with that type. However, if a line begins with the email
quote marker (>), that line is yielded with a :code type, minus the
marker and one optional space after it. When a LaTeX style marker is
found (\begin{code}), the parser switches modes to assume all following
lines are now code, until it encounters the matching marker
(\end{code}). The marker lines themselves are not yielded. Note that the
patterns accept any name between the braces, not just code.
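
To make the stream of events concrete, here is a small usage sketch. The
parser is restated inside the block so that it runs on its own, written
with `then` for current Ruby and without the block_given? guards; the
lrb_events helper and the sample document are invented for the
illustration.

```ruby
require "stringio"

# The parser described above, restated so this example is self-contained.
class LRB
  def parse(io)
    current_state = :in_text
    io.each_line do |line|
      if current_state == :in_text
        case line
        when /^>\s?(.*)/          then yield :code, $1 + "\n"
        when /\\begin\{.*\}\s*.*/ then current_state = :in_code
        else yield :text, line
        end
      else
        case line
        when /\\end\{.*\}\s*.*/ then current_state = :in_text
        else yield :code, line
        end
      end
    end
  end
end

# Hypothetical helper: collect the [type, line] pairs the parser yields.
def lrb_events(source)
  events = []
  LRB.new.parse(StringIO.new(source)) { |type, line| events << [type, line] }
  events
end

sample = <<~'DOC'
  Some prose.
  > puts 1
  \begin{code}
  puts 2
  \end{code}
  More prose.
DOC

lrb_events(sample).each { |type, line| puts "#{type.inspect} #{line.inspect}" }
```

Both styles of code marking end up as plain :code events, which is what
lets the output methods ignore how the code was marked in the source.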

With a parser in place, the interesting element becomes the supported
forms of output. Here are those methods from Cameron’s code:

require 'rubygems'
require 'bluecloth'
class LRB
  def self.to_code(io)
    code = String.new
    LRB.new.parse(io) do |type, line|
      code << line if type == :code
    end
    return code
  end
  def self.to_markdown(io)
    doc = String.new
    LRB.new.parse(io) do |type, line|
      case type
      when :code then doc << "    " << line
      when :text then doc << line
      end
    end
    return doc
  end
  def self.to_html(io)
    markdown = self.to_markdown io
    doc = BlueCloth::new markdown
    doc.to_html
  end
end # class LRB

The to_code() method is the most basic. It just uses the parser to walk
the document content, accumulating all of the code it finds along the
way. In the end, it returns the collected code.

The to_markdown() method is similar, but it collects text and code. Text
is added normally, but code is indented four spaces to match the rules
of Markdown. The resulting Markdown content is returned.
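
The indentation step is easy to see in isolation. The sketch below
applies the same rule to a hand-written list of [type, line] pairs, so
it needs neither the parser nor BlueCloth. The helper name
events_to_markdown is made up, and unlike the case statement above it
treats every non-code pair as text.

```ruby
# Hypothetical helper: render [type, line] pairs the way to_markdown() does,
# indenting code lines by four spaces and passing other lines through.
def events_to_markdown(events)
  events.map { |type, line| type == :code ? "    " + line : line }.join
end

events = [
  [:text, "Add two numbers:\n"],
  [:text, "\n"],
  [:code, "puts 1 + 2\n"],
]

puts events_to_markdown(events)
```

One thing to keep in mind when writing documents for this kind of
converter: Markdown generally wants a blank line between a paragraph and
an indented code block, so the source text should leave one before each
run of code.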

From there, to_html() is trivial. The document is converted to Markdown
using to_markdown() and then handed off to BlueCloth for translation.

The Markdown option is a great fit here, since it was designed with
human readability in mind. The whole point of Literate Programming is to
write about code, and we obviously want people to read what we write, so
that’s a good match.

All we have left is Cameron’s interface code:

if $0 == __FILE__
  opt = ARGV.shift
  file = ARGV.shift
  case opt
    when '-c' then puts LRB::to_code(File.new(file))
    when '-t' then puts LRB::to_markdown(File.new(file))
    when '-h' then puts LRB::to_html(File.new(file))
    when '-e' then eval LRB::to_code(File.new(file))
    else
      usage = <<"ENDING"
Usage:
  lrb.rb [option] [file]

Options:
  -c: extract code
  -t: extract text documentation
  -h: extract html documentation
  -e: evaluate as Ruby program
ENDING
    puts usage
  end
end

Here you see a simple set of four supported options. The first three are
the basic conversions we just examined. The fourth option also pulls the
code, but it eval()s it, instead of printing the results.
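
One optional refinement for the eval() route, offered as a suggestion
and not as part of Cameron’s solution: Kernel#eval accepts a binding, a
file name, and a starting line number, so errors raised by the extracted
code can be labeled with the document they came from. The run_extracted
helper and the sample.lrb label below are invented for the sketch.

```ruby
# Hypothetical helper: evaluate extracted code at the top level and label
# it with the source document's name so backtraces mention that name.
def run_extracted(code, label)
  eval(code, TOPLEVEL_BINDING, label, 1)
end

puts run_extracted("6 * 7\n", "sample.lrb")

begin
  run_extracted("raise 'boom'\n", "sample.lrb")
rescue RuntimeError => e
  # The first backtrace entry now names sample.lrb instead of "(eval)".
  puts e.backtrace.first
end
```

The reported line numbers count lines of the extracted code, not lines
of the original document, so they only line up when the document is all
code.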

Another step a couple of the solutions took was to enhance require() to
locate .lrb files. Here’s an example of how this is accomplished, by
Vincent Fourmond:

# Here, we hack our way through require so that we can include
# .lrb files and understand them as literate ruby.
module Kernel
  alias :old_kernel_require :require
  undef :require
  def require(file)
    # if file doesn't have an extension, we look for it
    # as a .lrb file.
    if file =~ /\.[^\/]*$/
      old_kernel_require(file)
    else
      found = false
      for path in ($:).map {|x| File.join(x, file + ".lrb") }
        if File.readable?(path)
          found = true
          RWeb::run_code(RWeb::unliterate_file(path).first,
                         self.send(:binding))
          break
        end
      end
      old_kernel_require(file) unless found
    end
  end
end

The comments explain the process pretty well here. The idea is to check
for a .lrb file in Ruby’s load path, for any require() without an
extension. If such a file is found, it’s loaded via Vincent’s RWeb
Literate Ruby processor. If not found or if the file had an extension,
it is passed through to Ruby’s own require() for traditional handling.
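
The same lookup can be tried out without RWeb. The sketch below is a
simplified stand-in, not Vincent’s code: it uses a separately named
helper (literate_require, a made-up name) instead of replacing
require(), treats only quoted lines (>) as code, and remembers which
files it has already loaded. It also checks for an extension with
File.extname where the original uses a regular expression.

```ruby
require "tmpdir"

# Paths of .lrb files that have already been evaluated.
LOADED_LRB = {}

# Hypothetical helper: keep only the ">" quoted lines, minus the marker.
def lrb_code(source)
  source.lines.grep(/^>/).map { |line| line.sub(/^> ?/, "") }.join
end

# Hypothetical helper: look for NAME.lrb on the load path when NAME has no
# extension, and fall back to the normal require() otherwise.
def literate_require(name)
  return require(name) unless File.extname(name).empty?

  $LOAD_PATH.each do |dir|
    path = File.join(dir, name + ".lrb")
    next unless File.readable?(path)
    return false if LOADED_LRB[path] # already loaded, as require() reports
    LOADED_LRB[path] = true
    eval(lrb_code(File.read(path)), TOPLEVEL_BINDING, path)
    return true
  end
  require(name)
end

Dir.mktmpdir do |dir|
  File.write(File.join(dir, "greeting.lrb"), <<~DOC)
    This file defines a greeting.
    > def lrb_greeting
    >   "hello from literate Ruby"
    > end
  DOC
  $LOAD_PATH.unshift(dir)
  literate_require("greeting")
  puts lrb_greeting
  $LOAD_PATH.delete(dir)
end
```

Keeping the helper separate from require() avoids interfering with
other code that also wraps require(), at the cost of having to call it
explicitly.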

My thanks to all the people who unknowingly helped me design the
quiz/summary format and parser for Ruby Q. 2.0. As always, the solutions
introduced great new tricks I never would have thought of.

Tomorrow we will tackle a question commonly asked on Ruby T. in the
hopes that we can answer it once and for all…