Secret Agent 00111 (#94)

Mission Report

Once again Secret Agent 00111 has saved the world from certain doom. As
usual
the damage report was gratuitous: destroyed vehicles worth my salary
for the
next five years, the usual string of sexual harassment and unwanted
pregnancy
lawsuits we will need to address, and one missing flock of sheep
according to a
local farmer. (Don’t ask!) Full details to follow under separate
cover.

00111’s secret to success really was rather brilliant, though please
don’t let
on that we know that. The agent has more than enough ego as it is.
Recognizing
that Q was not going to come through this time, 00111 took the prototype
device
to a hacker operating under the alias Boris P… (We’re still trying
to link
Boris to a real identity, but we suspect him of numerous famous hacker
incidents.) This turned out to be the key choice for 00111.

I’m sure you recall the brutal demise of 00100 when forced to rely on a
Java
programmer’s abilities while under severe time constraints. Desperate
for
quicker results, 00111 gambled on a not-yet-mainstream community of Ruby
programmers. It turns out these Ruby guys are a hidden community of
elite
hackers who seem to do this kind of this for fun. It’s rather bizarre,
but the
results are undeniable.

Naturally, Q’s gizmo was utterly destroyed during 00111’s daring escape
through
the casino’s canal drainage system. However, we’re now monitoring all
Boris
Prinz communications and have managed to intercept an email message
where he is
bragging to the Ruby community about his success. The message contains
a
prototype of the code 00111 loaded into the gizmo. I’m going to examine
that
code in this next section so we have it on file.

Ruby Code Analysis

We will look at this code slightly out of order, so that we might better
understand the thinking of Boris in his design. First, here is the
heart of how
boris managed to keep the communications so short and to the point:

module SecretAgent00111CommunicationGizmo
  class UndefinedRLE < Exception
  end

  def self.rle events
    raise UndefinedRLE.new if events.size == 0
    rl = []
    truevals = 0
    events.each do |result|
      if result
        truevals += 1
      else
        rl << truevals
        truevals = 0
      end
    end
    raise UndefinedRLE.new if truevals != 0
    rl
  end
end

What you have here is a method that accepts a series of favorable or
unfavorable
events, represented by Ruby’s boolean values true and false. The method
returns
a Run Length Encoding for these events, counting the number of truths
that occur
in a row. In order for this to be possible for any size stream, a
trailing
false must be appended after each each series of truths.

Here’s a sample usage, so you can see what I mean here:

>> SecretAgent00111CommunicationGizmo.rle(
?>   [true] * 10 +    # 10 favorable events
?>   [false]     +    # the trailing false
?>   [false]     +    # 0 favorable events and trailing false
?>   [true] * 10 +    # 10 more favorable events
?>   [false]          # the trailing false
>> )
=> [10, 0, 10]

This method wasn’t used in the gizmo’s actual solution, though it helped
us make
sense of the process. Boriz claimed it was part of some “Test Driven
Development (TDD)” process. Since none of our programmers are familiar
with the
concept, we assume he made it up.

This leads us to the closely related unencode method:

module SecretAgent00111CommunicationGizmo
  def self.unrle rl
    events = []
    rl.each {|n| events += [true]*n + [false]}
    events
  end
end

Here we are examining the reverse operation, which unencodes the counts
and adds
back the trailing false markers.

The gizmo’s actual encoding process was driven by the following method:

require 'stringio'

module SecretAgent00111CommunicationGizmo
  def self.encode events, exponent
    io = StringIO.new
    e = Encoder.new(exponent, io)
    events.each do |result|
      e << result
    end
    e.finish
    io.string
  end
end

Here again we have the events and you can see that we also receive an
exponent,
which we will discuss in a bit. The events are fed to a custom Encoder
object
which encodes them and writes them to the specified IO parameter. A
StringIO
object is used as a stand-in IO here to collect the results in a Ruby
String.

The Encoder is the source of most of the magic. Here is that code:

module SecretAgent00111CommunicationGizmo
  class Encoder
    def initialize exponent, io
      @exponent = exponent
      @io = io
      @events = []
      @bits = ''
      @io << exponent.chr
    end

    def << result
      @events << result
      if result == false
        encode_events
      end
    end

    def encode_events
      n = @events.size - 1
      @events = []
      @bits << "1" * (n / 2**@exponent)
      @bits << "%0#{@exponent+1}B" % (n % 2**@exponent)
      flush_bytes
    end

    def flush_bytes
      # write every 8 bits to the stream:
      @bits.gsub!(/[01]{8}/) do |byte|
        @io << byte.to_i(2).chr
        ''
      end
    end

    def finish
      @events << false
      encode_events
      # fill up remaining byte with "1"s:
      rest = @bits.size % 8
      if rest != 0
        @bits << "1" * (8 - rest)
      end
      flush_bytes
    end
  end
end

The Encoder begins by setting up variables to hold the exponent, IO
stream,
events, and a bit encoding. You can see that it sends the exponent
through the
stream right away. That’s required by the encoding, which works out to
be this:

1.  A byte representing the exponent used to encode events
2.  One or more encoded numbers, each consisting of the following:
    a.  A group of one bits number / 2 ** exponent in length
    b.  A zero bit
    c.  Exponent bits representing number % 2 ** exponent
3.  Any one bits needed to pad the stream length to a multiple of eight

That list, really defines the rest of the methods we are looking at
here.
First, <<() is used to add events and it always triggers the call to
encode_events() when a false is encountered. That method turns the
event list
into the bits described by 2a, 2b, and 2c. encode_events() then
triggers a call
to flush_bytes(), which sends those bits to the stream whenever we have
at least
eight (so they can be converted into binary byte data). Finally,
finish()
handles step 3 by adding the extra one bits needed.

Let’s look at how that turns out when received at the other end. Again,
we have
a simple method driving the process:

require 'stringio'

module SecretAgent00111CommunicationGizmo
  def self.decode data
    events = Decoder.new(StringIO.new(data)).read
    events.pop # remove last false
    events
  end
end

This method wraps the stream in a custom Decoder object and reads the
encoded
events. The last false is removed, because it’s just the marker that
came with
the final series of true events, and the event list is returned.

The Decoder is the final piece of the puzzle:

module SecretAgent00111CommunicationGizmo
  class ExponentUndefined < Exception
  end

  class Decoder
    def bits(string)
      string.unpack("B*")[0]
    end

    def initialize io
      @io = io
      @exponent = nil
      @bits = ''
      @t = @n = nil
    end

    def exponent
      return @exponent if @exponent
      e = @io.getc
      if e
        @exponent = e
      else
        raise ExponentUndefined.new
      end
      @exponent
    end

    def add_events
      n = @n
      n += @t * 2**exponent if @t
      @t = @n = nil
      @events += SecretAgent00111CommunicationGizmo.unrle([n])
    end

    def decode_bits
      loop do
        found = false
        if @bits =~ /^(1+)0/
          @t = $1.size
          @bits.sub!(/^(1+)/, '')
          found = true
        end
        re = Regexp.new("^0([01]{#{exponent}})")
        if @bits =~ re
          @n = $1.to_i(2)
          @bits.sub!(re, '')
          add_events
          found = true
        end
        return unless found
      end
    end

    def read
      @events = []
      begin
        exponent
        data = @io.read
        @bits += bits(data)
        decode_bits
      rescue ExponentUndefined
        # exponent not available yet in stream, try again later
      end
      @events
    end
  end
end

Start with initialize(), which is a setup very similar to Encoder. The
@t and
@n variables are used to track the sequence of true events found and the
encoded
number itself. Next skip to read() which is the primary work method.
It begins
by using exponent() to find the magic number, if present. If the
exponent isn’t
yet available, read() returns zero events and the code can try the
stream again
later. Next, bits() is a trivial wrapper to unpack all of the bytes
(most
significant first) into a String of ones and zeros.

The real work is done in decode_bits(). This method loops over the bits
looking
first for the run of one bits and then for the exponent length
remainder. When
both of those are located, a call to add_events() rebuilds the original
encoded
number which unrle() reverts to a series of events.

Summary

As you can see, 00111 survived thanks the the cunning of the hacker
Boris P…
Had a less skilled coder been involved we might be inducting 01000 right
now.
Please send a care package to Boris’s team: Daniel M., Karl Czisch,
Kero,
and Pierre Barbier de Reuille. We own them all our gratitude.

Don’t let 00111 get too comfortable. We’ll be sending him back out on
assignment tomorrow morning, this time into the villainous Lisp domains
to
hijack their secrets…