Goedel (#147)

This quiz is really just an optimization problem. It’s pretty trivial
to do a
conversion to and from a Goedel number. See Eric L.'s
under-30-lines
solution for a great example of this. The challenge arises when the
message
gets big enough that finding all of the factors takes significant time.

Eric I. figured out how to cut quite a few corners on the decoding
process, so I
want to take a look at his code.

Before we get into the actual encoding and decoding process though,
let’s talk a
bit about primes. Obviously, we need a source of prime numbers to do
our work.
Ruby does ship with a standard mathn library that includes a Prime
class. Many
solutions did put that library to good use. There are two downsides to
that
approach though: mathn is a pure Ruby library and the Prime class
implementation in Ruby 1.8 is not very clever. Both of these slow us
down.

To get around that, Eric built a drop-in replacement for the Prime class
that
cheats. It simply reads the numbers from a huge list you can download,
skipping
any calculation effort. This turns out to be faster for our needs.
Here’s the
code:

Generates a stream of prime numbers as they’re read from a sequence

of files with names such as “primes1.txt”, “primes2.txt”, and so

forth. Such files can be downloaded from:

http://primes.utm.edu/lists/small/millions/

class Prime
def initialize
@current_file = 0
@io = open_next_file
@current_primes = []
@current_index = 0
end

def next
  load_next_primes until value = @current_primes[@current_index]
  @current_index += 1
  value
end

private

def load_next_primes
  while true
    while line = @io.gets
      if line =~ /^\s*\d+(\s+\d+)*\s*$/
        @current_primes = line.split.map { |e| e.to_i }
        @current_index = 0
        return
      end
    end
    @io.close
    open_next_file
  end
end

def open_next_file
  @current_file += 1
  filename = "primes%d.txt" % @current_file
  begin
    @io = open(filename)
  rescue
    raise "ran out of primes because couldn't open file \"%s\"" %
      filename
  end
end

end

As you can see, this is simple stuff. The class just opens a file
called
primes1.txt when initialize()d (see open_next_file()). As needed, lines
are
read from this file, split() and converted into Integers, and tucked
away inside
an Array (see load_next_primes()). Primes are then just handed out from
this
Array (see next()) and when the supply is exhausted new lines are read.
When we
run out of lines, the code will move on to a primes2.txt file.

The site linked to in the comment has the first 15 million primes
available in
files like this. That more than covers the needs of this code, so this
turns
out to be a simple but effective cheat to save time.

With a zippy Prime class defined, we are ready to get down to the real
work:

require ‘primes’ # or the standard mathn library

Put the coder in a separate class, so we have the potential to use

other coders, such as the one from the Starburst novel.

class RubyQuizCoder
def encode(char)
char[0] + 1
end

def decode(number)
  (number - 1).chr
end

def max_code
  127
end

end

Here we see the require for the code that we just examined. Note that
this code
will work fine with a mathn require as well though.

The class defined here is the simple encoding described in the quiz. As
the
comment indicates, pulling this code into the class makes it easy to
swap out
with other encoding schemes.

The work horse methods for the solution are encode() and decode(), of
course.
Here’s the easy one:

def encode(input, primes, coder)
goedel_value = 1

input.each_line do |line|
  0.upto(line.size - 1) do |i|
    char = line[i, 1]
    encoding = coder.encode char
    next if encoding.nil?  # skip characters without encoding
    goedel_value *= primes.next ** encoding
  end
end

puts goedel_value

end

The code works its way line by line and character by character through
the
input. Each character is encoded using the RubyQuizCoder class we saw
earlier,
used as an exponent for a prime based on its position, and finally
multiplied
into the overall Goedel value. When all of the characters have been
dealt with,
the overall value is printed as a result.

The reverse operation is harder to digest, because it’s where the
optimizations
are hiding:

Attempt to decode quickly by trying to perfectly divide by

prime**(26), prime(25), prime(24), …, prime(2**0) and

then adding the powers of 2 for which the division worked without a

remainder. For example, if a number were divisible by prime**101,

then it’s also divisible by prime64 * prime32 * prime**4 *

prime**1 since 64 + 32 + 4 + 1 = 101. So, we’ll have to divide the

large number exactly 7 times per prime no matter what the exponent.

Note: 7 assumes that the encoding results in no value greater than

127.

def decode(input, primes, coder)
goedel_value = input.gets.to_i
max_two_expnt = (Math.log(coder.max_code) / Math.log(2)).to_i
factors = (0…max_two_expnt).map { |i| [2**i, nil] }

while goedel_value > 1
  current_prime = primes.next
  encoded = 0

  factors[0][1] = current_prime
  (1..max_two_expnt).each do |i|
    factors[i][1] = factors[i - 1][1] ** 2
  end

  factors.reverse_each do |expnt, factor|
    quotient, remainder = goedel_value.divmod(factor)
    if remainder == 0
      encoded += expnt
      goedel_value = quotient
    end
  end

  char = coder.decode(encoded)
  putc char unless char.nil?
end

end

The biggest trick in here is the use of factorization to narrow down the
divisions needed. Until the overall value hits one, each prime is
pulled in
turn and factored into the possible divisors. Each of those numbers is
then
tried in reverse order. Those that divide evenly are added to the
encoded
character count and drop the overall count accordingly. After all of
the
factors have been tried, the character count is passed through our
RubyQuizCoder
object and the resulting character is printed.

This strategy results in a constant number of divisions for each
character and
in most cases, those divisions should be significantly less than the
brute force
approach.

The rest of the code just provides an interface to these routines:

def usage
STDERR.puts “Usage: %s -e[ncode]|-d[ecode] [file]” % $0
exit 1
end

process command-line args and figure out which method to call

task = nil
input = nil
ARGV.each do |arg|
case arg
when /^-+e/ : task = :encode
when /^-+d/ : task = :decode
else if input : usage
else input = open(arg)
end
end
end

input = STDIN if input.nil?
primes = Prime.new
coder = RubyQuizCoder.new

case task
when :encode : encode(input, primes, coder)
when :decode : decode(input, primes, coder)
else usage
end

This is just some basic argument parsing code. It hunts for a -e or -d
switch
to figure out if we are encoding or decoding. It also opens a file of
input for
the first non-switch argument or defaults to STDIN. Failing to select a
mode or
providing multiple input parameters triggers the usage message and an
exit()
call. Otherwise, the selected routine is called with the input, a prime
generator, and the coder.

356592611993533159357704171943707065506245018107843654420869995255779400
940327637098004636805658940369248254005741705095861927174094085632357462
907907043655503052325570584135928519171701028451925211424184061134912519
346213124822184698766352106671144335797309339814960788404357329829880747
370816749488671200451708065007982023372864606432786154466561150882383337
665739304198857199016703678239557259404453555953430721034149637848167371
696741427087919115033228735590823604859393603457456530272005167282069777
767057670186595718538513692305202904620672162211361327059254731120977063
134449493970743739621694338383775782689258152219852727482221819916188854
728922544056620806588370831318213885828889837200379775053853865140225412
662845287202726609004691108889366849057859320331241672049374448503639685
391414041859040062271216363732269502663137649053220830210776147948360695
030849925289805667675539825659032822205396840604174306471211962857755270
446984071785349996985198175348758225276208258987173267939423506126205306
366658339817220773597781191289736402029252868902666231417359731588146165
666320778680777303480755853750057521054497117887073508511901428397946093
449376833443738252027373639560059173037124844900113551258953525303486104
043722150604562552368672756555770801627883622940792114968157184742775843
032899157541783062241525730231786371568328435169052449375068965888786768
418148739617709373626622156165087821359521473101982925802046545744126160
160591686651704019368238740807898239650093187094408149981270808624486401
576666800438277327766454315294642692798530476745070050857038213692995319
240862517113963583832414299967798488646237585020392202957001143544708578
006399968115609032521439107622793677586436726234995703756213467526362754
053665223526567818472971465689078159087061330179433352620768373562074252
272173230499663950656487104976363578669988246793426315517241324669161225
95456614469899787029059315850805248000000000000000000000000000000000

Tomorrow we will …

On Nov 29, 2007, at 9:25 AM, Ruby Q. wrote:

Tomorrow we will …

Oops, forgot to fill that in. I guess it’s really a surprise this time…

James Edward G. II