B & e (#72)

bbazzarrakk · March 30, 2006, 6:46pm

This quiz turned out to be surprisingly tough. Ross B. seemed to
get the
tighter solution and it’s not as big a difference as you might think.
Here’s
some output from running Ross’s solution:

 ### 4 digits, [1,2,3] stops ###
      user     system      total        real
seq.     1.000000   0.020000   1.020000 (  1.021674)
Search space exhausted.
Tested 10000 of 10000 codes in 50000 keystrokes.
6146 codes were tested more than once.

seq/chk  0.790000   0.000000   0.790000 (  0.804365)
Search space exhausted.
Tested 10000 of 10000 codes in 33395 keystrokes.
2557 codes were tested more than once.

simple   2.840000   0.040000   2.880000 (  2.930310)
Search space exhausted.
Tested 10000 of 10000 codes in 33830 keystrokes.
2721 codes were tested more than once.

best   665.050000   5.390000 670.440000 (678.840230)
Search space exhausted.
Tested 10000 of 10000 codes in 28687 keystrokes.
464 codes were tested more than once.

Compare that with the trivial tests in the quiz code:

Search space exhausted.
Tested 10000 of 10000 codes in 33635 keystrokes.
2664 codes were tested more than once.

At “best”, Ross managed to trim 4,948 keystrokes and there was a hefty
time
penalty to pay for getting that far.

Let’s work through the code that amounts to the “best” search. First we
need to
examine a helper class it makes use of:

# This was originally a delegator, but that was *slow*
class PoolStr < String
  def initialize(obj, parent = nil)
    super(obj)
    @parent = parent
  end
  def suffixes(maxlen = nil)
    @suffixes ||= get_suffixes(maxlen).map do |s|
      PoolStr.new(s,self)
    end
  end
  def prefixes(maxlen = nil)
    @prefixes ||= get_suffixes(maxlen,reverse).map do |s|
      PoolStr.new(s.reverse,self)
    end
  end
  private
  def get_suffixes(maxlen = nil, str = self)
    start = maxlen ? str.length-maxlen : 1
    (start..(str.length-1)).map do |i|
      suf = str[i..-1]
      suf
    end
  end
end

There’s nothing too tricky hiding in here. Basically we have a String,
that can
give us an Array of suffixes or prefixes as needed. Those are both
found with
some simple index work in get_suffixes(). To handle prefixes, the
String is
reversed before it goes in, and the suffixes are reversed as they come
out to
transform them into prefixes. Note that both of these are cached the
first time
they are figured, to speed up future calls. (I’m not sure what @parent
is for
here. It doesn’t seem to be used.)

Next we need to look at one more short helper, which turns out to be the
heart
of the “seq.” search from the results:

# Make our codes. Using random stop digits gives a more
# compressible string (less stop digits = longer string).
def mkpool(digits, stops)
  (("0"*digits)..("9"*digits)).inject([]) do |ary,e|
    ary << PoolStr.new("#{e}#{stops[rand(stops.length)] if stops}",

nil)
end
end

Again, nothing tricky here. Assuming a parameter of 4 for digits, this
builds
an Array of the codes from “0000” to “9999”, with a random stop digit at
the end
of each one.

Now we are ready for the main method of the “best” search:

# A more involved way, a simplified greedy heuristic
# that takes _forever_ but gives (slightly) better results.
def best_code(digits, stops = [1,2,3])
  out = ""
  pool = mkpool(digits, stops)
  best = []
  bestcode = nil

  # Keep going until it's empty - if ever we can't find a match
  # we'll just take one at random.
  until pool.empty?
    unless bestcode
      # first iteration, just grab a start and carry on
      bestscore = 0
      bestcode = pool[rand(pool.length)]
    else
      # Get the array of arrays of best matches for the previous code.
      # This array matches suffixes to best-fit prefixes for
      # the previously-added code to find the most mergeable code
      # (i.e. the one that shares most of it's prefix with the end
      # of the output string).
      #
      # This contains at a given index all the codes that share that
      # many letters of pre/suffix with 'need'. Eh? Well, it's like

this:
#
# best for “1234” => [ nil, [“4321”, “4048”], [“3412”],
[“2348”]]
#
# Then, (one of) the highest scoring code(s) is taken from
# the top of the last nested array, best[best.length-1].
#
# We do it this way, rather than reversing the array, because
# we need to know what the actual score was, to merge the
# strings. Running on each iteration helps a bit
# with performance, since as time goes on the number of
# elements decreases.
best.clear
pool.each do |nxt|
next if nxt == bestcode
if r = (bestcode.suffixes & nxt.prefixes).first
(best[r.length] ||= []) << nxt
end
end

      bestcode = (best[bestscore = best.length - 1] ||

EMPTYARRAY).first

      # No best code? Nothing with matching pre/suff, so let's just

grab
# a code at random instead to keep things moving.
unless bestcode
bestscore = 0
bestcode = pool[rand(pool.length)]
end
end

    # Remove from the pool. Bit slow...
    pool[pool.index(bestcode),1] = nil

    # Chop off matched prefix from this code and append it
    merged = bestcode[bestscore..-1]
    out << merged
  end
  out
end

Here’s the beast, but the comments are good and they will get us through
it.
First you can see that the method makes a pool of all the codes, using
the
mkpool() method we just examined. Now anytime this method doesn’t have
a
bestcode, like right at the beginning here, it just pulls one at random
from the
pool. (You can see the code for this in the first unless clause and at
the end
of the big else clause.)

The real work is done right after that big comment. The pool is walked
comparing prefixes of possible codes with the suffixes of the bestcode
(our last
find). The matching groups are placed in an Array where the index is
their
score or how many digits they share. The code then just chooses a match
with
the highest score so it shares the most possible digits. (The
EMPTYARRAY
constant used here is defined elsewhere in the code as EMPTYARRAY = [].)

From there the code merges the suffix of the selected code (we already
have the
matching prefix from the last code), and adds it to the digits to enter.
It
then loops until the pool is exhausted.

The actual solution is just one line from that:

best_code(n).split(//).each { |c| a.press c.to_i }

You saw the results of that at the beginning of this summary. You would
have to
examine usage for this code carefully though, to determine if shaving
around
5,000 keystrokes is worth the unfortunately large speed penalty. It’s
all about
tradeoffs.

A huge thank you to the quiz creator for a challenging little problem
and the
two brave souls who were equal to the task.

Tomorrow’s quiz is still up in the air! We’re trying to sneak in a
research
project we can all help with, if we can get it ready in time…

bbazzarrakk · March 31, 2006, 9:07am

On Fri, 2006-03-31 at 01:43 +0900, Ruby Q. wrote:

This quiz turned out to be surprisingly tough. Ross B. seemed to get the
tighter solution and it’s not as big a difference as you might think.

Thanks both James and Stephen for a good quiz, this one really got me
thinking I knew my solution was slow, and at least I can now prove it

check out Timothy B.'s solution (which I think just missed the
summary, unfortunately) - it kicks my solution’s ass on performance and
comes up with similar (even slightly shorter) codes. Good work, Tim!

bbazzarrakk · March 31, 2006, 3:55pm

On Mar 31, 2006, at 1:05 AM, Ross B. wrote:

check out Timothy B.'s solution (which I think just missed the
summary, unfortunately)

Yes, I was done with the summary when this came in. It is very
nice though.

James Edward G. II

bbazzarrakk · March 31, 2006, 6:25pm

On Mar 31, 2006, at 5:54 AM, James Edward G. II wrote:

It is very nice though.

Indeed, as were all of the solutions. Thanks everyone for
participating!

–Steve