Hangman (#130)


#1

People wrote quite a bit of code for this quiz. I’m guessing that’s
because it
can be challenging to see where how your code is doing and to refine
that.

I’m going to show my own solution this time, not because it’s any better
than
the others, but because I can talk you through my thinking in building
it.

The first thing in my code is a tiny library that handles dictionary
loading for
the guessing script and the test script. I extracted this common
functionality
when I built my testing script:

#!/usr/bin/env ruby -wKU

WORDS_CASH_FILE = “words.cache”

if File.exist? WORDS_CASH_FILE
WORDS = File.open(WORDS_CASH_FILE) { |file| Marshal.load(file) }
else
WORDS = File.open( ARGV.find { |arg| arg =~ /\A[^-]/ } ||
“/usr/share/dict/words” ) do |dict|
dict.inject(Hash.new) do |all, word|
all.update(word.delete("^A-Za-z").downcase => true)
end.keys.sort_by { |w| [w.length, w] }
end
File.open(WORDS_CASH_FILE, “w”) { |file| Marshal.dump(WORDS, file) }
end

There’s nothing too fancy here. Originally I just reread the dictionary
every
time, but that slowed down testing too much and I found caching it with
Marshal
helped. I also started sorting the dictionary eventually to make it
easier for
when my testing was, “I’ve tried everything beneath four letters and I’m
in the
K’s on those.”

On to the guessing script. Here’s how that begins:

#!/usr/bin/env ruby -wKU

puts “One moment…”
puts
require “words”

def frequency(words)
freq = Hash.new(0)
words.each do |word|
word.split("").each { |letter| freq[letter] += word.count(letter)
}
end
freq
end
FREQ = frequency(WORDS).sort_by { |_, count| -count }.
map { |letter, _| letter }

choices = WORDS
guesses = Array.new

This code prints a warning that loading the dictionary might take a
moment, then
requires the mini-library I just showed. Once loaded, it defines a
method for
counting letter frequencies and runs it on the dictionary.

After that it initializes some state variables for the guessing. The
script
decreases the word list as it works, so that is copied into a variable
to
reflect that it will change. We then setup a way to keep track of our
guesses.

The next bit of code begins the main event loop:

loop do
puts guesses.empty? ?
“Please enter a word pattern (_ _ _ _ for example):” :
“Please update your pattern according to my guess " +
“(_ i _ _ for example):”
$stdout.flush
pattern = $stdin.gets.to_s.delete(”^A-Za-z_")

# ...

This code just asks the user for a pattern and retrieves what they give
us. The
call to flush() is to support the test script, which we will see in a
bit.

# ...

bad_guesses = guesses - pattern.delete("_").split("")
if bad_guesses.size > 5 and pattern.include? "_"
  puts "I'm out of guesses.  You win."
elsif not pattern.include? "_"
  puts "I guessed your word.  Pretty smart, huh?"
else
  choices = choices.grep(
              bad_guesses.empty?             ?
              /\A#{pattern.tr("_", ".")}\Z/i :
              /\A(?!.*[#{bad_guesses.join}])#{pattern.tr("_", 

“.”)}\Z/i
)
guess = frequency(choices).
reject { |letter, _| guesses.include? letter }.
sort_by { |letter, count| [-count, FREQ.index(letter)]
}.
first.first rescue nil

  guesses << guess
  puts "I guess the letter '#{guess}'."
  puts
  next
end

# ...

This is the main body of this guessing script. The first two branches
just
check for win and loss conditions, but the else branch is where the
logic is.

Before a selection is made, the dictionary is reduced by all words that
couldn’t
possibly match. After that, a frequency count is made for all remaining
possibilities. Those possible letters are then sorted by that count and
how
common they are in this dictionary. The letter that bubbles to the top
is our
choice.

This selection process works decently on larger words. It really starts
to show
decent success rates at words four letters and longer. Sadly, it fairs
poorly
with smaller words.

Once chosen a guess is printed for the user and the loop cycles to the
next
iteration.

When we fall through the above code we have reached an end condition.
The
following code sorts that out:

# ...

puts
if ARGV.include? "--loop"
  choices = WORDS
  guesses = Array.new
else
  break
end

end

Here I watch for a special command-line switch that puts the program
into a
continuous loop for testing. This avoids reloading the dictionary and
thus is
faster.

I landed on my word choice strategy mainly by trial and error. That was
made
possible by being able to run the algorithm over some words and watching
how
accurately it could pick them. I did that by writing a simple script
that
pretend to be me. It runs the game, feeds it words, and keeps score of
it’s
progress. Here’s the start of that code:

#!/usr/bin/env ruby -wKU

require “words”

results = Hash.new(0)
at_exit do
results[:total] = results[:right] + results[:wrong]
puts
puts “Words: #{results[:total]}”
puts “Guessed: #{results[:right]}”
puts “Missed: #{results[:wrong]}”
printf “Accuracy: %.2f%%\n”, results[:right] / results[:total].to_f

  • 100
    puts
    end
    trap(“INT”) { exit }

Here I load the dictionary using the same code the guesser relies on.
Then I
build a Hash to hold guess counts and setup result printing for when the
program
exits. This allows me to cancel test runs at anytime, but still see
results
thus far. That was nice for spot checking results. Sometimes I could
tell
right away that a change was better or worse.

Here’s the actual testing code:

IO.popen( File.join(File.dirname(FILE), “hangman.rb --loop”),
“r+” ) do |hangman|
WORDS.each do |word|
pattern = word.tr(“a-z”, “_”)
loop do
input = String.new
hangman.each do |line|
input << line
break if input =~ /^(?:I’m out|I guessed)|:\Z/
end

    if input =~ /^I'm out/
      puts "It missed '#{word}'."
      results[:wrong] += 1
      break
    elsif input =~ /^I guessed/
      puts "It guessed '#{word}'."
      results[:right] += 1
      break
    elsif input =~ /^I guess the letter '(.)'/
      guess = $1
      word.split("").each_with_index do |letter, i|
        pattern[i, 1] = letter if letter == guess
      end
    end

    hangman.puts pattern
  end
end

end

This code just runs the guessing script just as a human would do. It
feeds the
script words and watches the responses to see what happens. You can
think of
this process as a poor man’s expect.

Many of the other solutions achieved better results than my own, so be
sure to
look through them. Some were quite long but most had decent
documentation about
their process.

My thanks to all who found the time to dig into this problem.

Tomorrow we have a classic computer science exercise and it’s an easy
one, so
warm up your impress-people-with-a-concise-solution skills…


#2

Sorry folks. This was an early, unedited release I made by
accident. I’ll clean it up a bit on the Web site if you want to read
it there later.

On Jul 11, 2007, at 4:34 PM, Ruby Q. wrote:

Tomorrow we have a classic computer science exercise and it’s an
easy one, so warm up your impress-people-with-a-concise-solution
skills…

The “tomorrow” there refers to Friday, as usual.

James Edward G. II