Forum: Ruby Hangman (#130)

Announcement (2017-05-07): www.ruby-forum.com is now read-only since I unfortunately do not have the time to support and maintain the forum any more. Please see rubyonrails.org/community and ruby-lang.org/en/community for other Rails- und Ruby-related community platforms.
James G. (Guest)
on 2007-07-12 01:37
(Received via mailing list)
People wrote quite a bit of code for this quiz.  I'm guessing that's
because it
can be challenging to see where how your code is doing and to refine
that.

I'm going to show my own solution this time, not because it's any better
than
the others, but because I can talk you through my thinking in building
it.

The first thing in my code is a tiny library that handles dictionary
loading for
the guessing script and the test script.  I extracted this common
functionality
when I built my testing script:

  #!/usr/bin/env ruby -wKU

  WORDS_CASH_FILE = "words.cache"

  if File.exist? WORDS_CASH_FILE
    WORDS = File.open(WORDS_CASH_FILE) { |file| Marshal.load(file) }
  else
    WORDS = File.open( ARGV.find { |arg| arg =~ /\A[^-]/ } ||
                       "/usr/share/dict/words" ) do |dict|
      dict.inject(Hash.new) do |all, word|
        all.update(word.delete("^A-Za-z").downcase => true)
      end.keys.sort_by { |w| [w.length, w] }
    end
    File.open(WORDS_CASH_FILE, "w") { |file| Marshal.dump(WORDS, file) }
  end

There's nothing too fancy here.  Originally I just reread the dictionary
every
time, but that slowed down testing too much and I found caching it with
Marshal
helped.  I also started sorting the dictionary eventually to make it
easier for
when my testing was, "I've tried everything beneath four letters and I'm
in the
K's on those."

On to the guessing script.  Here's how that begins:

  #!/usr/bin/env ruby -wKU

  puts "One moment..."
  puts
  require "words"

  def frequency(words)
    freq = Hash.new(0)
    words.each do |word|
      word.split("").each { |letter| freq[letter] += word.count(letter)
}
    end
    freq
  end
  FREQ = frequency(WORDS).sort_by { |_, count| -count }.
                          map { |letter, _| letter }

  choices = WORDS
  guesses = Array.new

  # ...

This code prints a warning that loading the dictionary might take a
moment, then
requires the mini-library I just showed.  Once loaded, it defines a
method for
counting letter frequencies and runs it on the dictionary.

After that it initializes some state variables for the guessing.  The
script
decreases the word list as it works, so that is copied into a variable
to
reflect that it will change.  We then setup a way to keep track of our
guesses.

The next bit of code begins the main event loop:

  # ...

  loop do
    puts guesses.empty?                                       ?
         "Please enter a word pattern (_ _ _ _ for example):" :
         "Please update your pattern according to my guess "  +
         "(_ i _ _ for example):"
    $stdout.flush
    pattern = $stdin.gets.to_s.delete("^A-Za-z_")

    # ...

This code just asks the user for a pattern and retrieves what they give
us.  The
call to flush() is to support the test script, which we will see in a
bit.

    # ...

    bad_guesses = guesses - pattern.delete("_").split("")
    if bad_guesses.size > 5 and pattern.include? "_"
      puts "I'm out of guesses.  You win."
    elsif not pattern.include? "_"
      puts "I guessed your word.  Pretty smart, huh?"
    else
      choices = choices.grep(
                  bad_guesses.empty?             ?
                  /\A#{pattern.tr("_", ".")}\Z/i :
                  /\A(?!.*[#{bad_guesses.join}])#{pattern.tr("_",
".")}\Z/i
                )
      guess = frequency(choices).
                reject { |letter, _| guesses.include? letter }.
                sort_by { |letter, count| [-count, FREQ.index(letter)]
}.
                first.first rescue nil

      guesses << guess
      puts "I guess the letter '#{guess}'."
      puts
      next
    end

    # ...

This is the main body of this guessing script.  The first two branches
just
check for win and loss conditions, but the else branch is where the
logic is.

Before a selection is made, the dictionary is reduced by all words that
couldn't
possibly match.  After that, a frequency count is made for all remaining
possibilities.  Those possible letters are then sorted by that count and
how
common they are in this dictionary.  The letter that bubbles to the top
is our
choice.

This selection process works decently on larger words.  It really starts
to show
decent success rates at words four letters and longer.  Sadly, it fairs
poorly
with smaller words.

Once chosen a guess is printed for the user and the loop cycles to the
next
iteration.

When we fall through the above code we have reached an end condition.
The
following code sorts that out:

    # ...

    puts
    if ARGV.include? "--loop"
      choices = WORDS
      guesses = Array.new
    else
      break
    end
  end

Here I watch for a special command-line switch that puts the program
into a
continuous loop for testing.  This avoids reloading the dictionary and
thus is
faster.

I landed on my word choice strategy mainly by trial and error.  That was
made
possible by being able to run the algorithm over some words and watching
how
accurately it could pick them.  I did that by writing a simple script
that
pretend to be me.  It runs the game, feeds it words, and keeps score of
it's
progress.  Here's the start of that code:

  #!/usr/bin/env ruby -wKU

  require "words"

  results = Hash.new(0)
  at_exit do
    results[:total] = results[:right] + results[:wrong]
    puts
    puts   "Words:     #{results[:total]}"
    puts   "Guessed:   #{results[:right]}"
    puts   "Missed:    #{results[:wrong]}"
    printf "Accuracy:  %.2f%%\n", results[:right] / results[:total].to_f
* 100
    puts
  end
  trap("INT") { exit }

  # ...

Here I load the dictionary using the same code the guesser relies on.
Then I
build a Hash to hold guess counts and setup result printing for when the
program
exits.  This allows me to cancel test runs at anytime, but still see
results
thus far.  That was nice for spot checking results.  Sometimes I could
tell
right away that a change was better or worse.

Here's the actual testing code:

  # ...

  IO.popen( File.join(File.dirname(__FILE__), "hangman.rb --loop"),
            "r+" ) do |hangman|
    WORDS.each do |word|
      pattern = word.tr("a-z", "_")
      loop do
        input = String.new
        hangman.each do |line|
          input << line
          break if input =~ /^(?:I'm out|I guessed)|:\Z/
        end

        if input =~ /^I'm out/
          puts "It missed '#{word}'."
          results[:wrong] += 1
          break
        elsif input =~ /^I guessed/
          puts "It guessed '#{word}'."
          results[:right] += 1
          break
        elsif input =~ /^I guess the letter '(.)'/
          guess = $1
          word.split("").each_with_index do |letter, i|
            pattern[i, 1] = letter if letter == guess
          end
        end

        hangman.puts pattern
      end
    end
  end

This code just runs the guessing script just as a human would do.  It
feeds the
script words and watches the responses to see what happens.  You can
think of
this process as a poor man's expect.

Many of the other solutions achieved better results than my own, so be
sure to
look through them.  Some were quite long but most had decent
documentation about
their process.

My thanks to all who found the time to dig into this problem.

Tomorrow we have a classic computer science exercise and it's an easy
one, so
warm up your impress-people-with-a-concise-solution skills...
James G. (Guest)
on 2007-07-12 01:41
(Received via mailing list)
Sorry folks.  This was an early, unedited release I made by
accident.  I'll clean it up a bit on the Web site if you want to read
it there later.

On Jul 11, 2007, at 4:34 PM, Ruby Q. wrote:

> Tomorrow we have a classic computer science exercise and it's an
> easy one, so warm up your impress-people-with-a-concise-solution
> skills...

The "tomorrow" there refers to Friday, as usual.

James Edward G. II
This topic is locked and can not be replied to.