Forum: Ruby Word Blender (#108)

Announcement (2017-05-07): www.ruby-forum.com is now read-only since I unfortunately do not have the time to support and maintain the forum any more. Please see rubyonrails.org/community and ruby-lang.org/en/community for other Rails- und Ruby-related community platforms.
James G. (Guest)
on 2007-01-11 15:42
(Received via mailing list)
I'm almost embarrassed to admit that I originally turned this quiz down.
I
thought it would be too similar to the Scrabble Stems problem we did
long, long
ago.  Ben politely explained that he felt it was different enough
though, and
then sent some guy named Guido after me in a parking lot one night.
That
convinced me to actually work the problem, and I had a change of heart.
I think
we can tell from the popularity of the problem that Ben is smarter than
I am, so
I'm glad I did.

Many solvers chose to build the entire game, so we will take a look at
that
approach.  Though the quiz doesn't explicitly call for it, most
programmers
chose to add scoring or a time limit to their solution to spice up the
action.
I went with the time limit and we will examine my own code below.

The first step is to get a word list, of course.  There were several
approaches
to this process, since manipulating every word in the dictionary each
time could
get a little slow.  My answer to this was just to cache the word list
after I
had built it once and reuse that for all future runs.  Here's the code
that
handles the loading and caching:

  # game date cache
  CACHE_FILE = ".game_words"

  if File.exist? CACHE_FILE  # load from cache
    word_list = File.open(CACHE_FILE) { |file| Marshal.load(file) }
  else                       # build word list
    # prepare data structure
    words_by_signature = Hash.new { |words, sig| words[sig] = Array.new
}

    # read dictionary
    File.foreach(ARGV.shift || "/usr/share/dict/words") do |word|
      word.downcase!
      word.delete!("^a-z")

      next unless word.length.between? 3, 6

      (words_by_signature[word.split("").sort.join] << word).uniq!
    end

    # prepare recursive signature search
    def choices( sig,
                 seen = Hash.new { |all, cur| all[cur] = true; false },
                 &blk )
      sig.length.times do |i|
        shorter = sig[0...i] + sig[(i+1)...sig.length]
        unless seen[shorter]
          blk[shorter]
          choices(shorter, seen, &blk) unless shorter.length == 3
        end
      end
    end

    # prepare game data structure
    word_list = Hash.new

    # build game choices
    words_by_signature.keys.grep(/\A.{6}\Z/) do |possible|
      word_list[possible] = words_by_signature[possible]

      choices(possible) do |shorter_signature|
        if words_by_signature.include? shorter_signature
          word_list[possible].push(*words_by_signature[shorter_signature])
        end
      end
    end

    # cache for faster loads
    File.open(CACHE_FILE, "w") { |file| Marshal.dump(word_list, file) }
  end

  # ...

This uses Marshal to build a trivial word cache with only a few lines of
code.
If the cache exists, we slurp it back in.  Otherwise we build the cache
and save
it for future runs.

To build the cache, I pluck all three to six letter words out of the
indicated
dictionary file and build a word list containing all six letter words
linked to
smaller words using the same letters.

The main trick used in this recursive grouping of words is the use of
"signatures."  A word's signature is just the sorted order of the
letters in the
word:  aejms for james, for example.  Comparing signatures makes it
trivial to
find words that use the same letters, since their signatures will be the
same.

Using the signatures, the choices() method just removes one character at
a time
recursing through the word list.  This allows me to find all of the
smaller
words that can be formed using the same letters.

If you wanted to generate the entire list of games as the quiz
suggested, the
above is all you need.  Each key is one possible round with the values
being the
words that can be matched in that round.

I wanted to play though and built a full game interface.  My interface
requires
Unix, because those are the tricks I know.  Here's the start of that
code:

  # ...

  require "io/wait"

  ### game interface (requires Unix) ###
  TERMINAL_STATE = `stty -g`
  system "stty raw -echo cbreak"
  at_exit { system "stty #{TERMINAL_STATE}" }
  clear = `clear`

  # a raw mode savvy puts
  def out(*args) print(*(args + ["\r\n"])) end

  # for easy selection
  words = word_list.keys

  # ...

This setup code memorizes the original state of the user's terminal,
modifies
that state to raw mode so I can read individual characters as they are
pressed,
arranges to have the terminal settings restored at exit, grabs the
escape
sequence we can use to clear the terminal, and builds a puts() like
method that
works with raw mode.  This code doesn't really have much to do with
Ruby.  I'm
just shelling out to standard Unix utilities here.

We're now ready for the game event loop:

  # ...

  rounds = 0
  loop do
    # select letters
    letters = current = words[rand(words.size)]
    while word_list.include? letters
      letters = letters.split("").sort_by { rand }.join
    end
    letters.gsub!(/.(?=.)/, '\0 ')

    # round data
    advance       = false
    matches       = Array.new
    current_match = String.new
    start         = Time.now
    message       = nil
    last_update   = start - 1

    # ...

I begin by selecting a word for the round and scrambling that word's
letters
until they are no longer a real word.  Then I setup some variables to
hold data
for the round like whether or not the user has found a six letter word
and
should advance as well as any feedback message I am currently showing
the user
and the last time I refreshed the screen.

Now we start the round event loop:

    # ...

    # round event loop
    until Time.now >= start + 2 * 60
      # game display
      if last_update <= Time.now - 1
        print clear

        out "Your letters:  #{letters}"
        out "   Time left:  #{120 - (Time.now - start).round} seconds"
        out "  Your words:  #{matches.join(', ')}"
        out
        unless message.nil?
          out message
          out
        end
        print current_match
        $stdout.flush

        last_update = Time.now
      end

      # ...

The round event loop runs for two minutes and this first bit is
responsible for
drawing the screen.  After clearing the screen, it prints the letters,
time
left, words found, and any feedback message to the screen.  Note that I
update
the screen each second, assuming no other activity, so the user will
notice the
clock counting down.

Here's the input processing:

      # ...

      # input handler
      if $stdin.ready?
        char = $stdin.getc
        case char
        when ?a..?z, ?A..?Z  # read input
          current_match << char.chr.downcase
          message       =  nil
          last_update   =  start - 1
        when ?\b, 127        # backspace/delete
          current_match = current_match[0..-2]
          message       =  nil
          last_update   =  start - 1
        when ?\r, ?\n        # test entered word
          if word_list[current].include? current_match
            matches << current_match
            matches = matches.sort_by { |word| [word.size, word] }
            if not advance and current_match.length == 6
              advance = true
              message = "You will advance to the next round!"
            else
              message = nil
            end
          else
            message = "Unknown word."
          end
          current_match = String.new
          last_update   = start - 1
        end
      end
    end

    # ...

This just checks to see if there is data waiting on STDIN.  We don't
want to
read from it without checking as that could block the application
waiting for
input.  The ready?() method used here is added by the io/wait library
and will
return true if there is data waiting.  The rest of the code just handles
the
input we get.  Letters are recorded, we support backspace, and a
carriage-return
tells us to try the current word, set some feedback, and refresh the
display.

When the round is done, we finish out the game loop:

    # ...

    # round results
    print clear
    missed = word_list[current] - matches
    unless missed.empty?
      out "Other words using \"#{letters}:\""
      out missed.sort_by { |word| [word.size, word] }.join(", ")
      out
    end
    if advance
      rounds += 1

      out "You made #{matches.size} word#{'s' if matches.size != 1}, ",
          "including at least one six letter word.  Nice work."
      out "Press any key to begin the next round."

      $stdin.getc
    else
      out "You made #{matches.size} word#{'s' if matches.size != 1}, ",
          "but failed to find a six letter word."

      break  # end game
    end
  end

  # game results
  out "You completed #{rounds} round#{'s' if rounds != 1}.  ",
      "Thanks for playing."

The above code just prints missed words and results.  If you found a six
letter
word, the code will loop to a new round.  Otherwise it will break out of
the
program.

A big thank you to Ben (and Guido!) for convincing me to try the quiz
and to all
those that took the time to play with it.

Tomorrow we'll try a problem that has been making the rounds,
literally...
Martin DeMello (Guest)
on 2007-01-11 15:54
(Received via mailing list)
On 1/11/07, Ruby Q. <removed_email_address@domain.invalid> wrote:
>
> The main trick used in this recursive grouping of words is the use of
> "signatures."  A word's signature is just the sorted order of the letters in the
> word:  aejms for james, for example.  Comparing signatures makes it trivial to
> find words that use the same letters, since their signatures will be the same.
>
> Using the signatures, the choices() method just removes one character at a time
> recursing through the word list.  This allows me to find all of the smaller
> words that can be formed using the same letters.

I used a different signature method - I mapped each unique letter in
the target word to a prime, and then used the product of the primes of
all the letters in a word as its signature. That way, a is contained
in b if signature(b) % signature(a) == 0, and you can generate the
signatures via each_byte, array lookup and integer multiplication (no
need for split, sort or string deletion).

martin
James G. (Guest)
on 2007-01-11 15:59
(Received via mailing list)
On Jan 11, 2007, at 7:54 AM, Martin DeMello wrote:

>> Using the signatures, the choices() method just removes one
> need for split, sort or string deletion).
Very interesting.  I've never seen that before.  I like it.

My version of "signatures" comes out of Programming Peals, if my
memory is right.  Just FYI.

James Edward G. II
Ben B. (Guest)
on 2007-01-18 01:02
(Received via mailing list)
On Thu, Jan 11, 2007, Ruby Q. wrote:
> I'm almost embarrassed to admit that I originally turned this quiz down.  I
> thought it would be too similar to the Scrabble Stems problem we did long, long
> ago.  Ben politely explained that he felt it was different enough though, and
> then sent some guy named Guido after me in a parking lot one night.  That
> convinced me to actually work the problem, and I had a change of heart.  I think
> we can tell from the popularity of the problem that Ben is smarter than I am, so
> I'm glad I did.

Coupla things:

1) Gianni, not Guido.  I can understand the confusion, though, as it was
   dark and probably hurt a bunch.

2) I didn't manage to write a solution to the quiz, despite the fact
that
   I had at least a month's time more than most everyone else, so none
of
   this smarter business. :)

> I wanted to play though and built a full game interface.  My interface requires
> Unix, because those are the tricks I know.  Here's the start of that code:

<snip>

> This setup code memorizes the original state of the user's terminal, modifies
> that state to raw mode so I can read individual characters as they are pressed,
> arranges to have the terminal settings restored at exit, grabs the escape
> sequence we can use to clear the terminal, and builds a puts() like method that
> works with raw mode.  This code doesn't really have much to do with Ruby.  I'm
> just shelling out to standard Unix utilities here.

<snip awesomeness>

Further proof of #2 above.

Ben
Ben B. (Guest)
on 2007-01-19 17:30
(Received via mailing list)
On Thu, Jan 11, 2007, Martin DeMello wrote:
> I used a different signature method - I mapped each unique letter in
> the target word to a prime, and then used the product of the primes of
> all the letters in a word as its signature. That way, a is contained
> in b if signature(b) % signature(a) == 0, and you can generate the
> signatures via each_byte, array lookup and integer multiplication (no
> need for split, sort or string deletion).

Holy crap.  That is incredibly cool.  Thanks for sharing this technique!

Ben
Martin DeMello (Guest)
on 2007-01-19 17:31
(Received via mailing list)
On 1/11/07, James Edward G. II <removed_email_address@domain.invalid> wrote:
> On Jan 11, 2007, at 7:54 AM, Martin DeMello wrote:
> > I used a different signature method - I mapped each unique letter in
> > the target word to a prime, and then used the product of the primes of
> > all the letters in a word as its signature. That way, a is contained
> > in b if signature(b) % signature(a) == 0, and you can generate the
> > signatures via each_byte, array lookup and integer multiplication (no
> > need for split, sort or string deletion).
>
> Very interesting.  I've never seen that before.  I like it.

The numbers overflow 32 bits in the general case, I think, but for
this restricted problem it works very nicely.

martin
Fedor L. (Guest)
on 2007-09-26 00:40
(Received via mailing list)
On 1/11/07, Martin DeMello <removed_email_address@domain.invalid> wrote:
> > Very interesting.  I've never seen that before.  I like it.
>
> The numbers overflow 32 bits in the general case, I think, but for
> this restricted problem it works very nicely.
>
> martin
>
>
I believe with Ruby they'll get converted to Bignum automatically with
no
overflow, though I bet the running time suffers.
Phrogz (Guest)
on 2007-09-26 00:45
(Received via mailing list)
Martin DeMello wrote:
>
> The numbers overflow 32 bits in the general case, I think, but for
> this restricted problem it works very nicely.

Specifically (I was wondering) you can use 9 primes and still be under
32 bits, 15 primes and still be under 64 bits.

module Enumerable
  def product; inject(){ |n,p| p*n }; end
end

first9 = primes[0..8]
puts "First 9 primes:\n%s\nproduct: %d (%d bits)" %
  [
    first9.inspect,
    first9.product,
    first9.product.to_s(2).length
  ]

first15 = primes[0..14]
puts "\nFirst 15 primes:\n%s\nproduct: %d (%d bits)" %
  [
    first15.inspect,
    first15.product,
    first15.product.to_s(2).length
  ]

#=> First 9 primes:
#=> [2, 3, 5, 7, 11, 13, 17, 19, 23]
#=> product: 223092870 (28 bits)

#=> First 15 primes:
#=> [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
#=> product: 614889782588491410 (60 bits)

25 primes puts you at 121 bits; 26 primes at 128 bits on the nose.
This topic is locked and can not be replied to.