Word Loop (#149)

There were two kinds of solvers this week: hardcore programmers who
love a good
challenge and cheaters like me. Both approaches are neat, but I want to
focus
on the under appreciated cheater this time. Contrary to the best
practices for
gamblers, cheater is a wonderful label for a programmer to have.

Cheaters always find the system and make it work for them. So what’s
the system
this time? Have another look at one of the quiz examples:

i
p
p
Mis
ss
si

Okay, the first priority is to find a possible loop. We need a repeated
letter
with some letters between the two occurrences, obviously. How many
letters
between though? Well, the minimum loop is:

*.

The next smallest loop is what we see in Mississippi:

*.

That’s three and five and we can already see that each size must add two
more
letters. So, we need an odd number of letters and at least three of
those. Now
we can find loops.

Once we see the loop, it becomes clear that the loop divides the word
into
pieces. Let me call them out:

^ > = before the loop characters
^ * = the repeated letter
^ . = loop characters

*. ^ = after the loop characters

Seeing these makes a cheater wonder, can I output each part naturally
down the
lines? To do that, we would first need to spit out those after the loop
characters, in reverse order. That shouldn’t be tough since we already
know how
to find a loop, but notice that each of those is also indented. It
turns out
that they are just indented by the length of characters before the loop,
so
that’s trivial to deal with.

Now we would need the get those before the loop characters in the
output. No
problem there, it’s a simple print statement.

The loop is the trickiest part. First, we see that the repeated letter
and the
first loop character can be printed along with the before characters.
The rest
of the characters have an indent, but it’s the same thing we figured out
earlier
and we can deal with that. Then outputting the remaining characters of
the loop
can be as simple as printing two columns: one pulling letters off the
back of
the loop and the other pulling letters off the front.

Wow, cheaters work harder than you think, eh? If we can do all of those
pieces,
we can print the word loop as we go. The reason I like trying an
approach like
this is that each step is pretty simple and easy to understand. We had
to do a
bit of thinking that the computer probably could have done for us, but
then you
have to be smart enough to tell the computer how to do the thinking.

Let’s move the ideas into code at this point. We will examine Ken
Bloom’s
solution. Ken is a cheater and a better one than I am at that! I’ll
tell you
what Ken taught me shortly, but for first let’s see the code. Brace
yourself
because I’m giving you the whole thing in one shot:

def loopword word
matchinfo=
word.match(/(.?)(.)(.(?:…)+?)\2(.)/i)
if matchinfo
_,before,letter,looplets,after=matchinfo.to_a
pad=" "*before.size
after.reverse.split(//).each{|l| puts pad+l}
looplets=looplets.split(//)
puts before+letter+looplets.shift
until looplets.empty?
puts pad+looplets.pop+looplets.shift
end
else
puts “No loop.”
end
end

Isn’t it great to be a cheater?

Let’s break it down. Probably the hardest part to understand is the
very first
step. Ken hits the word with a tricky Regexp to figure out if it has a
loop in
it. That’s right, you can count an odd number of at least three
characters with
a regular expression. Let’s examine the pieces:

/ (.?) # characters before the loop, if any
(.) # the first appearance of our repeated character
(.(?:…)+?) # an odd number of at least three loop characters
\2 # the repeat of our repeated character
(.
) # characters after the loop, if any
/ix # make the expression case insensitive

There are two good tricks in there. First, matching at least three odd
characters is shown to be just any character followed by one or more
groups of
two characters. That’s handy to remember.

The other trick is using a back-reference to catch the repeated
character. What
I didn’t know about this trick though was that Ruby’s regular expression
engine
is smarter than I gave it credit for. I knew this would match:

“-i—i-”[/(\w).+\1/]
=> “i—i”

However, I didn’t know the following would work if you made the
expression case
insensitive:

“-I—i-”[/(\w).+\1/i]
=> “I—i”

I spent too much effort in my code to get a match to work on the word
Markham
when I could have just used the /i switch. Thanks for the tip, Ken.

Once we have tried the expression, the if statement checks to see if we
found a
loop. If we didn’t the else clause can print our error message and
we’re done.

When we did find a match, Ken starts by pulling out each capture of the
expression into easy to manage variables. Here’s my chance to teach Ken
a new
trick though. See how he created an unused variable (_) to capture the
match as
a whole? It’s not really needed if you switch the method call on the
MatchData
object:

md = “Mississippi”.match(/(.?)(.)(.(?:…)+?)\2(.)/i)
=> #MatchData:0x58a188

md.to_a
=> [“Mississippi”, “M”, “i”, “ssiss”, “ppi”]

md.captures
=> [“M”, “i”, “ssiss”, “ppi”]

Let’s get back to the code. Here it is again to save scrolling:

def loopword word
matchinfo=
word.match(/(.?)(.)(.(?:…)+?)\2(.)/i)
if matchinfo
_,before,letter,looplets,after=matchinfo.to_a
pad=" "*before.size
after.reverse.split(//).each{|l| puts pad+l}
looplets=looplets.split(//)
puts before+letter+looplets.shift
until looplets.empty?
puts pad+looplets.pop+looplets.shift
end
else
puts “No loop.”
end
end

The rest of the code prints the word as I described earlier. First, a
variable
is set to that indent we will need in multiple places. The next line
reverse()s
and split()s the after loop characters, printing each one with the
indent.
After that, Ken breaks up the loop characters into an Array for easy
removal at
both ends. The next line prints the before the loop characters, the
repeat, and
the first loop character. Finally, the until loop handles the remaining
loop
character two at a time, one from each end. That process prints the
entire word
and makes a complete solution.

The rest of Ken’s code just called the method on a set of sample words:

loopword “Mississippi”
puts
loopword “Markham”
puts
loopword “yummy”
puts
loopword “Dana”
puts
loopword “Organization”

The non-cheating solutions are also very interesting. They decided that
this
problem wasn’t hardcore enough for them and they could maximize the fun
by
trying to create multiple loops and reuse as many characters as
possible. It’s
great code that leads to insane output, so be sure to check those out as
well.

My thanks to cheaters and hardcore solvers alike. You always give me
more
interesting material than I can even talk about.

Tomorrow we will tackle a typical programming task in a completely
non-typical
manner…