One-Liners (#113)

If you followed the solutions of this quiz you should have seen a little
bit of
everything. We saw clever algorithms, Ruby idioms, some golfing, and
even a
mistake or two. Follow along and I’ll give you the highlights.

The problems of adding commas to numbers, shuffling Arrays, and
resolving class
names were selected because I see them pretty regularly on Ruby T…
Because
of that, I figured most of us would know those idioms pretty well.
There were a
couple of surprises though, so I’m glad I decided to include them.

For adding commas to numbers, several of us used some variant of a
pretty famous
reverse(), use a regular expression, and reverse() trick. Here’s one
such
solution by Carl P.:

quiz.to_s.reverse.scan(/(?:\d*.)?\d{1,3}-?/).join(’,’).reverse

I’ve always liked this problem and this trick to solve it, because it
reminds me
of one of my favorite rules of computing: when you’re hopelessly stuck,
reverse
the data. I can’t remember who taught me that rule now and I have no
earthly
idea why it works, but it sure helps a lot.

Take this problem for example. You need to find numbers in groups of
three and
it’s natural to turn to regular expressions for this. If you attack the
data
head-on though, it’s a heck of a problem. The left-most digit group
might be
one, two, or three long, and you have to be aware of that decimal that
ends
processing. It’s a mess, but one call to reverse() cleans it right up.

Now the decimal will be before the section we want to work with and, as
we see
in Carl’s code, you can skip right over it. From there the digit groups
will
line up perfectly as long as you always try to greedily grab three or
less.
Carl’s code does this, picking the String apart with scan() and then
join()ing
the groups with added commas. I think it’s clever, elegant, and
Rubyish.

I’m serious about that reverse()ing the data trick too. Try it out next
time
you are struggling.

Shuffling Arrays surprised me. More that one person sent in:

quiz.sort{rand}

Yikes!

Does that even work? Let’s ask IRb:

quiz = (1…10).to_a
=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

quiz.sort{rand}
=> [10, 6, 1, 7, 3, 8, 5, 9, 4, 2]

quiz.sort{rand}
=> [10, 6, 1, 7, 3, 8, 5, 9, 4, 2]

quiz.sort{rand}
=> [10, 6, 1, 7, 3, 8, 5, 9, 4, 2]

quiz.sort{rand}
=> [10, 6, 1, 7, 3, 8, 5, 9, 4, 2]

quiz.sort{rand}
=> [10, 6, 1, 7, 3, 8, 5, 9, 4, 2]

That’s not looking too random to me.

Let’s think about this. What does the above code do. sort() compares
elements
of the Array, arranging them based on the returned result. We are
suppose to
return a result of -1, 0, or 1 to indicate how the elements compare.
However,
rand() returns a float between 0.0 and 1.0. Ruby considers anything
over 0.0 to
be the 1 response, so most of the rand calls give this. You can get a
0.0
result from time to time, but it will be a loner in a sea of 1s.

So what is the above code actually trying to do? It’s trying to compare
a
selection of random numbers and sort on that instead. Writing the
process out
longhand it is:

quiz.map { |e| [rand, e] }.sort.map { |arr| arr.last }

We change the elements into Arrays of random numbers and the element
itself. We
then sort() those. Arrays compare themselves element by element, so
they will
always start by comparing the random numbers. Then we just drop the
random
numbers back out of the equation.

Luckily, Ruby has a shorthand version of this process, known as the
Schwartzian
Transform:

quiz.sort_by { rand }

That’s the popular Ruby idiom for randomizing an Array. Make sure you
use
sort_by() instead of sort() when that’s your intent.

Resolving class names is a surprisingly complex issue. Classes can be
nested
inside other classes and modules, as the quiz example showed. Inside
that
nested scope we don’t need to use the full name of the constant either.
Finally, don’t forget things like autoload()ing and const_missing()
which
further complicate the issue.

I’ll probably get hate mail for this, but if you want to handle all of
those
cases with one easy bit of code I recommend this “cheating” solution
from
Phrogz:

eval(quiz)

This asks Ruby to lookup the constant(s) and she will always remember to
handle
all of the edge cases for us. I know we always say eval() is evil and
you
should avoid it, but this instance can be one of the exceptions, in my
opinion.
Of course, you must sanitize the input if you are taking it from a user
to
ensure they don’t sneak in any scary code, but even with that added
overhead
it’s still easier and more accurate than trying to do all the work
yourself.

If you just can’t get over the eval() call though, you can use something
like:

quiz.split("::").inject(Object) { |par, const| par.const_get(const) }

This doesn’t address all of the edge cases, but it often works as long
as you
are working with fully qualified names.

Skipping ahead a bit, let’s talk about the last three questions in the
quiz.
First, reading a random line from a file. This one is just a fun
algorithm
problem. Alex Y. offered this solution:

(a=quiz.readlines)[rand(a.size)]

That pulls all the lines into an Array and randomly selects one. You
can even
eliminate the assignment to the Array as Aleksandr Lossenko does:

quiz.readlines[rand(quiz.lineno)]

Same thing, but here we get the line count from the File object and thus
don’t
need a local variable.

The problem with both of these solutions is when we run them on a very
large
file. Slurping all of that data into memory may prove to be too much.

You could do it without slurping by reading the File twice. You could
read the
whole File to get a line count, choose a random line, then read back to
that
line. That’s too much busy work though.

There is an algorithm for reading the File just once and coming out of
it with a
random line. I sent the Ruby version of this algorithm in as my
solution:

quiz.inject { |choice, line| rand < 1/quiz.lineno.to_f ? line : choice
}

The trick is to select a line by random chance, based on the number of
lines
we’ve read so far. The first line we will select 100% of the time. 50%
of the
time we will then replace it with the second line. 33.3% of the time we
will
replace that choice with the third line. Etc. The end result will be
that we
have fairly selected a random line just by reading through the File
once.

The wondrous number problem was more an exploration of Ruby’s syntax
than
anything else. I used:

Hash.new { |h, n| n == 1 ? [1] : [n] + h[n % 2 == 0 ? n/2 : n*3+1]
}[quiz]

This is really an abuse of Ruby’s Hash syntax though. I don’t ever
actually
store the values in the Hash since that would be pointless for this
problem.
Instead I am using a Hash as a nested lambda(), clarified by this
translation
from Ken B.:

(h=lambda {|n| n==1 ? [1] : [n] + h[n%2 == 0 ? n/2 : n*3+1] })[quiz]

A solution to this problem probably belongs on more than one line though
as
these both feel like golfing to me. I liked this two step offering from
Aleksandr Lossenko:

a=[quiz]; a << (a.last%2==1 ? a.last*3+1 : a.last/2) while a.last!=1

The final question, about nested Hashes, is actually what inspired me to
make
this quiz. The question was raised recently on the Ruport mailing list
and it
took Gregory B. and myself working together a surprising amount of
time to
land on a solution just like this one from Carl P.:

quiz.reverse.inject { |mem, var| {var => mem} }

Those of you who figured that out quickly deserve a pat on the back.
You are
smarter than me.

Again we see my favorite trick of reverse()ing the data. This time
though, the
mental block for me was figuring out that it’s easier if you don’t
initialize
the inject() block. This causes inject() to start the mem variable as
the first
String in the Array, eliminating the lone-entry edge case. The problem
is
trivial from there, but that was a counter-intuitive leap for my little
brain.

I’ll leave you to glance over the other four problems on your own, but
do give
them a look. There was no shortage of great material this week.

My thanks to all of you who just couldn’t stop fiddling with these
problems,
generating great ideas all the while. Yes, I’m talking to you Robert
and Ken.

Tomorrow we’ve got a problem for all you Bingo nuts out there…

On Thu, 15 Feb 2007, Ruby Q. wrote:

quiz.sort{rand}

Does that even work? Let’s ask IRb:

quiz = (1…10).to_a
=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

quiz.sort{rand}
=> [10, 6, 1, 7, 3, 8, 5, 9, 4, 2]

quiz.sort{rand}
=> [10, 6, 1, 7, 3, 8, 5, 9, 4, 2]
[…]
So what is the above code actually trying to do? It’s trying to compare a
selection of random numbers and sort on that instead. Writing the process out
longhand it is:

quiz.map { |e| [rand, e] }.sort.map { |arr| arr.last }

Hmmm… Not being aware of sort_by before, why this complicated? What’s
wrong with

quiz.sort{rand-0.5}

?

(which puts your random numbers from their range of 0.0-1.0 to a new
range
of -0.5 and 0.5 – if you find it aesthetically more pleasing, you
could
of course use

quiz.sort{2*rand-1.0}

to make the range -1.0…1.0.

quiz.sort{rand-0.5}
=> [4, 2, 3, 9, 7, 8, 1, 6, 5, 10]

quiz.sort{rand-0.5}
=> [9, 6, 2, 5, 1, 4, 3, 8, 7, 10]

quiz.sort{rand-0.5}
=> [3, 8, 10, 6, 9, 5, 1, 7, 2, 4]

quiz.sort{rand-0.5}
=> [4, 7, 10, 1, 9, 5, 2, 8, 6, 3]

quiz.sort{rand-0.5}
=> [8, 5, 10, 9, 2, 1, 7, 3, 4, 6]

quiz.sort{rand-0.5}
=> [6, 9, 2, 8, 1, 7, 4, 10, 3, 5]

quiz.sort{rand-0.5}
=> [3, 4, 5, 10, 7, 1, 8, 6, 2, 9]

It does seem to find your “random” criteria, is shorter and uses less
calls…

On Feb 15, 2007, at 8:37 AM, Benedikt H. wrote:

=> [10, 6, 1, 7, 3, 8, 5, 9, 4, 2]
be the 1 response, so most of the rand calls give this. You can

Hmmm… Not being aware of sort_by before, why this complicated?
What’s wrong with

quiz.sort{rand-0.5}

?

In the above example I was trying to show what sort_by() does behind
the scenes. In truth, you should just be using the simple:

quiz.sort_by { rand }

quiz.sort{rand-0.5}
=> [6, 9, 2, 8, 1, 7, 4, 10, 3, 5]

quiz.sort{rand-0.5}
=> [3, 4, 5, 10, 7, 1, 8, 6, 2, 9]

It does seem to find your “random” criteria, is shorter and uses
less calls…

Actually it’s more calls.

You generate a random number for every comparison of elements and do
math on them each time you do (additional calls). The sort_by()
method (and my longhand version) generates just one set of random
numbers up front, one for each element.

Because of this, they are significantly faster on moderate-to-large
size datasets:

#!/usr/bin/env ruby -w

require “benchmark”

data = Array.new(100) { rand(100) }

TESTS = 10_000
Benchmark.bmbm do |results|
results.report(“sort:”) { TESTS.times { data.sort { rand - 0.5 } } }
results.report(“map->sort->map:”) do
TESTS.times do
data.map { |e| [rand, e] }.sort.map { |arr| arr.last }
end
end
results.report(“sort_by:”) { TESTS.times { data.sort_by { rand } } }
end

>> Rehearsal ---------------------------------------------------

>> sort: 6.290000 0.000000 6.290000 ( 6.315368)

>> map->sort->map: 2.800000 0.000000 2.800000 ( 2.800905)

>> sort_by: 1.490000 0.000000 1.490000 ( 1.485774)

>> ----------------------------------------- total: 10.580000sec

>>

>> user system total real

>> sort: 6.330000 0.000000 6.330000 ( 6.346076)

>> map->sort->map: 2.820000 0.000000 2.820000 ( 2.825896)

>> sort_by: 1.490000 0.000000 1.490000 ( 1.485223)

END

James Edward G. II

Benedikt H. schrieb:

=> [10, 6, 1, 7, 3, 8, 5, 9, 4, 2]

rand() returns a float between 0.0 and 1.0. Ruby considers anything

(which puts your random numbers from their range of 0.0-1.0 to a new
=> [9, 6, 2, 5, 1, 4, 3, 8, 7, 10]

It does seem to find your “random” criteria, is shorter and uses less
calls…

Apart from being slower, it is also not producing a random permutation.
The results are skewed due to the way the sorting works.
Try this:

counter = Hash.new(0)
ar = [1, 2, 3]
10000.times {counter[ar.sort{rand-0.5}] += 1}
p counter

On my system, this just produced

{[2, 3, 1]=>1200, [2, 1, 3]=>1230, [3, 2, 1]=>2530,
[1, 2, 3]=>2533, [3, 1, 2]=>1264, [1, 3, 2]=>1243}

HTH,

Michael


Michael U.
R&D Team
ISIS Information Systems Austria
tel: +43 2236 27551-542, fax: +43 2236 21081
e-mail: [email protected]
Visit our Website: www.isis-papyrus.com