Just got the O’Reilly announcement of the book “Beautiful
Code”… here is the sample chapter by Tim B. on Ruby!
Pity he didn’t run his examples by this forum first, I’m sure we could
have made them more beautiful…
Well, maybe for the 2nd edition, heres my bash at cleaning it up…
For example scanning a log file…
EXAMPLE 4-2 . Printing article names
1 ARGF.each_line do |line|
2 if line =~ %r{GET /ongoing/When/\d\d\dx/(\d\d\d\d/\d\d/\d\d/[^ .]+)
}
3 puts $1
4 end
5 end
Since most Linux distro’s roll the log files to a reasonable size I
would have just gone with…
ARGF.read.scan( %r{GET /ongoing/When/\d\d\dx/(\d\d\d\d/\d\d/\d\d/[^ .]+)
}) do |match|
puts $1
end
In Example 4.3 he has…
1 counts = {}
2 counts.default = 0
Personally I prefer…
counts = Hash.new(0)
or
counts = Hash.new{|hash,key| hash[key]=0}
buts that’s personal preference I guess.
In example 4.4 he rightly identifies line 10 as a little ugly and
bemoans the lack of sort_by_value in hash.
10 keys_by_count = counts.keys.sort { |a, b| counts[b] <=> counts[a]
}
it seems he doesn’t know about…
count.keys.sort_by{|key| count[key]}
Of course he could have done…
class Hash
def sort_keys_by_value
keys.sort_by{|key| fetch(key)}
end
end
Example 4.5 looks like a poster child for the Hash.new block approach…
4 @hash = {}
.
.
9 if @hash[s[0]]
10 @hash[s[0]] << [ s[1], article ]
11 else
12 @hash[s[0]] = [ s[1], article ]
13 end
Replace that with…
4 @hash = Hash.new{|hash,key| hash[key] = []}
10 @hash[s[0]] << [ s[1], article ]
That makes it real pretty code.
In fact, I can’t find any pretty reason to have the class BigHash at
all…looks like his Java roots are showing through.
Personally I’d just drop it.
Partly the point of the exercise here is it is slow because it is so
big, a couple of things I wonder about…
For example I would time how long it takes to just read the file and do
nothing…
He mentions the program grew to 1.56gb
Well, he is creating an Array per log entry.
Maybe if he created two hashes instead of an array per entry.
Incidently, I wouldn’t have chosen that chunk of Java for illustrating
the binary search…
glibc’s implementation is just so clean…
/* Perform a binary search for KEY in BASE which has NMEMB elements
of SIZE bytes each. The comparisons are done by (*COMPAR)(). */
void *
bsearch (const void *key, const void *base, size_t nmemb, size_t size,
int (*compar) (const void *, const void *))
{
size_t l, u, idx;
const void *p;
int comparison;
l = 0;
u = nmemb;
while (l < u)
{
idx = (l + u) / 2;
p = (void *) (((const char *) base) + (idx * size));
comparison = (*compar) (key, p);
if (comparison < 0)
u = idx;
else if (comparison > 0)
l = idx + 1;
else
return (void *) p;
}
return NULL;
}
On the other hand both have the classic bug that (high + low) or (l+u)
overflows on large arrays…
John C. Phone : (64)(3) 358 6639
Tait Electronics Fax : (64)(3) 359 4632
PO Box 1645 Christchurch Email : [email protected]
New Zealand