Short But Unique (#83)

The three rules of Ruby Q.:

  1. Please do not post any solutions or spoiler discussion for this quiz
     until 48 hours have passed from the time on this message.

  2. Support Ruby Q. by submitting ideas as often as you can:

http://www.rubyquiz.com/

  3. Enjoy!

Suggestion: A [QUIZ] in the subject of emails about the problem helps everyone
on Ruby T. follow the discussion. Please reply to the original quiz message,
if you can.

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

by Ryan W.

I use Eclipse (with RadRails!) and I have a bunch of files open in tabs. Once
enough files are open, Eclipse starts to truncate the names so that everything
fits. It truncates them from the right, which means that pretty soon I’m left
unable to tell which tab is “users_controller.rb” and which is
“users_controller_test.rb”, because they’re both truncated to
“users_control…”.

The quiz would be to develop an abbrev-like module that shortens a set of
strings so that they are all within a specified length, and all unique. You
shorten the strings by replacing a sequence of characters with an ellipsis
character [U+2026]. If you want it to be ASCII-only, use three periods instead,
but keep in mind that then you can only replace blocks of four or more
characters.
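
As a rough illustration of the replacement rule just described, here is a minimal sketch. The helper name `elide` and its parameters are hypothetical and not part of the quiz; it assumes a Ruby version whose strings are character-aware (1.9 or later).

```ruby
# Hypothetical helper, not part of the quiz: replace `count` characters of
# `str`, starting at index `start`, with an ellipsis marker.
def elide(str, start, count, ascii: false)
  mark = ascii ? "..." : "\u2026"
  # Replacing no more characters than the marker itself occupies would not
  # shorten the string, so such requests return an unchanged copy. For the
  # ASCII marker this means only runs of four or more characters qualify.
  return str.dup if count <= mark.length
  str[0, start] + mark + str[(start + count)..-1].to_s
end

puts elide("users_controller_test", 3, 14)               # use…test
puts elide("users_controller_test", 3, 14, ascii: true)  # use...test
puts elide("bacon", 1, 3, ascii: true)                   # bacon
```

The ASCII variant saves two characters fewer than the Unicode one for the same run, which is why the four-character minimum applies only there.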

It might look like this in operation:

    ['users_controller', 'users_controller_test',
     'account_controller', 'account_controller_test',
     'bacon'].compress(10)
    => ['users_c…', 'use…test', 'account…', 'acc…test', 'bacon']

There’s a lot of leeway to vary the algorithm for selecting which characters to
crop, so extra points go to schemes that yield more readable results.
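
Whatever scheme is chosen, the two hard requirements (within the length, all unique) can be checked mechanically. A minimal sketch follows; `valid_compression?` is a hypothetical name, and each ellipsis is counted as one character.

```ruby
# Hypothetical checker: true when `compressed` is an acceptable answer for
# `originals` at the given maximum length.
def valid_compression?(originals, compressed, max_length)
  compressed.length == originals.length &&
    compressed.all? { |s| s.length <= max_length } &&
    compressed.uniq.length == compressed.length
end

names  = %w[users_controller users_controller_test
            account_controller account_controller_test bacon]
result = ["users_c\u2026", "use\u2026test", "account\u2026", "acc\u2026test", "bacon"]

p valid_compression?(names, result, 10)                         # true
p valid_compression?(names, ["users_control\u2026"] * 5, 10)    # false
```

Readability, the part that earns the extra points, is not something this check can judge.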

Two things:

Are the entries in the array always unique?

Or do we have to be able to handle an array such as:

['users_controller', 'users_controller_test', 'account_controller',
'account_controller_test', 'bacon', 'users_controller_test']

Also, is the Unicode ellipsis counted as one or three characters?

-Gautam D.

On Jun 17, 2006, at 3:15 PM, Gautam D. wrote:

> Two things:
>
> Are the entries in the array always unique?

Let’s assume they are, sure.

> Also, is the Unicode ellipsis counted as one or three characters?

One, in my opinion.

James Edward G. II
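
For anyone trying this on a current Ruby (1.9 or later, UTF-8 strings), `String#length` already counts the ellipsis as one character, while its UTF-8 encoding takes three bytes. A quick check, offered only as an illustration:

```ruby
ellipsis = "\u2026"      # the single-character ellipsis, U+2026

puts ellipsis.length     # 1  (characters)
puts ellipsis.bytesize   # 3  (bytes in UTF-8)
puts "...".length        # 3  (the ASCII fallback)
```

Ruby 1.8, which was current when this quiz ran, counted bytes in `String#length`, so a solution written then may have had to special-case the ellipsis.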

My entry is simple. At first I was thinking of making it much more
complicated, using abbrev to get a human-readable abbreviated version
of each title, but that seemed to complicate things more than it helped.
So I just went for a simple algorithm. My solution basically consists
of taking a simple truncation of the file name; then, if that is
already taken, going to the end and shifting the ellipsis to the left
while revealing more of the last word, until the title no longer
matches one already in use. There is a very large possibility of
getting an infinite loop, and I have not tested it on many strings.
Another flaw is that if two strings are identical but no longer than
the value sent to the function, it will return both strings untouched,
since it does not touch any string whose length is less than or equal
to the length passed to it.

Gautam.


#!/usr/bin/env ruby -w

# This code is released under the GPL.

module GDCompress
  # Returns a new array in which every string longer than +size+ is cut
  # down to +size+ characters, the one-character ellipsis included.
  def compress(size)
    usedNameHash = Hash.new
    compressedTitleNames = Array.new
    for tabTitle in self
      newTabTitle = "" # start with empty string.
      if tabTitle.length > size
        caseValue = 0 # number of trailing characters kept after the ellipsis
        loop do
          headLength = size - (1 + caseValue)
          # Give up with a clear error once the tail has used up the whole
          # length budget and no unused title has been found.
          if headLength < 0
            raise ArgumentError,
                  "no unique #{size}-character title for #{tabTitle.inspect}"
          end
          newTabTitle = tabTitle[0, headLength] + "…" +
                        tabTitle[-caseValue, caseValue]
          #print "\t#{newTabTitle} is the new tabTitle for #{tabTitle}\n"
          caseValue = caseValue + 3
          break unless usedNameHash[newTabTitle]
        end
      else
        newTabTitle = tabTitle
      end
      usedNameHash[newTabTitle] = tabTitle
      compressedTitleNames << newTabTitle
    end
    compressedTitleNames
  end
end

class Array
  include GDCompress
  extend GDCompress
end

p ['users_controller', 'users_controller_test',
   'account_controller', 'account_controller_test',
   'bacon'].compress(10)
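
To make the search order described in this entry concrete, the sketch below lists the titles the loop would try for a single string: the ellipsis starts at the far right and moves left three characters at a time. `candidate_titles` is a hypothetical helper written for illustration, not part of the submission, and it assumes the title is longer than `size`.

```ruby
# Hypothetical helper: the candidate titles for one string, in the order
# they would be tried. Assumes title.length > size.
def candidate_titles(title, size, step = 3)
  (0...size).step(step).map do |tail|
    # size - 1 - tail leading characters, the ellipsis, then `tail`
    # trailing characters: always `size` characters in total.
    title[0, size - 1 - tail] + "\u2026" + title[title.length - tail, tail]
  end
end

candidate_titles("users_controller_test", 10).each { |t| puts t }
# users_con…
# users_…est
# use…r_test
# …ller_test
```

Because the list is finite, a search over it either finds an unused title or runs out of candidates; it runs out quickly when many long strings share both a prefix and a suffix.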



Here is my solution.
It tries to generate unambiguous abbreviations; if none exist, it uses
the least ambiguous one, and it always avoids using the same
abbreviation twice.
There’s also a readability heuristic built in: abbreviations that keep
many characters at the beginning, or that split the kept characters
equally between the beginning and ending parts, are considered the
most readable.

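class String
  # Keeps the first (total_length - end_length) characters and the last
  # end_length characters, joined by three periods.
  def compress(total_length, end_length)
    self[0...total_length - end_length] + '...' +
      self[length - end_length, end_length]
  end
end

class Array
  # Shortens, in place, every string longer than max_length - 3.
  def compress!(max_length)
    max_length = 4 if max_length < 4
    score = Hash.new(0) # how many strings can produce each candidate
    usable_length = max_length - 3 # room left after the three periods
    # Candidate tail lengths, preferred ones first: no tail at all, or a
    # tail of about half the usable length.
    order = (0...usable_length).sort_by { |len|
      [(len - usable_length.to_f / 2).abs, len].min }
    to_compress = select { |s| s.length > usable_length }
    to_compress.each { |s|
      order.each { |l| score[s.compress(usable_length, l)] += 1 } }
    to_compress.each { |s|
      # Pick the candidate shared with the fewest other strings.
      s.replace(order.map { |l| s.compress(usable_length, l) }.min { |a, b|
        score[a] <=> score[b] })
      score[s] += 100 # penalize reusing an abbreviation already handed out
    }
    self
  end
end

if __FILE__ == $0
  p ['users_controller', 'users_controller_test', 'account_controller',
     'account_controller_test', 'bacon'].compress!(10)
  p Array.new(10) { 'abcdefghijklmnopqrstuvwxyz' }.compress!(12)
  p ['aaaaaazbbbbb', 'aaaaaaybbbbb'].compress!(9)
end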
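
The readability preference in this solution comes from the sort key used to build `order`. The sketch below only prints that key for each tail length, assuming a `max_length` of 10 (so seven usable characters); it is an illustration, not part of the submission.

```ruby
usable_length = 7 # a max_length of 10 minus the three periods

# Pair each tail length with its sort key: the smaller of "distance from
# half the usable length" and "the tail length itself".
keys = (0...usable_length).map do |len|
  [len, [(len - usable_length / 2.0).abs, len].min]
end

keys.each { |len, key| puts "tail #{len}: key #{key}" }
# tail 0: key 0
# tail 1: key 1
# tail 2: key 1.5
# tail 3: key 0.5
# tail 4: key 0.5
# tail 5: key 1.5
# tail 6: key 2.5

order = keys.sort_by { |_, key| key }.map(&:first)
puts "first tried: tail #{order.first}, last tried: tail #{order.last}"
# first tried: tail 0, last tried: tail 6
```

So a plain truncation (tail 0) is tried first, then the near-even splits (tails 3 and 4), and the tail-heavy split (tail 6) last. Tails with equal keys may come out in either order, since `sort_by` is not guaranteed to be stable.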