I translated a method to do this from PHP earlier this year:
http://tinyurl.com/q8hlg [Google G.]
Here’s a simpler version (hard-coded for UTF-8; it would need some
tweaking for other encodings). It has a side effect of transliterating
punctuation to ASCII as well, which may or may not be desirable.
Paul
$KCODE = 'u'
require 'iconv'

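class String
  # Replaces each non-ASCII character with its ASCII transliteration.
  def strip_diacritics
    self.gsub(/[^\x20-\x7f]/) {
      # Some iconv implementations emit the accent as a separate leading
      # character (e.g. 'e for é); drop it when a letter follows.
      Iconv.iconv('us-ascii//IGNORE//TRANSLIT', 'utf-8',
                  $&)[0].sub(/^[\^`'"~](?=[a-z])/i, '')
    }
  end
end
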
require 'test/unit'

class TestStripDiacritics < Test::Unit::TestCase
  def test_upper_case
    assert_equal('AAAAA', 'ÀÁÂÃÄ'.strip_diacritics)
    assert_equal('EEEE', 'ÈÉÊË'.strip_diacritics)
    assert_equal('IIII', 'ÌÍÎÏ'.strip_diacritics)
    assert_equal('OOOOO', 'ÒÓÔÕÖ'.strip_diacritics)
    assert_equal('UUUU', 'ÙÚÛÜ'.strip_diacritics)
    assert_equal('Y', 'Ý'.strip_diacritics)
    assert_equal('N', 'Ñ'.strip_diacritics)
  end

  def test_lower_case
    assert_equal('aaaaa', 'âãäàá'.strip_diacritics)
    assert_equal('eeee', 'êëèé'.strip_diacritics)
    assert_equal('iiii', 'îïìí'.strip_diacritics)
    assert_equal('ooooo', 'ôõöòó'.strip_diacritics)
    assert_equal('uuuu', 'ûüùú'.strip_diacritics)
    assert_equal('y', 'ý'.strip_diacritics)
    assert_equal('n', 'ñ'.strip_diacritics)
  end

  def test_words
    assert_equal('Internationalizaetion',
                 'Iñtërnâtiônàlizætiøn'.strip_diacritics)
  end

  def test_punctuation
    # The non-ASCII dash in this test was lost to an encoding error;
    # an en dash is assumed here.
    assert_equal('-', '–'.strip_diacritics)
    assert_equal("''", "‘’".strip_diacritics)
  end
end
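
On Ruby 1.9 and later, $KCODE no longer has any effect, and iconv was dropped from the standard library in Ruby 2.0, so the code above needs the iconv gem to run there. As a rough alternative, assuming Ruby 2.2 or newer, something like the following uses String#unicode_normalize instead. The name strip_marks is made up for illustration, and this is a different technique from the iconv one: it only removes combining marks, so letters such as æ and ø, and any non-ASCII punctuation, are left untouched.

```ruby
# A sketch for Ruby 2.2+; `strip_marks` is a hypothetical helper name.
def strip_marks(text)
  # NFD splits each accented letter into a base letter plus combining marks;
  # \p{Mn} matches those nonspacing marks so they can be removed.
  text.unicode_normalize(:nfd).gsub(/\p{Mn}/, '')
end

puts strip_marks('ÀÁÂÃÄ')                 # AAAAA
puts strip_marks('Iñtërnâtiônàlizætiøn')  # Internationalizætiøn
```

If the ligatures and punctuation matter, they would still need an explicit mapping (for example a small Hash passed to gsub) on top of this.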