String#downcase for accented letters in UTF-8?


#1

How can I convert to lowercase a string that contains accented
characters encoded in UTF-8?
String#downcase does not work with accented letters.


#2

On Apr 15, 2006, at 15:08, Gioele B. wrote:

> How can I convert to lowercase a string that contains accented
> characters encoded in UTF-8?
> String#downcase does not work with accented letters.

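I hand-coded this:

# This library redefines String#tr so that it understands UTF-8.
# (Ruby 1.8 only; jcode was removed in Ruby 1.9.)
require 'jcode'

# Assumption, not in the original post: jcode only treats strings as
# UTF-8 when $KCODE says so (running ruby with -Ku has the same effect).
$KCODE = 'u'

# Returns a lowercase, accent-free copy of s for use as a sort key.
def normalize_for_sorting(s)
  return nil if s.nil?
  norm = s.downcase
  # String#downcase only handles ASCII here, so the uppercase accented
  # vowels are mapped by hand as well.
  norm.tr!('ÁÉÍÓÚ', 'aeiou')
  norm.tr!('ÀÈÌÒÙ', 'aeiou')
  norm.tr!('ÄËÏÖÜ', 'aeiou')
  norm.tr!('ÂÊÎÔÛ', 'aeiou')
  norm.tr!('áéíóú', 'aeiou')
  norm.tr!('àèìòù', 'aeiou')
  norm.tr!('äëïöü', 'aeiou')
  norm.tr!('âêîôû', 'aeiou')
  norm
end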

-- fxn
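
For readers on a newer interpreter, the same idea can be sketched without jcode. This is only a sketch under stated assumptions: it assumes Ruby 2.4 or later, where String#downcase handles non-ASCII letters and String#unicode_normalize is available, and the method name normalize_for_sorting_unicode is made up for illustration. It is not what the post above used.

```ruby
# Hypothetical helper, assuming Ruby 2.4+ and UTF-8 encoded input.
# Decomposes each accented letter into base letter + combining mark (NFD),
# drops the combining marks, then lowercases what is left.
def normalize_for_sorting_unicode(s)
  return nil if s.nil?
  s.unicode_normalize(:nfd).gsub(/\p{Mn}/, '').downcase
end

puts normalize_for_sorting_unicode('ÀÉÎÖÚ')
puts 'ÉCOLE'.downcase
```

Unlike the tr! table, this covers any letter that decomposes into a base letter plus combining marks, but letters with no such decomposition (for example ø or ß) pass through and are only lowercased.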