I tried to “force” it with Iconv.conv('UTF-8', 'ASCII', 'aeiou'), to no
avail. Any ideas?
– fxn
Hi,
My guess is that the “tr” method treats its arguments as strings of
bytes. Because accented characters need more than one byte in UTF-8,
#tr doesn’t do what you would expect. (It’s not even tr’s fault: how is
it supposed to know that two bytes actually represent a single
character?)
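To make the byte-versus-character point concrete, here is a small sketch. It assumes Ruby 1.9 or later, where a String knows its encoding (as far as I know #tr is character-aware there, so the mismatch described above bites on byte-oriented strings such as those in Ruby 1.8). The variable names are only for illustration.

```ruby
# Assumes Ruby 1.9+; "\u00E4" is "ä" written as one precomposed code point.
s = "\u00E4"

char_count = s.length      # length counted in characters
byte_count = s.bytesize    # length counted in bytes
raw_bytes  = s.bytes.to_a  # what a byte-oriented method would see

puts "#{char_count} character, #{byte_count} bytes: #{raw_bytes.inspect}"
# prints: 1 character, 2 bytes: [195, 164]
```

A method that works byte by byte sees 195 and 164 as two separate items, so it cannot map them to a single "a".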
The solution is to use #gsub! instead of #tr!. It isn’t as short, but at
least it’s right:
norm.gsub!('ä', 'a')
norm.gsub!('ë', 'e')
and so on…
And because that violates DRY (Don’t Repeat Yourself), I would
recommend storing the mapping in a hash:
accents = { 'ä' => 'a', 'ë' => 'e', … }
accents.each do |accent, replacement|
  norm.gsub!(accent, replacement)
end
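For anyone who wants to try the hash approach as a complete script, here is a minimal sketch. The ACCENTS constant and the strip_accents helper are hypothetical names; only the first two pairs come from the snippet above and the other three are my own illustrative additions, so extend the table to whatever you actually need. It assumes Ruby 1.9 or later for the "\u" escapes.

```ruby
# Hypothetical mapping: the first two pairs are from the post,
# the remaining three are illustrative additions.
ACCENTS = {
  "\u00E4" => 'a', # ä
  "\u00EB" => 'e', # ë
  "\u00EF" => 'i', # ï
  "\u00F6" => 'o', # ö
  "\u00FC" => 'u'  # ü
}

# Returns a copy of text with every key of ACCENTS replaced by its value.
def strip_accents(text)
  result = text.dup # leave the caller's string untouched
  ACCENTS.each do |accent, replacement|
    result.gsub!(accent, replacement)
  end
  result
end

norm = strip_accents("n\u00E4\u00EFve") # "näïve"
puts norm
# prints: naive
```

Working on a copy keeps the original string around in case you still need it; use gsub! directly on norm, as in the post, if you want the in-place behaviour.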
Regards,
Robin S.