Transform non-English text

Hello,
I have a Ruby application that deals with non-English text, and I want to
transform some of that text so that it contains only [a-zA-Z0-9] characters.
Examples:
búsqueda -> busqueda
presenças -> presencas
für -> fur
avião1 -> aviao1

Can anyone help me?

Thanks.
Best regards,
Migrate

On 19 January 2007 at 10:19, Hu Ma wrote:

Hello,
I have a Ruby application that deals with non-English text, and I want to
transform some of that text so that it contains only [a-zA-Z0-9] characters.

You could try Iconv to convert from your encoding to ASCII. Quick
example:

require "iconv"
=> true

Iconv.iconv("ascii//translit", "iso-8859-1", "aéioù")
=> ["a'eio`u"]

Iconv.iconv("ascii//translit", "iso-8859-1", "aéiou")[0].tr('^a-z', '')
=> "aeiou"

Fred
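
For readers on a newer Ruby: Iconv is no longer part of the standard
library (it was removed in Ruby 2.0), so here is a minimal sketch of a
similar idea using Unicode normalization instead. It assumes Ruby 2.2 or
later and UTF-8 input; the method name strip_accents is made up for
illustration. It only handles characters that decompose into a base letter
plus combining marks, so letters such as ß or ø pass through unchanged.

```ruby
# Hypothetical helper, not from the thread: assumes Ruby >= 2.2 and UTF-8 strings.
def strip_accents(text)
  # NFD splits "ú" into "u" followed by a combining acute accent (U+0301).
  decomposed = text.unicode_normalize(:nfd)
  # \p{Mn} matches nonspacing marks, i.e. the combining accents left by NFD.
  decomposed.gsub(/\p{Mn}/, "")
end

# The four examples from the original question.
%w[búsqueda presenças für avião1].each do |word|
  puts "#{word} -> #{strip_accents(word)}"
end
```

If the result must be strictly [a-zA-Z0-9], a final .tr('^a-zA-Z0-9', '')
as in the Iconv example would drop whatever is left over.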

I don't think there is a unified mapping table for transforming
non-[a-zA-Z0-9] characters into their closest [a-zA-Z0-9] equivalents.
But if you are willing to write a map yourself, try something like:

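class String
  # Each pair maps an accented character to its plain ASCII replacement.
  MAP = [[/ü/, 'u'],
         [/ö/, 'o']]

  # Returns a copy of the string with every mapped character replaced.
  def eng_char
    res = String.new(self)
    MAP.each { |pattern, replacement| res = res.gsub(pattern, replacement) }
    res
  end
end

s = "abücüöö"
puts s + " => " + s.eng_char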


Will output:

abücüöö => abucuoo

Martin
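
A possible variation on the map idea, as a sketch only: keep the
replacements in a Hash and let gsub look each match up, so the string is
scanned once rather than once per entry. The names ACCENT_MAP,
ACCENT_PATTERN and to_plain_ascii, as well as the extra map entries, are
assumptions added for illustration; extend the table with whatever
characters your text contains.

```ruby
# Illustrative table, not exhaustive.
ACCENT_MAP = {
  "á" => "a", "ã" => "a", "ç" => "c", "é" => "e",
  "í" => "i", "ó" => "o", "ö" => "o", "ú" => "u", "ü" => "u"
}.freeze

# Regexp.union builds one pattern that matches any key of the table.
ACCENT_PATTERN = Regexp.union(ACCENT_MAP.keys)

def to_plain_ascii(text)
  # gsub with a Hash replaces each match with the value stored under it.
  text.gsub(ACCENT_PATTERN, ACCENT_MAP)
end

puts to_plain_ascii("abücüöö")
```

Using a plain method instead of reopening String keeps the helper out of
every other string in the application, which is a matter of taste.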

F. Senault wrote:

require "iconv"
=> true

Iconv.iconv("ascii//translit", "iso-8859-1", "aéioù")
=> ["a'eio`u"]

Iconv.iconv("ascii//translit", "iso-8859-1", "aéiou")[0].tr('^a-z', '')
=> "aeiou"

Iconv's translit is really nice… when it works. It works on our FreeBSD
server but not on my Ubuntu dev machine. Your mileage may vary.

Daniel
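
If the platform differences are a concern, one option on current Ruby
versions is String#encode, which uses Ruby's own transcoder rather than
the system iconv library. The sketch below is an illustration under
assumptions, not a drop-in replacement for //translit: the FALLBACK table
and the ascii_fallback name are hypothetical, input is assumed to be
UTF-8, and any character missing from the table becomes "?".

```ruby
# Hypothetical replacement table; extend it for the characters you expect.
FALLBACK = { "é" => "e", "ù" => "u", "ü" => "u", "ç" => "c" }.freeze

def ascii_fallback(text)
  # The :fallback callable is invoked for each character that has no
  # US-ASCII equivalent; its return value is written to the output instead.
  text.encode("US-ASCII", fallback: ->(char) { FALLBACK.fetch(char, "?") })
end

puts ascii_fallback("aéioù")
```

Because the table is explicit, the result should not depend on which
iconv implementation the machine happens to have installed.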

Hello,

Thanks for your help.

I will try both approaches to see what fits best.

Best regards,
Migrate
