Translating international characters

Hi

I need to convert strings with international characters to strings
with corresponding ASCII codes. For example é, è, ë, and ê (and all
other e-related versions) should convert to e and so on.

Does anyone have a good solution for this?

Kindest regards

Erik

On 13 Apr 2008, at 22:01, Erik L. wrote:

Hi

I need to convert strings with international characters to strings
with corresponding ASCII codes. For example é, è, ë, and ê (and all
other e-related versions) should convert to e and so on.

I once did something very crude, which for your purpose would look
something like this:

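 def preprocess(query)
   # Decompose each character into its base letter plus combining marks.
   normalized = query.chars.normalize :d
   processed = ""
   normalized.u_unpack.each do |c|
     # Skip combining diacritical marks (U+0300..U+036F).
     next if c >= 0x300 && c < 0x370
     processed << [c].pack('U*')
   end
   processed
 end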
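
On Ruby 2.2 or later the same idea can be written without ActiveSupport, since String#unicode_normalize is built in. This is only a minimal sketch, and strip_diacritics is a made-up name, not something from Rails or from the code above:

```ruby
# Sketch for Ruby 2.2+; strip_diacritics is a hypothetical helper name.
def strip_diacritics(text)
  # NFD splits "é" into "e" followed by U+0301 (combining acute accent).
  decomposed = text.unicode_normalize(:nfd)
  # Drop everything in the Combining Diacritical Marks block (U+0300..U+036F).
  decomposed.gsub(/[\u0300-\u036F]/, "")
end

strip_diacritics("é, è, ë, ê")  # => "e, e, e, e"
```

Characters that have no decomposition (ø or ß, for instance) pass through unchanged, so this is not a full conversion to ASCII.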

Fred

Create a file core_extensions.rb in /lib/ and stick this in:

require 'iconv'

class String
  def to_ascii
    Iconv.iconv("ASCII//IGNORE//TRANSLIT", "UTF-8", self).join.sanitize
  rescue
    self.sanitize
  end

  def sanitize
    self.gsub(/[^a-z._0-9 -]/i, "").downcase
  end
end

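Restart your Rails server to load the file. Then when you want to
convert a string, you just do something like "Thïs ïs Ã
téststrïng".to_ascii and it will convert the characters to their ASCII
equivalents. Note that sanitize also downcases the result and strips
everything except letters, digits, spaces, dots, underscores and hyphens.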
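
Iconv was removed from the standard library in Ruby 2.0, so on a current Ruby something along these lines might stand in for to_ascii. It is a rough sketch rather than a drop-in replacement: to_ascii_sketch is a hypothetical name, and it assumes that decomposing with NFKD and discarding whatever does not fit in US-ASCII is close enough to what TRANSLIT did:

```ruby
# Sketch without Iconv (Ruby 2.2+); to_ascii_sketch is a hypothetical name.
def to_ascii_sketch(text)
  # Decompose accented characters, then drop anything US-ASCII cannot represent.
  ascii = text.unicode_normalize(:nfkd)
              .encode("US-ASCII", invalid: :replace, undef: :replace, replace: "")
  # Same filter as sanitize: keep letters, digits, dot, underscore, space, hyphen.
  ascii.gsub(/[^a-z._0-9 -]/i, "").downcase
end

to_ascii_sketch("Thïs ïs à téststrïng")  # => "this is a teststring"
```

Unlike TRANSLIT, this drops characters that have no decomposition (ß, ø and so on) instead of substituting a look-alike.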

Best regards

Peter De Berdt

Great advice from everybody. I will try these and see how they work.
Thanks.

Erik

convert é, è, ë, and ê … to e, etc…

Try
str = DiacriticsFu::escape(source)
with

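file /lib/diacritic_fu.rb:
module DiacriticsFu
  def self.escape(str)
    # Decompose, split into characters, then drop every multibyte one
    # (String#length counts bytes in Ruby 1.8).
    ActiveSupport::Multibyte::Handlers::UTF8Handler.
      normalize(str,:d).
      split(//u).
      reject { |e| e.length > 1 }.
      join
  end
end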

by Thibaut Barrère
(found here:
http://groups.google.ca/group/MephistoBlog/browse_thread/thread/afe817a4a594ddde
there's even a test suite)

For example, I extended String with:

class String
  # "Un été À la maison".to_slug(true) == "un-ete-a-la-maison"
  def to_slug(force_downcase=false)
    str = DiacriticsFu::escape(self)
    str.gsub!(/[^a-zA-Z0-9 ]/,"")
    str.gsub!(/[ ]+/," ")
    str.gsub!(/ /,"-")
    force_downcase ? str.downcase : str
  end
end
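
On Ruby 1.9 and later String#length counts characters rather than bytes, so the length check in DiacriticsFu would no longer filter anything out. A possible standalone version of the slug helper for Ruby 2.2+ is sketched below; to_slug_sketch is a hypothetical name, and bytesize is assumed to be the right replacement for the old length test:

```ruby
# Sketch for Ruby 2.2+; to_slug_sketch is a hypothetical name, not from the thread.
def to_slug_sketch(text, force_downcase = false)
  # Decompose, then keep only single-byte (ASCII) characters.
  str = text.unicode_normalize(:nfd).each_char.reject { |c| c.bytesize > 1 }.join
  str = str.gsub(/[^a-zA-Z0-9 ]/, "")  # drop punctuation
  str = str.gsub(/ +/, "-")            # each run of spaces becomes one hyphen
  force_downcase ? str.downcase : str
end

to_slug_sketch("Un été À la maison", true)  # => "un-ete-a-la-maison"
```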

Alain
