Mac OS Roman to UTF-8 (Kconv | Iconv)

I have a small test using rubyaeosa-0.2.3; it works great if I leave the output
“as is”.

In that case, according to SubEthaEdit (a Mac OS X editor), the output
string is encoded in Mac OS Roman.

Because I’ll use the output in an XML file, I need to convert it to UTF-8.

If I use Kconv#toutf8, I get (presumably) Japanese characters:

René (true char “é”, output “as is” = Mac OS Roman)
Ren (if I use Kconv#toutf8)
Anaïs (true char “ï”, output “as is” = Mac OS Roman)
"A"n"a<ï and s replaced by a “Japanese” char> (if I use Kconv#toutf8)

Also, if I use Iconv.new('MACROMAN', 'UTF-8').iconv(str),

I get an error message:
AddressBook2vCardXml.rb:32:in `iconv': "\216" (Iconv::IllegalSequence)

for the first accented string (the “é” of René).

Here is my script:

require 'osx/aeosa'
require 'kconv'
require 'iconv'

# Ask Address Book (via AppleScript) for first and last names, paired up.
def album_list
  result = OSX.do_osascript %{
    tell application "Address Book"
      set a to first name of people
      set b to last name of people
      {a,b}
    end tell
  }
  firstName = result[0].map {|i| i.to_rbobj }
  lastName = result[1].map {|i| i.to_rbobj }
  return firstName.map {|i| [ i,lastName.shift ] }
end

aFile = File.new("AddressBook.xml", "w")
album_list.each do |f,l|
  aFile.puts "#{f} #{l}"                        # output "as is"
  aFile.puts "#{f.toutf8} #{l.toutf8}"          # use of Kconv#toutf8
  fu = Iconv.new('MACROMAN', 'UTF-8').iconv(f)  # use of Iconv
  lu = Iconv.new('MACROMAN', 'UTF-8').iconv(l)  # use of Iconv
  aFile.puts "#{fu} #{lu}"                      # use of Iconv
end

Note also that if I do the encoding conversion from the command line with:

iconv -f MACROMAN -t UTF-8 AddressBook.xml > AddressBook-UTF-8.xml

(“AddressBook.xml” being the output of my Ruby script), I get
“AddressBook-UTF-8.xml” correctly encoded!

Maybe that’s the only solution for the time being?
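
For reference, the same whole-file conversion can be sketched in Ruby itself. This assumes Ruby 1.9 or later, where String#encode is built in; the helper name reencode_file and the sample bytes are made up for illustration and are not part of the original script.

```ruby
require "tmpdir"

# Hypothetical helper: read raw bytes, interpret them as +from+, write them as +to+.
def reencode_file(src_path, dst_path, from: "macRoman", to: "UTF-8")
  raw = File.binread(src_path)
  # String#encode takes the destination encoding first, then the source.
  File.binwrite(dst_path, raw.encode(to, from))
end

converted = Dir.mktmpdir do |dir|
  src = File.join(dir, "AddressBook.xml")
  dst = File.join(dir, "AddressBook-UTF-8.xml")
  File.binwrite(src, "Ren\x8E\nAna\x95s\n")   # sample Mac OS Roman bytes
  reencode_file(src, dst)
  File.read(dst, encoding: "UTF-8")
end

puts converted
```

Reading and writing in binary mode keeps Ruby from guessing an encoding for either file, so the only conversion that happens is the explicit one.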

Because I’ll use the output in an XML file, I need to convert it to UTF-8.

Not necessarily. Just make sure that the encoding the file actually uses is
declared in your XML prolog, for example:

<?xml version="1.0" encoding="Shift_JIS" ?>
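
A minimal sketch of keeping the declared encoding and the actual bytes in step, assuming Ruby 1.9 or later (String#encode). The helper xml_document and the sample markup are hypothetical and only illustrate the idea.

```ruby
# Hypothetical helper: prepend an XML declaration naming +encoding+,
# then transcode the whole document so its bytes match that declaration.
def xml_document(body, encoding: "UTF-8")
  prolog = %(<?xml version="1.0" encoding="#{encoding}"?>\n)
  (prolog + body).encode(encoding)
end

body = "<people><person>Ren\u00E9</person></people>\n"

utf8_doc   = xml_document(body)
latin1_doc = xml_document(body, encoding: "ISO-8859-1")

puts utf8_doc.lines.first     # the declaration line of the UTF-8 document
puts latin1_doc.encoding      # the encoding Ruby tagged the second document with
```

XML parsers are only required to support UTF-8 and UTF-16, so converting to UTF-8 remains the most portable choice even though other declared encodings are legal.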

Also, if I use Iconv.new('MACROMAN', 'UTF-8').iconv(str),

I get an error message:
AddressBook2vCardXml.rb:32:in `iconv': "\216" (Iconv::IllegalSequence)

for the first accented string (the “é” of René).

That’s because the parameters are in the wrong order. They should be
given as (to, from). Your example is therefore trying to convert
from UTF-8 to Mac Roman, and the Mac Roman byte for é (\216) is not
valid UTF-8, which is why it is rejected as illegal.

Try this instead:

utf8_str = Iconv.new('UTF-8', 'MacRoman').iconv(mac_str)

e.g.
$KCODE = 'u'
require 'iconv'
Iconv.new('UTF-8', 'MacRoman').iconv("Ren\216") # => "René" [in UTF-8]
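
On Ruby 1.9 and later, where Iconv is no longer part of the standard library (it was removed in 2.0), the built-in String#encode takes its arguments in the same (to, from) order. A small sketch, assuming such a Ruby; the variable names are illustrative only:

```ruby
# "\x8E" (octal \216) is the single Mac OS Roman byte for "é".
mac_bytes = "Ren\x8E"

# Correct order: destination encoding first, then source encoding.
utf8_str = mac_bytes.encode("UTF-8", "macRoman")
puts utf8_str

# Swapped order: the bytes are read as UTF-8, where a lone 0x8E is invalid,
# so Ruby raises instead of converting.
swapped_error = nil
begin
  mac_bytes.encode("macRoman", "UTF-8")
rescue Encoding::InvalidByteSequenceError => e
  swapped_error = e
end
puts swapped_error.class
```

The exception in the swapped case plays the same role as Iconv::IllegalSequence in the original script.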

Paul.

Paul B. wrote:

That’s because the parameters are in the wrong order.

OK, thanks very much, that’s working right now!

Une bévue writes:

Also, if I use Iconv.new('MACROMAN', 'UTF-8').iconv(str),

You got the encodings in the wrong order here. It’s TO, FROM.

Christian N. wrote:

You got the encodings in the wrong order here. It’s TO, FROM.

you’re right thanks )))