Forum: Ruby Mac OS Roman to UTF-8 (Kconv | Iconv)

unknown (Guest)
on 2005-12-26 21:38
(Received via mailing list)
I have a small test using rubyaeosa-0.2.3; it works great if I leave the
output "as is".

In that case, according to SubEthaEdit (a MacOS X editor) the output
string is encoded in MacOS Roman.

Because i'll use the output in an xml form i need to translate to UTF-8.

If I use Kconv#toutf8, I get (presumably) Japanese characters:

René  (true char "é", output "as is" = MacOS Roman)
Ren<é replaced by a "japanese" char>  (if i use Kconv#toutf8)
Anaïs (true char "ï", output "as is" = MacOS Roman)
Ana<ï and s replaced by a "japanese" char>  (if i use Kconv#toutf8)

also if i make use of Iconv.new('MACROMAN', 'UTF-8').iconv(str)

i get an error message :
AddressBook2vCardXml.rb:32:in `iconv': "\216" (Iconv::IllegalSequence)

for the first accentuated string (the "é" of René).


here is my script :
<code>
require 'osx/aeosa'
require 'kconv'
require 'iconv'


def album_list
  result = OSX.do_osascript %{
        tell application "Address Book"
          set a to first name of people
          set b to last name of people
         {a,b}
        end tell
      }
  firstName = result[0].map {|i| i.to_rbobj }
  lastName = result[1].map {|i| i.to_rbobj }
  return firstName.map {|i| [ i,lastName.shift ] }
end

aFile = File.new("AddressBook.xml", "w")
album_list.each do |f,l|
  aFile.puts "#{f} #{l}"                         # output "as is"
#  aFile.puts "#{f.toutf8} #{l.toutf8}"          # use Kconv#toutf8
#  fu = Iconv.new('MACROMAN', 'UTF-8').iconv(f)  # use of Iconv
#  lu = Iconv.new('MACROMAN', 'UTF-8').iconv(l)  # use of Iconv
#  aFile.puts "#{fu} #{lu}"                      # use of Iconv
end
aFile.close  # flush the output and release the file handle
</code>

Note also that if I do the encoding conversion on the command line with:
>iconv -f MACROMAN -t UTF-8 AddressBook.xml > AddressBook-UTF-8.xml

"AddressBook.xml" being the output of my Ruby script, i get
"AddressBook-UTF-8.xml" correctly encoded !!!


Maybe that's the only solution for the time being?
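
For comparison, the command-line conversion above can also be done from Ruby itself. This is only a sketch, assuming Ruby 1.9 or later, where String#encode is available (that postdates this thread). The helper name mac_roman_file_to_utf8 is hypothetical and the file names are the ones used above.

```ruby
# Hypothetical helper: reads a file whose bytes are Mac Roman and writes
# a UTF-8 copy, similar in effect to `iconv -f MACROMAN -t UTF-8`.
def mac_roman_file_to_utf8(src_path, dst_path)
  raw = File.binread(src_path)                            # raw bytes, no decoding
  utf8 = raw.force_encoding("macRoman").encode("UTF-8")   # label the bytes, then transcode
  File.open(dst_path, "w:UTF-8") { |f| f.write(utf8) }
end

# Usage (assumes AddressBook.xml exists in the current directory):
# mac_roman_file_to_utf8("AddressBook.xml", "AddressBook-UTF-8.xml")
```

force_encoding only labels the bytes as Mac Roman; the actual conversion happens in encode.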
Paul B. (Guest)
on 2005-12-27 11:19
(Received via mailing list)
> Because i'll use the output in an xml form i need to translate to UTF-8.

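Not necessarily. Just make sure that the encoding is specified in your
XML prolog and that your XML parser supports it (parsers are only
required to support UTF-8 and UTF-16). For Mac Roman the registered
charset name is "macintosh":

<?xml version="1.0" encoding="macintosh" ?>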

> also if i make use of Iconv.new('MACROMAN', 'UTF-8').iconv(str)
>
> i get an error message :
> AddressBook2vCardXml.rb:32:in `iconv': "\216" (Iconv::IllegalSequence)
>
> for the first accentuated string (the "é" of René).

That's because the parameters are in the wrong order. They should be
given as (to, from). Your example is therefore trying to convert
*from* UTF-8 *to* Mac Roman, which is why the byte "\216" (é in Mac
Roman) is rejected: on its own it is not a legal UTF-8 sequence.
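
The same failure can be reproduced without Iconv. This is a minimal sketch, assuming Ruby 1.9 or later and its String#encode (not what the original script used). The variable names are made up for illustration.

```ruby
# "\x8E" is the same byte as octal "\216": "é" in Mac Roman.
raw = "Ren\x8E".b

# Treat the Mac Roman bytes as if they were UTF-8 and convert the wrong way.
wrong_direction_error = begin
  raw.dup.force_encoding("UTF-8").encode("macRoman")
  nil
rescue Encoding::InvalidByteSequenceError => e
  e   # modern counterpart of Iconv::IllegalSequence
end

puts wrong_direction_error.class
```

A lone 0x8E byte is a UTF-8 continuation byte with no lead byte before it, so the converter has nothing valid to decode.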

Try this instead:

utf8_str = Iconv.new('UTF-8', 'MacRoman').iconv(mac_str)

e.g.
$KCODE = 'u'
require 'iconv'
Iconv.new('UTF-8', 'MacRoman').iconv("Ren\216") # => "René" [in UTF-8]
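
On newer Rubies, where Iconv is no longer part of the standard library, the same conversion can be sketched with String#encode. This assumes Ruby 1.9 or later; mac_to_utf8 is a hypothetical helper name, not part of any library.

```ruby
# Hypothetical helper: interprets the given bytes as Mac Roman and
# returns a new UTF-8 string. Note the order here is (to, from) too:
# encode takes the target encoding, force_encoding names the source.
def mac_to_utf8(bytes)
  bytes.dup.force_encoding("macRoman").encode("UTF-8")
end

mac_str = "Ren\x8E".b          # "\x8E" == "\216", "é" in Mac Roman
utf8_str = mac_to_utf8(mac_str)
puts utf8_str
```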

Paul.
unknown (Guest)
on 2005-12-27 13:08
(Received via mailing list)
Paul B. <removed_email_address@domain.invalid> wrote:

> That's because the parameters are in the wrong order.

ok, thanks very much, that's working right now !!!
Christian N. (Guest)
on 2005-12-27 15:50
(Received via mailing list)
removed_email_address@domain.invalid (Une bévue) writes:

> also if i make use of Iconv.new('MACROMAN', 'UTF-8').iconv(str)

You got the encodings in the wrong order here.  It's TO, FROM.
unknown (Guest)
on 2005-12-27 16:00
(Received via mailing list)
Christian N. <removed_email_address@domain.invalid> wrote:

> You got the encodings in the wrong order here.  It's TO, FROM.

you're right thanks )))