Lyrics and Chinese in Ruby?

Hello, has anyone used Chinese with Ruby? I’m trying to write a
script that would import/export my lyrics from/to lyricwiki.org.
Since they don’t have a lot of Chinese lyrics, I thought I’d help them
out by exporting my collection. Looking at their SOAP API wiki
(http://lyricwiki.org/LyricWiki:SOAP), they seem to have some trouble
encoding and decoding Unicode characters using Ruby’s
soap/wsdlDriver. I tried to run my songs’ metadata through RubyOSA,
but I’ve had no luck.

Here are a few songs I’ve tried out by copy-pasting the artist and song
names into the source and fetching the lyrics:
http://lyricwiki.org/Category:Language/Cantonese.

And here is a simple script I’ve been testing with (w/ RubyOSA):
http://pastie.caboo.se/141454

My system is running Mac OS X Leopard with Ruby 1.8.6. Thanks in
advance for all your help!

David

On 21 Jan, 10:34, NewtonApple wrote:

Hello, has anyone used Chinese with Ruby? I’m trying to write a
script that would import/export my lyrics from/to lyricwiki.org.
Since they don’t have a lot of Chinese lyrics, I thought I’d help them
out by exporting my collection. Looking at their SOAP API wiki
(http://lyricwiki.org/LyricWiki:SOAP), they seem have some troubles
encoding and decoding Unicode characters using Ruby’s soap/
wsdlDriver.

Looks like something’s hosed somewhere:

require 'soap/wsdlDriver'
driver = SOAP::WSDLDriverFactory.new("http://lyricwiki.org/server.php?wsdl").create_rpc_driver

p driver.getSong("La Mosca Ts\303\251-Ts\303\251", "Madrid Amaneci\303\263").artist

"La Mosca Ts\303\203\302\203\303\202\302\251-Ts\303\203\302\203\303\202\302\251" (!)

The same problem seems to happen in Python, which suggests the problem
might be on the server side:

import LyricWiki_services

soap = LyricWiki_services.LyricWikiBindingSOAP('http://lyricwiki.org/server.php')
song = LyricWiki_services.getSongRequest()
song.Artist = unicode('La Mosca Ts\xc3\xa9-Ts\xc3\xa9', 'utf8')
song.Song = unicode('Madrid Amaneci\xc3\xb3', 'utf8')
result = soap.getSong(song)

print result.Return.Artist.encode('utf8')

'La Mosca Ts\xc3\x83\xc2\x83\xc3\x82\xc2\xa9-Ts\xc3\x83\xc2\x83\xc3\x82\xc2\xa9'

You might want to speak to the LyricWiki folks about that.
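
In the meantime, the returned bytes look like the original UTF-8 was misread as Latin-1 and re-encoded as UTF-8, twice over. Here is a rough sketch that reproduces the garbled bytes from the output above and reverses them. The helper names (misencode_round, undo_latin1_round, repair_mojibake) are my own inventions, and the "two rounds" figure is only inferred from this one sample, so treat it as a workaround to experiment with rather than a proper fix:

```ruby
# One round of the suspected bug: treat each byte as a Latin-1 character
# and write it out again as UTF-8. unpack('C*') reads raw bytes,
# pack('U*') encodes each value as a UTF-8 character.
def misencode_round(str)
  str.unpack('C*').pack('U*')
end

# The reverse: decode UTF-8 into code points, then write each code point
# back as a single byte. Only meaningful if every code point is below 256.
def undo_latin1_round(str)
  str.unpack('U*').pack('C*')
end

# Hypothetical helper: undo a fixed number of rounds (2 is an assumption
# based on the sample output in this thread).
def repair_mojibake(str, rounds = 2)
  fixed = str.dup
  rounds.times { fixed = undo_latin1_round(fixed) }
  # Ruby 1.9+ tags strings with an encoding; 1.8 has no such method.
  fixed.force_encoding('UTF-8') if fixed.respond_to?(:force_encoding)
  fixed
end

original = "La Mosca Ts\303\251-Ts\303\251"
garbled  = "La Mosca Ts\303\203\302\203\303\202\302\251-Ts\303\203\302\203\303\202\302\251"

# Two rounds of mis-encoding should reproduce what the server sent back.
p misencode_round(misencode_round(original)) == garbled
# Two rounds of reversal should recover the name we sent.
p repair_mojibake(garbled) == original
```

This only uses pack/unpack, so it should behave the same on 1.8.6 as on later versions. If the server is ever fixed, already-correct strings containing characters outside Latin-1 (Chinese, for instance) would be mangled by the repair, so it should not be applied blindly.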

And here is a simple script, I’ve been testing with (w/RubyOSA):
http://pastie.caboo.se/141454

Note that this script won’t work as-is for non-English names since
RubyOSA uses ASCII by default, although this can be changed.
Alternatively, use rb-appscript, which uses UTF-8 by default.

HTH

has