Libxml utf-8 locale


#1

Hello,

honestly, if I were to pick two things I hate about computers, they
would be XML and UTF-8.

I have an xml file:

“<?xml version='1.0' encoding='UTF-8'?>…”

I have installed libxml-ruby and liblocale-ruby on my Debian etch system.

I have tried:

    export LANG=hu_HU.UTF-8
    kate sample.xml

It opens the file correctly.

I have also tried:

    export LANG=hu_HU.UTF-8
    my_script.rb sample.xml

It cannot deal with the UTF-8 characters. I have also tried inserting this
line into my script (with require 'locale', of course):

    Locale.setlocale(Locale::LC_ALL, 'hu_HU.UTF-8')

No effect.
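
One way to check whether the exported locale actually reaches the script is to print what Ruby itself sees. This is only a minimal sketch, assuming Ruby 1.9 or later (the `Encoding` class does not exist in the Ruby 1.8 that was current when this thread was written). The helper name `locale_report` is hypothetical.

```ruby
# Sketch: report the locale and encodings visible to the running script.
# Assumes Ruby 1.9+; `locale_report` is a hypothetical helper name.
def locale_report
  {
    "LANG"             => ENV["LANG"],
    "LC_ALL"           => ENV["LC_ALL"],
    "default_external" => Encoding.default_external.name,
    "locale_charmap"   => Encoding.locale_charmap
  }
end

locale_report.each { |key, value| puts "#{key}: #{value.inspect}" }
```

If this does not report UTF-8 when run from the same shell as `kate`, the script is probably not seeing the exported locale.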

My script is similar to the one in the docs:

    require 'xml/libxml'
    doc = XML::Document.file('output.xml')
    root = doc.root

    puts "Root element name: #{root.name}"

    elem3 = root.find('elem3').to_a.first
    puts "Elem3: #{elem3['attr']}"

    doc.find('//root_node/foo/bar').each do |node|
      puts "Node path: #{node.path} \t Contents: #{node}"
    end

(I am not using exactly this, but something very similar, with setlocale added.)

The output is filled with:
K�­n�¡l
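
Output like this can mean that the bytes are valid UTF-8 but are being displayed as a single-byte encoding. A minimal sketch for checking that, assuming Ruby 1.9 or later and assuming the word is "Kínál" (a guess from the garbled output, not something stated in the thread). The sample string and the helper name `hex_bytes` are hypothetical.

```ruby
# Hypothetical sample: the UTF-8 bytes for "Kínál" (assumed, not from the thread).
sample = "K\xC3\xADn\xC3\xA1l".dup.force_encoding("UTF-8")

# Hex dump of the raw bytes, to see what the program really emits.
def hex_bytes(str)
  str.bytes.map { |b| format("%02X", b) }.join(" ")
end

puts sample.valid_encoding?
puts hex_bytes(sample)

# Reinterpret the same bytes as ISO-8859-1, as a Latin-1 terminal would,
# then transcode so the misreading can be printed on a UTF-8 terminal.
misread = sample.dup.force_encoding("ISO-8859-1").encode("UTF-8")
puts misread
```

If the hex dump shows two bytes for each accented letter, the data itself is likely fine and the problem is on the display side.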

What should I do now?

Mage


#2

On Mar 7, 2006, at 2:43 PM, Mage wrote:

> line into my script (with require 'locale' of course):
> puts "Root element name: #{root.name}"
>
> The output is filled with:
> Kà nál
>
> What to do now?
>
> Mage

Have you tried putting

    $KCODE = 'u'

at the top of your script (possibly before any requires)?


#3

Logan C. wrote:

>> "<?xml version='1.0' encoding='UTF-8'?>…"
>> my_script.rb sample.xml
>> doc = XML::Document.file('output.xml')
>
> Have you tried putting
> $KCODE = 'u'
> at the top of your script? (possibly before any requires.)

Didn’t help.

Now I am using an iconv converter for some nodes, but I think it's a nasty
workaround.

Mage
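
For reference, on Ruby 1.9 and later a per-node conversion like the one described above can be done with `String#encode` instead of the `iconv` library (which was removed from the standard library in Ruby 2.0). A minimal sketch, assuming the node text arrives as UTF-8 and the terminal expects ISO-8859-2. Both encodings and the helper name `to_terminal` are assumptions, not details from the thread.

```ruby
# Sketch: convert UTF-8 node text to a single-byte target encoding.
# Characters with no equivalent in the target are replaced with "?".
# `to_terminal` and the ISO-8859-2 default are hypothetical choices.
def to_terminal(text, target = "ISO-8859-2")
  text.encode(target, "UTF-8", invalid: :replace, undef: :replace, replace: "?")
end

converted = to_terminal("K\u00EDn\u00E1l") # "Kínál", a hypothetical sample
puts converted.bytes.inspect
```

Converting node by node works, but it hides the underlying mismatch; making the script and the terminal agree on one encoding would remove the need for it.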