Saving a UTF-8 file

Hi All

I have a problem (newbie problem).

I don’t know how to write a file using UTF-8 encoding. Can you help
me?

Thanks in advance

Kind regards

Miquel (a.k.a. Ton)
Linux User #286784
GPG Key : 4D91EF7F
Debian GNU/Linux (Linux Wolverine 2.6.14)

Welcome to the jungle, we got fun and games
Guns n’ Roses



Miquel O. wrote:

> Hi All
>
> I have a problem (newbie problem).
>
> I don’t know how to write a file using utf-8 encoding. Can you help
> me.

A UTF-8 file is simply a sequence of 8-bit bytes. Save your data like this:

"data" contains the text data

File.open(file_path, "w") { |f| f.write data }

UTF-8 refers to a convention regarding the content of the bytes and how
they are interpreted when read. It isn’t something you can specify in a
plain-text file. It can be inferred from the format of the bytes, but
that is an open interpretation.
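
To make the point above concrete, here is a minimal sketch, assuming Ruby 2.0 or later (where strings carry an encoding). The helper name and the sample text are made up for illustration. It writes a UTF-8 string to a file and reads the raw bytes back, showing that the file holds only the encoded bytes and nothing that declares them to be UTF-8:

```ruby
require "tmpdir"

# Hypothetical helper: write text to path, then return the raw bytes on disk.
def write_and_read_bytes(path, text)
  # "w:UTF-8" sets the external encoding, so a string in another
  # encoding would be transcoded to UTF-8 on write.
  File.open(path, "w:UTF-8") { |f| f.write(text) }
  File.binread(path).bytes
end

Dir.mktmpdir do |dir|
  bytes = write_and_read_bytes(File.join(dir, "sample.txt"), "caf\u00e9")
  # U+00E9 is stored as the two bytes 0xC3 0xA9; no header or marker is added.
  p bytes  # => [99, 97, 102, 195, 169]
end
```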

http://dict.die.net/utf-8/

Paul L. wrote:

> It isn’t something you can specify in a
> plain-text file.

Byte order mark?

A specification it is not, but generally a good hint. There are gotchas
though if you process it with software that’s not Unicode-aware.

David V.

On 11/12/06, David V. wrote:

> Paul L. wrote:
>
>> It isn’t something you can specify in a
>> plain-text file.
>
> Byte order mark?

Not meaningful in UTF-8, since it’s all a defined series of bytes
(it’s always the same order on all platforms).

-austin

Austin Z. wrote:

> On 11/12/06, David V. wrote:
>
>> Paul L. wrote:
>>
>>> It isn’t something you can specify in a
>>> plain-text file.
>>
>> Byte order mark?
>
> Not meaningful in UTF-8, since it’s all a defined series of bytes
> (it’s always the same order on all platforms).
>
> -austin

Yes, but it can be used as a “this file is UTF-8” marker by convention.
And cause problems in software that doesn’t recognize the convention,
for added hilarity.
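
As a rough illustration of the convention discussed here, the sketch below (assuming Ruby 2.0 or later; the helper name and file name are invented for the example) writes a file that starts with the UTF-8 encoding of U+FEFF, checks for those three bytes, and shows what a BOM-unaware and a BOM-aware read each return:

```ruby
require "tmpdir"

# The three bytes that U+FEFF encodes to in UTF-8.
UTF8_BOM = "\xEF\xBB\xBF".b

# Hypothetical helper: true if the file starts with the UTF-8 BOM bytes.
def utf8_bom?(path)
  File.binread(path, 3) == UTF8_BOM
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "with_bom.txt")
  # Write the marker explicitly, followed by the text.
  File.open(path, "w:UTF-8") { |f| f.write("\uFEFFhello") }

  p utf8_bom?(path)                               # => true
  # A BOM-unaware read keeps the marker as an invisible first character.
  p File.read(path, encoding: "UTF-8").length     # => 6
  # The "BOM|UTF-8" read mode strips the marker when it is present.
  p File.open(path, "r:BOM|UTF-8") { |f| f.read } # => "hello"
end
```

The extra leading character in the BOM-unaware read is the kind of surprise that trips up software that does not know the convention.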

David V.

On 11/12/06, David V. wrote:

> for added hilarity.

It’s a bad convention, because it adds meaningless bytes to the
beginning of a file. I’m not saying that an unadorned document is
better, but better to do something that has actual meaning than doing
a pointless BOM.

-austin

On 11/12/06, Miquel O. wrote:

> Hi All
>
> I have a problem (newbie problem).
>
> I don’t know how to write a file using utf-8 encoding. Can you help
> me.

Well, how are you storing the Unicode characters you are using
internally? If your Unicode string within Ruby is stored as an array
of ints, then

File.open("output_file.utf8", "w") do |fp|
  # pack("U*") encodes each integer code point as UTF-8
  fp.puts(data.pack("U*"))
end

should be sufficient. If you have a Ruby string that uses some other
encoding (e.g. ISO-8859-1), then you must use the iconv library to
convert the string to UTF-8:

require 'iconv'

# Iconv.new takes the target encoding first, then the source encoding
cd = Iconv.new('utf-8', 'iso-8859-1')
File.open("output_file.utf8", "w") do |fp|
  fp.puts(cd.iconv(data))
end
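
On Ruby 1.9 and later, strings carry their own encoding, and iconv is no longer part of the standard library (it was removed in 2.0). There, the same conversion can be sketched with String#encode. This is only an illustration: the helper name and the sample bytes are assumptions, not part of the original code.

```ruby
require "tmpdir"

# Hypothetical helper: treat raw_bytes as ISO-8859-1 and return a UTF-8 copy.
def latin1_to_utf8(raw_bytes)
  raw_bytes.dup.force_encoding("ISO-8859-1").encode("UTF-8")
end

latin1 = "caf\xE9".b            # "café" as ISO-8859-1 bytes (0xE9 is "é")
utf8   = latin1_to_utf8(latin1)

p utf8.encoding                 # => #<Encoding:UTF-8>
p utf8.bytes                    # => [99, 97, 102, 195, 169]

Dir.mktmpdir do |dir|
  # Write the converted string, as in the iconv version above.
  File.open(File.join(dir, "output_file.utf8"), "w:UTF-8") do |fp|
    fp.puts(utf8)
  end
end
```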

When you do i18n, l10n, and m17n, strings become meaningless unless
they have an attached encoding.
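
A small sketch of why the attached encoding matters, assuming Ruby 2.0 or later (the helper name is invented for the example): the same two bytes decode to different characters depending on which encoding they are labeled with.

```ruby
# Two raw bytes with no encoding information attached.
RAW = "\xC3\xA9".b

# Hypothetical helper: label the bytes with an encoding and list the
# code points that result from reading them that way.
def codepoints_as(bytes, encoding_name)
  bytes.dup.force_encoding(encoding_name).codepoints
end

p codepoints_as(RAW, "UTF-8")       # => [233]       one character, U+00E9
p codepoints_as(RAW, "ISO-8859-1")  # => [195, 169]  two characters, U+00C3 U+00A9
```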