Forum: Ruby Writing accented characters into HTML files?

Kenneth McDonald (Guest)
on 2009-01-05 21:59
(Received via mailing list)
I'm having trouble when I write accented characters into HTML files;
though the accents appear properly in my terminal, they are badly
"messed up" in the HTML output. CGI.escape doesn't fix the problem,
because these are not "special" characters like < or >, but simply
accented e's, o's, etc. I'm assuming the problem has something to do
with a character set type mismatch between the file Ruby is writing
and what the browser (Firefox) expects, but I'm at a loss as to how to
correct it.

Any advice most appreciated,
Thanks,
Ken
Gerald M. (Guest)
on 2009-01-06 01:05
(Received via mailing list)
Look into using a character reference (for example &eacute; or &#233;
for an accented e).  Which named references are valid depends on the
version of HTML used.
http://en.wikipedia.org/wiki/Character_encodings_in_HTML
http://en.wikipedia.org/wiki/List_of_XML_and_HTML_...
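
A minimal sketch of the numeric-reference approach in Ruby, assuming
Ruby 1.9 or later and a valid UTF-8 input string. The helper name
to_char_refs is hypothetical, not part of any library:

```ruby
# Replace every non-ASCII character with a decimal numeric character
# reference such as &#233;. ASCII characters pass through unchanged.
# Assumption: `text` is a valid UTF-8 string.
def to_char_refs(text)
  text.gsub(/[^\x00-\x7F]/) { |ch| format("&#%d;", ch.ord) }
end

puts to_char_refs("caf\u00e9 na\u00efve")  # prints "caf&#233; na&#239;ve"
```

Numeric references are valid in every HTML version, so the output no
longer depends on the charset the browser guesses. This does not escape
< > or &, which still need CGI.escapeHTML or similar.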

Gerald
Brian C. (Guest)
on 2009-01-06 11:19
Kenneth McDonald wrote:
> Any advice most appreciated,

Use hexdump -C on the file to see what the actual byte sequences are. If
these are single-byte characters then it's probably ISO-8859-1. If they
are two bytes then it's probably UTF-8.
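
For comparison, here is a small sketch of what the two byte sequences
look like for an accented e, assuming Ruby 1.9 or later. The hex helper
is a hypothetical name used only for this illustration:

```ruby
# "\u00e9" is an accented e; string literals with \u escapes are UTF-8.
e_acute = "\u00e9"

utf8_bytes   = e_acute.bytes                       # two bytes in UTF-8
latin1_bytes = e_acute.encode("ISO-8859-1").bytes  # one byte in ISO-8859-1

# Format a list of byte values the way hexdump would show them.
def hex(bytes)
  bytes.map { |b| format("%02x", b) }.join(" ")
end

puts "UTF-8:      #{hex(utf8_bytes)}"    # prints "UTF-8:      c3 a9"
puts "ISO-8859-1: #{hex(latin1_bytes)}"  # prints "ISO-8859-1: e9"
```

So a lone e9 in the hexdump suggests ISO-8859-1, while c3 a9 suggests
UTF-8.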

You can use an XML declaration and/or a <meta> tag in the <head> section
to tell the browser which character set your document is in, and/or get
your web server to set the correct charset in the Content-Type header.
marc (Guest)
on 2009-01-06 15:39
(Received via mailing list)
Kenneth McDonald said...
> I'm having trouble when I write accented characters into HTML files;
> though the accents appear properly in my terminal, they are badly
> "messed up" in the HTML output. CGI.escape doesn't fix the problem,
> because these are not "special" characters like < or >, but simply
> accented e's, o's, etc. I'm assuming the problem has something to do
> with a character set type mismatch between the file Ruby is writing
> and what the browser (Firefox) expects, but I'm at a loss as to how to
> correct it.
>
> Any advice most appreciated,

Start by ensuring that you have the following at the top of <head>:

  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

Also, post the "messed up" characters; they'll tell us something about
the encoding problem.

Oh, and make sure your editor is writing utf-8.
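
If the file is generated from Ruby rather than typed in an editor, the
same idea applies to the script. A minimal sketch, assuming Ruby 2.3 or
later; the file name and page content are made up for the example:

```ruby
require "tmpdir"

# Page content with the charset declared at the top of <head>.
# The \u escapes keep this script independent of its own source encoding.
html = <<~HTML
  <html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <title>Accent test</title>
  </head>
  <body><p>caf\u00e9, na\u00efve, se\u00f1or</p></body>
  </html>
HTML

# Hypothetical output location, chosen only for this example.
path = File.join(Dir.tmpdir, "accent_test.html")

# "w:UTF-8" sets the file's external encoding, so the bytes written
# match the charset the <meta> tag announces.
File.open(path, "w:UTF-8") { |f| f.write(html) }

puts "wrote #{path}"
```

The point is that the declared charset and the bytes on disk must
agree; either one alone is not enough.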
James G. (Guest)
on 2009-01-06 18:28
(Received via mailing list)
On Jan 6, 2009, at 3:20 AM, Brian C. wrote:

> Kenneth McDonald wrote:
>> Any advice most appreciated,
>
> Use hexdump -C on the file to see what the actual byte sequences are.
> If these are single-byte characters then it's probably ISO-8859-1.
> If they are two bytes then it's probably UTF-8.

I have some code that detects valid UTF-8 data here:

http://blog.grayproductions.net/articles/the_unico...
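
Independent of the code at that link, a rough sketch of such a check on
Ruby 2.0 or later could rely on String#valid_encoding?. The method name
valid_utf8? is hypothetical:

```ruby
# Returns true if the given raw bytes form a valid UTF-8 sequence.
# The string is duplicated so the caller's copy keeps its encoding tag.
def valid_utf8?(bytes)
  bytes.dup.force_encoding("UTF-8").valid_encoding?
end

puts valid_utf8?("caf\xC3\xA9".b)  # UTF-8 bytes for "cafe" with accent: true
puts valid_utf8?("caf\xE9".b)      # ISO-8859-1 bytes for the same word: false
```

A true result only means the bytes are well-formed UTF-8; plain ASCII
passes too, so it cannot prove which encoding the author intended.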

James Edward G. II