On 25/05/06, Wes G. email@example.com wrote:
I don’t understand how to make my text, which now contains UTF-8
characters, display correctly in, say, Notepad. All of the entities are
preceded by the character A-circumflex. My guess is that Notepad
doesn’t know how to handle UTF-8, for example.
Windows Notepad does handle UTF-8, but requires the presence of a BOM
(the three bytes “\xef\xbb\xbf”) at the start of the file to read it
properly. Other, more competent applications may allow you to select
the appropriate encoding, or may even automatically detect it.
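As a rough illustration of the BOM point, here is a minimal Ruby sketch that prepends those three bytes before writing a file. It assumes a Ruby recent enough to have String#b and File.binwrite (2.0 or later), which is newer than the Ruby this thread was written against. The helper names with_utf8_bom and write_for_notepad are made up for this example.

```ruby
# The three BOM bytes from the post, held as raw binary.
UTF8_BOM = "\xEF\xBB\xBF".b

# Hypothetical helper: return the bytes of +text+ as UTF-8,
# prefixed with the BOM unless one is already present.
def with_utf8_bom(text)
  bytes = text.encode("UTF-8").b
  bytes.start_with?(UTF8_BOM) ? bytes : UTF8_BOM + bytes
end

# Hypothetical helper: write the bytes exactly as given,
# with no encoding or newline translation applied by Ruby.
def write_for_notepad(path, text)
  File.binwrite(path, with_utf8_bom(text))
end
```

Calling write_for_notepad("out.txt", text) should then give Notepad a file it recognises as UTF-8. The start_with? check only guards against adding a second BOM to text that already carries one.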
OK, I have found the iconv library; however, I am still having trouble.
What is the default text encoding for Ruby? I assume it’s gotten from
the OS, right? So if I’m on Windows XP in the US, it’s probably
Windows XP uses UTF-16 internally, I believe, but retains the concept
of a legacy code page to allow non-Unicode-aware applications to run.
English Windows uses Windows-1252 (mostly, but not completely, the same
as ISO-8859-1). Ruby on Windows uses the legacy code page to
communicate with the operating system, so things like file names will
be in Windows-1252.
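To make the code page point concrete, here is a small sketch of converting Windows-1252 bytes to UTF-8. It uses String#encode rather than the iconv library mentioned above, on the assumption that a Ruby with built-in encoding support (1.9 or later) is available. The helper name cp1252_to_utf8 is invented for the example.

```ruby
# Hypothetical helper: treat +bytes+ as Windows-1252 and
# transcode them to UTF-8. The input string is not modified.
def cp1252_to_utf8(bytes)
  bytes.dup.force_encoding("Windows-1252").encode("UTF-8")
end

# 0xE9 is e-acute in both Windows-1252 and ISO-8859-1.
accented = cp1252_to_utf8("caf\xE9".b)

# 0x80 is one of the bytes where the two differ: the euro sign
# in Windows-1252, a C1 control character in ISO-8859-1.
euro    = cp1252_to_utf8("\x80".b)
control = "\x80".b.force_encoding("ISO-8859-1").encode("UTF-8")
```

Note that a handful of byte values (0x81, for instance) are unassigned in Windows-1252, so encode may raise for those. The sketch does not try to handle that case.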