I am running Ruby 1.8.6 on Windows and am having trouble reading some
text files. For some of them, if I do something simple like:
myfile = File.open("logfile.log")
contents = myfile.read()
puts contents
I get each character separated by a space, such as:
”= = = V e r b o s e l o g g i n g s t a r t e d : 1 / 2 8 / 2
0 0 9
1 3 : 4 5 : 0 6 B u i l d t y p e : S H I P U N I C O D E
If I bring up the file in even a bare-bones editor (such as VIM), I
get the file as it normally is (without any extraneous spaces). Does
anyone know why this would be, or how I can work around it? It’s
causing issues as I am trying to write a script to search for a
particular string of text, and obviously it isn’t found, even though
it should be.
Thanks,
Jim
The file is probably UTF-16 encoded and starts with a BOM.
Try to convert the string to UTF-8, or switch to Ruby 1.9.
Stefan
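One rough way to check this diagnosis is to look at the first two bytes of the file for a UTF-16 byte order mark. In UTF-16 every ASCII character carries an extra zero byte, which is most likely what shows up as the gaps in the output. The helper name utf16_bom is made up for illustration, and the sketch assumes Ruby 2.0 or later (for String#b), not the 1.8.6 from the original post.

```ruby
# Hypothetical helper: guess the UTF-16 byte order from the first two bytes.
# Returns :little_endian, :big_endian, or nil when no UTF-16 BOM is present.
def utf16_bom(path)
  # Read in binary mode so Ruby does not try to interpret the bytes.
  head = File.open(path, "rb") { |f| f.read(2) }
  case head
  when "\xFF\xFE".b then :little_endian
  when "\xFE\xFF".b then :big_endian
  end
end

# Usage (file name taken from the original post):
#   utf16_bom("logfile.log")
```

A nil result does not prove the file is not UTF-16, since the BOM is optional; it only means this quick check found nothing.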
Sorry, I meant to say “Try to convert the string to UTF-8 WITH Iconv”
Stefan
Thanks…so if I upgraded to Ruby 1.9, would it convert it
automatically?
Thanks for the pointer! I actually ended up using the iconv module,
and it worked like a charm. Incidentally, in case anyone else is
curious about this, Windows .REG files get saved as UTF-16 by default.
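The exact Iconv call is not shown in the thread. On Ruby 1.8 it would typically be something along the lines of Iconv.conv("UTF-8", "UTF-16", contents). Iconv was dropped from the standard library in Ruby 2.0, so the sketch below swaps in String#encode to show the same idea: convert the raw bytes to UTF-8 first, then search. The helper name utf16_to_utf8 is hypothetical, and the code assumes Ruby 2.5 or later (for delete_prefix).

```ruby
# Hypothetical helper: convert raw UTF-16 bytes to a UTF-8 string.
# Assumes little-endian data unless a big-endian BOM (FE FF) is present.
def utf16_to_utf8(raw)
  bytes = raw.b # binary copy, the caller's string is left untouched
  source = bytes.start_with?("\xFE\xFF".b) ? "UTF-16BE" : "UTF-16LE"
  text = bytes.force_encoding(source).encode("UTF-8")
  text.delete_prefix("\uFEFF") # drop the BOM character if there was one
end

# Simulated file contents: a little-endian BOM followed by UTF-16LE text.
raw = "\xFF\xFE".b + "Build type: SHIP UNICODE".encode("UTF-16LE").b
text = utf16_to_utf8(raw)
puts "found" if text.include?("SHIP UNICODE")
```

With real data, raw would come from reading the file in binary mode, for example File.open("logfile.log", "rb") { |f| f.read }.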
2009/2/2 Jim K.:
Thanks…so if I upgraded to Ruby 1.9, would it convert it
automatically?
You’d have to tell it that you want to work with UTF-8
internally by putting this at the top of your application:
Encoding.default_internal = Encoding::UTF_8
and then tell the read or open function that the file
is UTF-16 encoded, e.g.:
content = File.read("logfile.log", encoding: "utf-16")
Though I don’t know how many gems already work for
Ruby 1.9.1 on Windows.
Stefan
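For anyone reading this on a more recent Ruby, the transcoding can also be requested in the open mode itself, without touching Encoding.default_internal. This is a sketch rather than the call shown above: it assumes a Ruby version that understands the "BOM|" prefix for UTF-16 in mode strings, and the helper name read_utf16_log is made up for illustration.

```ruby
# Hypothetical helper: read a UTF-16 file and return its contents as UTF-8.
# "BOM|UTF-16LE" asks Ruby to check for a byte order mark, strip it, and
# use UTF-16LE when none is found; ":UTF-8" is the encoding of the result.
def read_utf16_log(path)
  File.read(path, mode: "r:BOM|UTF-16LE:UTF-8")
end

# Usage (file name taken from the original post):
#   content = read_utf16_log("logfile.log")
#   puts "match" if content.include?("Build type")
```

Because the string comes back as UTF-8, ordinary searches with include? or a regular expression behave as expected.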