Help with Readlines error


#1

I don’t know if it is me or if something is wrong with Ruby. I am
using ruby 1.8.5 (2006-08-25) [i386-mswin32] on an XP box. I am
following along with some examples for readlines in my new “Everyday
Scripting with Ruby” book. The book has me in irb and is explaining
File.open('somefile.txt').readlines. I am not getting the same
output. My output has each word as a member of the array, but each
character of each word has a \000 in front of it. For
example:

irb(main):002:0> new_inventory = File.open('new-inventory.txt').readlines
=> ["\377\376e\000x\000e\000r\000c\000i\000s\000e\000-\000d\000i\000f\000f\000e\000r\000e\000n\000c\000e\000s\000.\000r\000b\000\r\000\n",
 "\000i\000n\000v\000e\000n\000t\000o\000r\000y\000.\000r\000b\000\r\000\n",
 "\000n\000e\000w\000-\000i\000n\000v\000e\000n\000t\000o\000r\000y\000.\000t\000x\000t\000\r\000\n",
 "\000o\000l\000d\000-\000i\000n\000v\000e\000n\000t\000o\000r\000y\000.\000t\000x\000t\000\r\000\n",
 "\000r\000e\000c\000y\000c\000l\000e\000r\000\r\000\n",
 "\000r\000e\000c\000y\000c\000l\000e\000r\000/\000i\000n\000s\000t\000-\0003\0009\000.\000t\000m\000p\000\r\000\n",
 "\000s\000n\000a\000p\000s\000h\000o\000t\000s\000\r\000\n",
 "\000s\000n\000a\000p\000s\000h\000o\000t\000s\000/\000d\000i\000f\000f\000e\000r\000e\000n\000c\000e\000s\000-\000v\000e\000r\000s\000i\000o\000n\000-\0001\000.\000r\000b\000\r\000\n",
 "\000s\000n\000a\000p\000s\000h\000o\000t\000s\000/\000d\000i\000f\000f\000e\000r\000e\000n\000c\000e\000s\000-\000v\000e\000r\000s\000i\000o\000n\000-\0002\000.\000r\000b\000\r\000\n",
 "\000s\000n\000a\000p\000s\000h\000o\000t\000s\000/\000d\000i\000f\000f\000e\000r\000e\000n\000c\000e\000s\000-\000v\000e\000r\000s\000i\000o\000n\000-\0003\000.\000r\000b\000\r\000\n",
 "\000s\000n\000a\000p\000s\000h\000o\000t\000s\000/\000d\000i\000f\000f\000e\000r\000e\000n\000c\000e\000s\000-\000v\000e\000r\000s\000i\000o\000n\000-\0004\000.\000r\000b\000\r\000\n",
 "\000s\000n\000a\000p\000s\000h\000o\000t\000s\000/\000d\000i\000f\000f\000e\000r\000e\000n\000c\000e\000s\000-\000v\000e\000r\000s\000i\000o\000n\000-\0005\000.\000r\000b\000\r\000\n",
 "\000s\000n\000a\000p\000s\000h\000o\000t\000s\000/\000d\000i\000f\000f\000e\000r\000e\000n\000c\000e\000s\000-\000v\000e\000r\000s\000i\000o\000n\000-\0006\000.\000r\000b\000\r\000\n",
 "\000s\000n\000a\000p\000s\000h\000o\000t\000s\000/\000d\000i\000f\000f\000e\000r\000e\000n\000c\000e\000s\000-\000v\000e\000r\000s\000i\000o\000n\000-\0007\000.\000r\000b\000\r\000\n",
 "\000s\000n\000a\000p\000s\000h\000o\000t\000s\000/\000d\000i\000f\000f\000e\000r\000e\000n\000c\000e\000s\000-\000v\000e\000r\000s\000i\000o\000n\000-\0008\000.\000r\000b\000\r\000\n",
 "\000t\000e\000m\000p\000\r\000\n",
 "\000t\000e\000m\000p\000/\000i\000n\000s\000t\000-\0003\0009\000\r\000\n",
 "\000"]

Any ideas what is wrong?

Thanks:)

SA


#2

Well, I could be crazy, but that looks like Unicode to me.

–Kyle


#3

Hi!
There is nothing wrong: the file you are reading is stored as Unicode, namely
UCS-2 (UTF-16), a fixed 2-byte representation of each character. The leading
\377\376 is the little-endian byte-order mark.
Here’s my test program:
= = =
data = File.open('unicode.txt').readlines
p data
puts data
= = =
and its output:

["\377\376T\000h\000i\000s\000 \000i\000s\000 \000a\000 \000l\000i\000n\000e\000 \000i\000n\000 \000U\000n\000i\000c\000o\000d\000e\000 \000(\000U\000S\000C\0002\000)\000.\000"]
This is a line in Unicode (USC2).
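
If you want ordinary strings back, one option is to decode the 2-byte units yourself. What follows is only a minimal sketch, not something from the book: decode_ucs2le is a hypothetical helper name, the sample bytes are built in memory rather than read from new-inventory.txt, and it assumes little-endian UCS-2 with no characters outside the Basic Multilingual Plane. It uses only String#unpack and Array#pack, which should be available on 1.8 as well as on later versions.

```ruby
# Hypothetical helper: turn UCS-2 little-endian bytes into an array of lines.
# Assumes every character fits in one 16-bit unit (no surrogate pairs).
def decode_ucs2le(bytes)
  code_units = bytes.unpack('v*')                   # 16-bit little-endian integers
  code_units.shift if code_units.first == 0xFEFF    # drop the byte-order mark, if any
  code_units.pack('U*').split(/\r?\n/)              # re-encode as UTF-8, split on line ends
end

# Build sample data in memory: a BOM followed by two CRLF-terminated lines.
sample = ([0xFEFF] + "hi\r\nyo\r\n".unpack('U*')).pack('v*')

p decode_ucs2le(sample)   # => ["hi", "yo"]
```

For a real file you would read it in binary mode first, for example File.open('new-inventory.txt', 'rb') { |f| f.read }, and pass the result to the helper. Saving the file as plain ANSI/ASCII text from your editor would also avoid the problem entirely.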

Regards -
Mike S.
