Forum: Ruby Help with Readlines error

Bucco (Guest)
on 2007-02-09 01:31
(Received via mailing list)
I don't know if it is me or if something is wrong with Ruby.  I am
using ruby 1.8.5 (2006-08-25) [i386-mswin32] on an XP box.  I am
following along with some examples for readlines in my new "Everyday
Scripting with Ruby" book.  The book has me in irb and is explaining
File.open('somefile.txt').readlines.  I am not getting the same
output.  My output has each line as a member of the array, but each
character of each word has a \000 in front of it.  For
example:

irb(main):002:0> new_inventory = File.open('new-inventory.txt').readlines
=> ["\377\376e\000x\000e\000r\000c\000i\000s\000e\000-\000d\000i\000f
\000f\000e\
000r\000e\000n\000c\000e\000s\000.\000r\000b\000\r\000\n", "\000i\000n
\000v\000e
\000n\000t\000o\000r\000y\000.\000r\000b\000\r\000\n", "\000n\000e\000w
\000-\000
i\000n\000v\000e\000n\000t\000o\000r\000y\000.\000t\000x\000t\000\r
\000\n", "\00
0o\000l\000d\000-\000i\000n\000v\000e\000n\000t\000o\000r\000y
\000.\000t\000x\00
0t\000\r\000\n", "\000r\000e\000c\000y\000c\000l\000e\000r\000\r
\000\n", "\000r\
000e\000c\000y\000c\000l\000e\000r\000/\000i\000n\000s\000t\000-
\0003\0009\000.\
000t\000m\000p\000\r\000\n", "\000s\000n\000a\000p\000s\000h\000o\000t
\000s\000\
r\000\n", "\000s\000n\000a\000p\000s\000h\000o\000t\000s\000/\000d\000i
\000f\000
f\000e\000r\000e\000n\000c\000e\000s\000-\000v\000e\000r\000s\000i\000o
\000n\000
-\0001\000.\000r\000b\000\r\000\n", "\000s\000n\000a\000p\000s\000h
\000o\000t\00
0s\000/\000d\000i\000f\000f\000e\000r\000e\000n\000c\000e\000s\000-
\000v\000e\00
0r\000s\000i\000o\000n\000-\0002\000.\000r\000b\000\r\000\n", "\000s
\000n\000a\0
00p\000s\000h\000o\000t\000s\000/\000d\000i\000f\000f\000e\000r\000e
\000n\000c\0
00e\000s\000-\000v\000e\000r\000s\000i\000o\000n\000-\0003\000.\000r
\000b\000\r\
000\n", "\000s\000n\000a\000p\000s\000h\000o\000t\000s\000/\000d\000i
\000f\000f\
000e\000r\000e\000n\000c\000e\000s\000-\000v\000e\000r\000s\000i\000o
\000n\000-\
0004\000.\000r\000b\000\r\000\n", "\000s\000n\000a\000p\000s\000h\000o
\000t\000s
\000/\000d\000i\000f\000f\000e\000r\000e\000n\000c\000e\000s\000-\000v
\000e\000r
\000s\000i\000o\000n\000-\0005\000.\000r\000b\000\r\000\n", "\000s\000n
\000a\000
p\000s\000h\000o\000t\000s\000/\000d\000i\000f\000f\000e\000r\000e\000n
\000c\000
e\000s\000-\000v\000e\000r\000s\000i\000o\000n\000-\0006\000.\000r\000b
\000\r\00
0\n", "\000s\000n\000a\000p\000s\000h\000o\000t\000s\000/\000d\000i
\000f\000f\00
0e\000r\000e\000n\000c\000e\000s\000-\000v\000e\000r\000s\000i\000o
\000n\000-\00
07\000.\000r\000b\000\r\000\n", "\000s\000n\000a\000p\000s\000h\000o
\000t\000s\0
00/\000d\000i\000f\000f\000e\000r\000e\000n\000c\000e\000s\000-\000v
\000e\000r\0
00s\000i\000o\000n\000-\0008\000.\000r\000b\000\r\000\n", "\000t\000e
\000m\000p\
000\r\000\n", "\000t\000e\000m\000p\000/\000i\000n\000s\000t\000-
\0003\0009\000\
r\000\n", "\000"]

Any ideas what is wrong?

Thanks:)

SA
Kyle S. (Guest)
on 2007-02-09 01:37
(Received via mailing list)
Well, I could be crazy, but that looks like Unicode to me.

--Kyle
Mike S. (Guest)
on 2007-02-09 20:41
(Received via mailing list)
Hi!
There is nothing wrong: you are reading data that is encoded in Unicode,
namely in UCS-2, a fixed 2-byte representation of each character.
Here's my test program:
= = =
    data = File.open('unicode.txt').readlines
    p data
    puts data
= = =
and its output:

["\377\376T\000h\000i\000s\000 \000i\000s\000 \000a\000
\000l\000i\000n\000e\000 \000i\000n\000
\000U\000n\000i\000c\000o\000d\000e\000
\000(\000U\000S\000C\0002\000)\000.\000"]
  This is a line in Unicode (USC2).
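
For anyone who wants the lines back as ordinary text, here is a minimal
sketch. It assumes Ruby 1.9 or later (the 1.8.5 mentioned above has no
encoding support in File.open) and that the file is little-endian
UTF-16, which is what the leading \377\376 byte order mark suggests.
The helper name read_utf16le_lines and the sample file name are made up
for illustration; they are not from the book or from this thread.

```ruby
require 'tmpdir'

# Hypothetical helper (not from the thread): read a UTF-16LE text file
# and return its lines as UTF-8 strings with line endings removed.
def read_utf16le_lines(path)
  # 'rb'          : binary mode, so Windows does no newline translation
  # BOM|UTF-16LE  : treat the bytes as UTF-16LE and skip a leading BOM
  # :UTF-8        : transcode to UTF-8 while reading
  File.open(path, 'rb:BOM|UTF-16LE:UTF-8') do |file|
    file.readlines.map(&:chomp)
  end
end

# Build a small sample file: BOM (\377\376) followed by UTF-16LE text
# with \r\n line endings. This only mimics the kind of file discussed.
sample = File.join(Dir.tmpdir, 'new-inventory-sample.txt')
bom    = "\xFF\xFE".b
body   = "inventory.rb\r\nrecycler\r\n".encode('UTF-16LE').b
File.binwrite(sample, bom + body)

lines = read_utf16le_lines(sample)
p lines
```

If the assumptions hold, the array should contain the plain file names
with no \000 bytes and no byte order mark. Re-saving the text file as
ANSI or UTF-8 would be another way around the problem on 1.8.5.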

Regards -
Mike S.
