ARGF.eof? behavior

Hi folks,

In Ruby 1.8, I know that:

$ ruby -e 'while !ARGF.eof?; puts ARGF.readline; end' /tmp/foo /tmp/bar

prints every line in /tmp/foo, but not /tmp/bar. However, in Ruby 1.9:

$ ruby1.9 -e 'p ARGF.eof?' /tmp/foo
true

Which means that lines from neither /tmp/foo nor /tmp/bar would be
printed in the first example. Is this an expected change in behavior?
Seems to be consistent for both 1.9.1p0 and the 1.9.2 svn trunk I just
compiled.

If it is, it’s not that big of a deal, except I’m not sure how to
“switch ARGF” to the next file without calling ARGF.gets or
ARGF.readline.

For example, is there a method to complete the following code, so as to
print lines from all files listed in ARGV?

while !ARGV.empty?
  # ARGF.some_method_to_advance_to_next_file

  while !ARGF.eof?
    puts ARGF.readline
  end
end

I know I can use ARGF.each, gets, or readline. However, I’m really
calling a parsing method that takes ARGF as an argument and calls
readline on my behalf. I’d like to be able to distinguish between
EOFErrors due to reaching EOF before parsing (no more data records), and
EOFErrors due to reaching EOF during parsing (an incomplete data
record).

Thanks!

Hi,

2009/7/24 Mike K.:

 while !ARGV.empty?
   # ARGF.some_method_to_advance_to_next_file

   while !ARGF.eof?
     puts ARGF.readline
   end
 end

You can use ARGF.each for both 1.8.x and 1.9.x.

Try
$ ruby -e 'ARGF.each{|l|puts l}' /tmp/foo /tmp/bar

Regards,

Park H.

On Fri, Jul 24, 2009 at 11:42:00AM +0900, Heesob P. wrote:

You can use ARGF.each for both 1.8.x and 1.9.x.

Try
$ ruby -e 'ARGF.each{|l|puts l}' /tmp/foo /tmp/bar

Right, I understand that this works in this particular example. Perhaps
a more in-depth example will illustrate the problem better.

Presume I have a method, “parse”, that parses data records from an IO
stream. It looks something like this:

def parse(io)
  first = io.readline
  … # Code to validate first
  second = io.readline

  third = io.readline

  ParsedThing.new(first, second, third)
end

The idea is to call “parse ARGF” only when I know there’s data left in
the stream to be parsed. Otherwise, if I get an EOFError, its meaning is
ambiguous: there could be an incomplete data record (i.e., could parse
“first” and “second”, but got an EOF while reading “third”), or there
could be no more records in the file.

ARGF.each isn’t going to work since ARGF is being used as an external
iterator by the parse method, and calling ARGF.gets/readline outside the
parse method strips the first line of a record. I’m looking for a
non-destructive file advance operation, if that makes sense.
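
To make the distinction concrete, here is a minimal sketch of the
calling pattern, assuming eof? behaves as documented on the stream. It
uses StringIO in place of ARGF so it can run standalone. The
three-line record format, the ParsedThing Struct, IncompleteRecordError
and read_records are all stand-ins made up for illustration, not part
of the real parser.

```ruby
require "stringio"

# Hypothetical stand-in for the ParsedThing class mentioned above.
ParsedThing = Struct.new(:first, :second, :third)

# Hypothetical error class marking a record cut short by EOF.
class IncompleteRecordError < StandardError; end

# Simplified parse: one record is assumed to be exactly three lines.
def parse(io)
  first  = io.readline.chomp
  second = io.readline.chomp
  third  = io.readline.chomp
  ParsedThing.new(first, second, third)
end

# Call parse only while data remains. An EOFError raised inside parse
# can then only mean that a record was truncated.
def read_records(io)
  records = []
  until io.eof?
    begin
      records << parse(io)
    rescue EOFError
      raise IncompleteRecordError, "EOF in the middle of a record"
    end
  end
  records
end

complete  = StringIO.new("a\nb\nc\nd\ne\nf\n")
truncated = StringIO.new("a\nb\nc\nd\n")

puts "complete stream: #{read_records(complete).length} records"
begin
  read_records(truncated)
rescue IncompleteRecordError => e
  puts "truncated stream: #{e.message}"
end
```

With ARGF the same loop needs a way to move on to the next file once
eof? returns true, which is the missing piece.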

On Fri, Jul 24, 2009 at 12:20:36PM +0900, Mike K. wrote:

I’m looking for a non-destructive file advance operation, if that makes
sense.

Two things:

  • Turns out Ruby 1.9’s ARGF.eof? behavior was a bug, now fixed in svn
    trunk. A workaround is to call “ARGF.file” (or another ARGF accessor)
    before the while loop.

  • ARGF.skip is the non-destructive file advance operation that I was
    looking for. ARGF.close works too, but can also close $stdin which
    may not be preferred. Problem is that neither currently works when
    followed by ARGF.eof?. I submitted a patch to fix that to the issue
    tracker.

Unfortunately this means that the behavior of ARGF with regard to
close/eof/skip changes somewhat between patchlevels on Ruby 1.8.7 &
1.9.1.

To answer my original question (and for the benefit of anyone else
looking to do something similar), the following code includes the
appropriate workarounds to work on, I believe, all 1.8/1.9 versions:

# Print (or whatever) every line from all files listed in ARGV or
# $stdin.
loop do
  current = ARGF.file

  while !ARGF.eof?
    puts ARGF.readline  # Or whatever.
  end

  ARGF.skip.file == current and break
end
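
For anyone who wants to try the loop without real input files, the
following is a rough sketch that runs it against two temporary files
and groups the lines by ARGF.filename. The helper name lines_by_file is
made up for illustration, and it overwrites ARGV purely for the
demonstration. It assumes a current Ruby; I have not checked it against
the 1.8/1.9 patchlevels discussed above.

```ruby
require "tempfile"

# Hypothetical helper: groups the lines ARGF yields by source file name.
# It overwrites ARGV, so it is only suitable for a demonstration.
def lines_by_file(paths)
  ARGV.replace(paths)
  result = Hash.new { |hash, name| hash[name] = [] }

  loop do
    current = ARGF.file     # also makes ARGF open the next file
    name = ARGF.filename

    until ARGF.eof?
      result[name] << ARGF.readline
    end

    # Once ARGV is exhausted, skip has no further file to move to,
    # so ARGF.file is unchanged and the loop ends.
    break if ARGF.skip.file == current
  end

  result
end

# Two throwaway input files standing in for /tmp/foo and /tmp/bar.
foo = Tempfile.new("foo")
foo.write("foo 1\nfoo 2\n")
foo.close
bar = Tempfile.new("bar")
bar.write("bar 1\n")
bar.close

p lines_by_file([foo.path, bar.path])
```

Passing an empty list would make ARGF fall back to $stdin, as in the
original loop.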