=begin
Unlike Gawk and Mawk, Ruby won’t accept a regular expression as a
record-separator. Let’s fix that. The substring matched by the
record-separator is automatically removed from each record, but it
can be retrieved with RecSep#terminator (nil if the final record
ends at end-of-file without a separator).
Typical usage:
File.open("stuff.txt"){|handle|
reader = RecSep.new( handle, /^\d+.\n/ )
reader.each {|x| p x }
}
=end
class RecSep
def initialize( file_handle, record_separator )
@handle = file_handle
@buffer = ""
@rec_sep = record_separator
@terminator = nil
end
def get_rec
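## The record-separator may be something like /\n\s*\n/, which can
## match more text once the next line arrives. So we keep reading
## until something follows the match (or the input runs out),
## letting the separator match as much as possible.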
loop do
@rec_sep.match( @buffer )
break if $~ && $~.post_match.size > 0
s = @handle.gets( "\n" )
break if not s
@buffer << s
end
if $~
@buffer = $~.post_match
@terminator = $~.to_s
$~.pre_match
else
@terminator = nil
return nil if "" == @buffer
s, @buffer = @buffer, ""
s
end
end
def each
while s = self.get_rec
yield s
end
end
attr_reader :terminator
end
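=begin
A minimal sketch of how each record pairs up with its terminator,
using an in-memory StringIO instead of a file. The sample text and
the names input, reader and records are made up for illustration.
The class body is a condensed stand-in for RecSep above, repeated
only so the snippet runs on its own; it keeps the match in a local
variable rather than $~, which is an assumption of this sketch and
not the original code.

```ruby
require "stringio"

# Condensed stand-in for RecSep, repeated so this snippet is standalone.
class RecSep
  attr_reader :terminator

  def initialize(file_handle, record_separator)
    @handle = file_handle
    @buffer = +""
    @rec_sep = record_separator
    @terminator = nil
  end

  def get_rec
    m = nil
    loop do
      m = @rec_sep.match(@buffer)
      # Stop once text follows the match, so the separator cannot grow.
      break if m && !m.post_match.empty?
      s = @handle.gets("\n")
      break unless s
      @buffer << s
    end
    if m
      @buffer = m.post_match
      @terminator = m.to_s
      m.pre_match
    else
      # No separator left: hand back whatever remains, or nil when done.
      @terminator = nil
      return nil if @buffer.empty?
      s, @buffer = @buffer, +""
      s
    end
  end

  def each
    while (s = get_rec)
      yield s
    end
  end
end

# Hypothetical sample: paragraphs separated by one or more blank lines.
input = StringIO.new("alpha\nbeta\n\n  \ngamma\n\ndelta\n")
reader = RecSep.new(input, /\n\s*\n/)

records = []
reader.each { |rec| records << [rec, reader.terminator] }
records.each { |rec, term| p [rec, term] }
```

The first separator swallows the whitespace-only line as well, so
its terminator is "\n\n  \n". The last record, "delta\n", is not
followed by a separator, so its terminator is nil.
=end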