Splitting a multirecord per file format to a single record p

I’m trying to write essentially what I guess you’d call a filter (or
maybe not quite exactly). It needs to:

  • read multi-line records from a file (one record at a time)
  • then, with that one record:
    • prepend some additional lines
    • make substitutions for some of the lines already in the record
    • grab some other portions of the record (less than a line, but
      usually multiple words), find the “non-null” pieces, and
      incorporate those in another header line
    • create a unique filename
    • write that (single) record to that file

I got started (maybe) by finding a likely-looking piece of code in the
Ruby Cookbook, and tried to modify it to fit my situation:

open('/rhk/work/ask_notes/politics.twk') { |f|
  f.each('\x80\x81\x82\x83') { |record| p record } }

At this point, I’m stuck, and need some clues to move forward. (In
addition, I have a few less essential questions, below.)

I think the next step is, within the code block / continuation (is that,
or one of those, the right name?), to slurp the entire record into a
string, prepend the additional lines, do the substitutions, …, and
finally write a single record to the new filename.
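
A minimal sketch of those per-record steps, assuming the record is already available as a string. Everything specific here is a hypothetical placeholder and not taken from the actual files: the `X-Converted` and `Keywords` header lines, the `Subject:` substitution, and the `record_0001.txt` naming scheme.

```ruby
require "tmpdir"

# Hypothetical: build one extra header line from the non-empty pieces.
def summary_header(pieces)
  kept = pieces.compact.map(&:strip).reject(&:empty?)
  "Keywords: #{kept.join(', ')}\n"
end

# Hypothetical: prepend header lines and apply a substitution to the record.
def transform_record(record, pieces)
  body = record.gsub(/^Subject:/, "Title:")   # placeholder substitution
  "X-Converted: yes\n" + summary_header(pieces) + body
end

# Hypothetical naming scheme: record_0001.txt, record_0002.txt, ...
def write_record(text, index, dir)
  path = File.join(dir, format("record_%04d.txt", index))
  File.write(path, text)
  path
end

Dir.mktmpdir do |dir|
  text = transform_record("Subject: test\nbody line\n", ["ruby", nil, "  "])
  puts File.read(write_record(text, 1, dir))
end
```

Numbering the output files from a counter is only one way to get unique names; a timestamp or a field from the record itself would work as well.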

Main Question:

Am I on the right track, or must I take some different approach to be
able to process the content of a single record at a time? (I mean, I did
a little experiment (possibly a bad experiment ;-) like this:

rec_num = 0

open('/rhk/work/ask_notes/politics.twk') { |f|
  f.each('\x80\x81\x82\x83') { |record| rec_num = rec_num + 1 } }

p rec_num

It only counts to one instead of 70, the number of records I know are in
that particular file, all of which are printed out by the earlier version
containing “{ |record| p record }”.)
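
One possible cause, assuming the separator really is written in single quotes and the file contains the raw bytes 0x80 0x81 0x82 0x83: Ruby does not interpret `\x` escapes inside single-quoted strings, so the separator never matches and the whole file arrives as one record (which `p record` would still print in full). The sample data below is made up for illustration.

```ruby
single = '\x80\x81\x82\x83'      # 16 literal characters: backslash, x, 8, 0, ...
double = "\x80\x81\x82\x83".b    # 4 raw bytes

puts single.bytesize   # 16
puts double.bytesize   # 4

# Made-up sample: three records, each ending with the 4-byte separator.
data = "rec1\x80\x81\x82\x83rec2\x80\x81\x82\x83rec3\x80\x81\x82\x83".b

puts data.each_line(single).count   # 1 (separator never found)
puts data.each_line(double).count   # 3
```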

Other questions: (I could start a thread for each, but I’ll start this
way and split them up if I get either too much or not enough response ;-)

  1. What is the right name for that construction: is it a continuation,
    a (code?) block, or something else? (Is it possible that Ruby calls
    this a code block and some other languages call it a continuation, or
    is it an example of one kind of continuation available in Ruby?)

  2. What’s the story on whitespace in that kind of structure? I
    experimented with trying to format it to make it (possibly) easier to
    read, something like this:

open('/rhk/work/ask_notes/politics.twk') {
|f| f.each('\x80\x81\x82\x83') {
|record| p record

     <anticipated location of code to process a single record>

}
}

But any whitespace (i.e., newlines) that I added just caused syntax
errors. Is there a way to “prettyformat” that structure?
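
For what it’s worth, the conventional multi-line layout in Ruby uses `do ... end` blocks, with the block parameters (`|f|`, `|record|`) on the same line as `do`. A sketch of that layout follows; the `count_records` helper, the temporary file, and the double-quoted separator (raw bytes) are assumptions made for illustration.

```ruby
require "tempfile"

SEPARATOR = "\x80\x81\x82\x83".b   # assumed: four raw bytes, hence double quotes

# Count (and optionally process) the records in the file at `path`.
def count_records(path)
  count = 0
  File.open(path, "rb") do |f|
    f.each(SEPARATOR) do |record|
      # code to process a single record goes here
      count += 1
    end
  end
  count
end

# Try it on a throwaway file holding two made-up records.
Tempfile.create("records") do |tmp|
  tmp.binmode
  tmp.write("one\x80\x81\x82\x83two\x80\x81\x82\x83".b)
  tmp.flush
  puts count_records(tmp.path)
end
```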

  3. The content of the files I have to convert is actually more like
    this:
Record header ('\x80\x81\x82\x83')

Record (with blank lines)
(trailing blank line)
Record header ('\x80\x81\x82\x83')

Record (with blank lines)
(trailing blank line)
Record header ('\x80\x81\x82\x83')

Record (with blank lines)

The Ruby code that I copied from the Ruby Cookbook is aimed more at
separating records that end with a record separator (instead of starting
with a record header). I can work this way. I mean, worst case, I modify
every input file to remove the first record header from the file and add
a record header at the end of the file, but that’s probably not really
necessary.

But it seems like I’m not using quite the right tool. Is there a better
approach that more exactly fits the format of my files?
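
One approach that fits a leading header, sketched under the assumptions that the header is the four raw bytes 0x80 0x81 0x82 0x83 and that a whole file fits comfortably in memory: split the contents on the header and drop the empty piece that precedes the first one. The `split_records` name and the sample data are made up for illustration.

```ruby
HEADER = "\x80\x81\x82\x83".b   # assumed: four raw bytes

# Split file contents on the header. Because each record *starts* with the
# header, the piece before the first header is empty and is dropped.
def split_records(data)
  data.b.split(HEADER).reject(&:empty?)
end

sample = HEADER + "\nfirst\n\n" + HEADER + "\nsecond\n"
split_records(sample).each_with_index do |record, i|
  puts "record #{i + 1}: #{record.inspect}"
end
```

For a real file, something like `split_records(File.binread(path))` would supply the data.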

Thanks!
Randy K.

On 12.01.2007 15:02, Randy K. wrote:

  * create a unique filename
  * write that (single) record to that file

[…]

(trailing blank line)
Record header ('\x80\x81\x82\x83')

Record (with blank lines)

You could do:

# create a class for your records or use OpenStruct

YourRecord = Struct.new :name, :length, :foo, :bar do
  def dump()
    # file_name: however you build the unique name for this record
    File.open(file_name, "w") do |io|
      # whatever
    end
  end
end

current = nil

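File.foreach('your file') do |line|
  line.chomp!

  case line
  when /^$/   # start-of-record pattern (placeholder, adjust to your data)
    current = YourRecord.new
  when /^$/   # end-of-record pattern (placeholder, adjust to your data)
    current.dump
    current = nil
  when /Record header/
    # whatever you need to do with the header line
  else
    # ignore or whatever
  end
end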
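
A runnable variation on the line-by-line idea, for the case where a header line starts each record: seeing a header closes the previous record and opens a new one. This is only a sketch; the `HEADER_LINE` pattern, the `read_records` name, and the sample input are assumptions, and records are kept as plain arrays of lines rather than a Struct.

```ruby
require "stringio"

# Assumption: a header line contains the literal text "Record header";
# substitute whatever actually marks the start of a record.
HEADER_LINE = /Record header/

# Group the lines of `io` into records. Each record is the array of lines
# that follow a header line (the header line itself is not kept).
def read_records(io)
  records = []
  current = nil
  io.each_line do |line|
    if line =~ HEADER_LINE
      records << current if current   # close the previous record
      current = []
    elsif current
      current << line
    end
  end
  records << current if current       # close the last record
  records
end

input = StringIO.new("Record header\nA\n\nA2\nRecord header\nB\n")
read_records(input).each_with_index do |lines, i|
  puts "record #{i + 1}: #{lines.join.inspect}"
end
```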

Kind regards

robert

On Friday 12 January 2007 09:15 am, Robert K. wrote:

On 12.01.2007 15:02, Randy K. wrote:

I’m trying to write essentially what I guess you’d call a filter (or
maybe not quite exactly). It needs to:

You could do:

Thanks, that will get me started!

Randy K.