I’m trying to write essentially what I guess you’d call a filter (or
maybe not quite exactly). It needs to:
- read multi-line records from a file (one record at a time)
- then, with that one record:
  - prepend some additional lines
  - make substitutions for some of the lines already in the record
  - grab some other portions of the record (less than a line, but
    usually multiple words), find the “non-null” pieces, and
    incorporate those in another header line
  - create a unique filename
  - write that (single) record to that file
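
The per-record steps listed above might be sketched roughly as follows. This is only an illustration under assumptions: the field names (Author, Place, Topic), the added header lines, the filename pattern, and the helper name process_record are all made up for the example and are not taken from the real files.

```ruby
require 'tmpdir'

# Hypothetical helper: every name, header line, and pattern here is an
# illustrative assumption, not something from the real .twk files.
def process_record(record, index)
  lines = record.lines.map(&:chomp)

  # Make substitutions on lines already in the record (assumed pattern).
  lines = lines.map { |line| line.sub(/\ATopic:/, 'Subject:') }

  # Grab pieces of the record, keep only the non-empty ones,
  # and fold them into one extra header line.
  pieces = [record[/Author: *(.*)$/, 1], record[/Place: *(.*)$/, 1]]
  pieces = pieces.compact.map(&:strip).reject(&:empty?)
  summary = "Summary: #{pieces.join('; ')}"

  # Prepend the additional lines.
  output = (['X-Converted: yes', summary] + lines).join("\n") + "\n"

  # Build a unique filename from the record's position in the input.
  filename = format('record_%04d.txt', index)
  [filename, output]
end

# Demo with made-up data, written into a throwaway directory.
sample = "Author: Jane\nPlace: \nTopic: taxes\nBody text\n"
Dir.mktmpdir do |dir|
  name, text = process_record(sample, 1)
  File.write(File.join(dir, name), text)
end
```

Keeping the transformation in a method that returns the filename and the text, and doing the actual write separately, makes each step easy to try out on a single record.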
I got started (maybe) by finding a likely-looking piece of code in the
Ruby Cookbook, and tried to modify it to fit my situation:

    open('/rhk/work/ask_notes/politics.twk') { |f|
      f.each('\x80\x81\x82\x83') { |record| p record } }

At this point, I’m stuck, and need some clues to move forward. (In
addition, I have a few less essential questions, below.)
I think the next step is, within the code block / continuation (is
either of those the right name?), to slurp the entire record into a
string, prepend the additional lines, do the substitutions, …, and
finally write a single record to the new filename.
Main Question:
Am I on the right track, or must I take some different approach to be
able to process the content of a single record at a time? I mean, I did
a little experiment (possibly a bad experiment) like this:

    rec_num = 0
    open('/rhk/work/ask_notes/politics.twk') { |f|
      f.each('\x80\x81\x82\x83') { |record| rec_num = rec_num + 1 } }
    p rec_num

It only counts to one, instead of 70 to reflect the 70 records I know
are in that particular file (and which are all printed out by the
earlier version, which has the line “{ |record| p record }”).
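
One thing that may be worth checking here, offered as a guess rather than a confirmed diagnosis: in Ruby, single-quoted strings do not interpret \x escapes, so '\x80\x81\x82\x83' is 16 literal characters rather than four bytes. If the file really contains the four raw bytes, a single-quoted separator would never match, and each would hand back the whole file as one record. The sketch below uses a made-up three-record temporary file and hypothetical helpers (count_records, with_sample_file) to show the difference.

```ruby
require 'tempfile'

# Hypothetical helper: counts the chunks IO#each yields for a separator.
def count_records(path, separator)
  count = 0
  File.open(path, 'rb') { |io| io.each(separator) { count += 1 } }
  count
end

# Hypothetical helper: builds a small stand-in file holding three
# records separated by the four raw bytes (sample data only).
def with_sample_file
  sep = "\x80\x81\x82\x83".b
  Tempfile.create('records') do |f|
    f.binmode
    f.write('one'.b + sep + 'two'.b + sep + 'three'.b)
    f.flush
    yield f.path
  end
end

with_sample_file do |path|
  # Single quotes: the separator is a literal backslash, x, 8, 0, ...
  puts count_records(path, '\x80\x81\x82\x83')
  # Double quotes: the separator is four raw bytes.
  puts count_records(path, "\x80\x81\x82\x83".b)
end
```

With this sample file the single-quoted separator yields one chunk and the double-quoted one yields three, which would be consistent with a count of one on the real file.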
Other questions (I could start a thread for each, but I’ll start this
way and split them up if I get either too much or not enough response):

- What is the right name for that construction: is it a continuation,
  a (code?) block, or something else? (Is it possible that Ruby calls
  this a code block and some other languages call it a continuation, or
  is it an example of one kind of continuation available in Ruby?)

- What’s the story on whitespace in that kind of structure? I
  experimented with trying to format it to make it (possibly) easier to
  read, something like this:

      open('/rhk/work/ask_notes/politics.twk') {
        |f| f.each('\x80\x81\x82\x83') {
          |record| p record
          <anticipated location of code to process a single record>
        }
      }

  But any whitespace (i.e., newlines) that I added just caused syntax
  errors. Is there a way to “prettyformat” that structure?

- The content of the files I have to convert is actually more like
  this:

      Record (with blank lines)
      (trailing blank line)
      Record header ('\x80\x81\x82\x83')
      Record (with blank lines)
      (trailing blank line)
      Record header ('\x80\x81\x82\x83')
      Record (with blank lines)

  The Ruby code that I copied from the Ruby Cookbook is more aimed at
  separating records that end with a record separator (instead of
  starting with a record header). I can work this way; I mean, worst
  case I modify every input file to do something like remove the first
  record header from the file and add a record header at the end of the
  file, but that’s probably not really necessary. But it seems like I’m
  not using quite the right tool. Is there a better approach that more
  exactly fits the format of my files?
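
On the formatting and file-layout questions above, here is a minimal sketch rather than a definitive answer. It assumes the header is the four raw bytes 0x80 0x81 0x82 0x83 and that a file is small enough to read in one go; HEADER, split_records, and each_record are hypothetical names. It splits on the header instead of treating it as a trailing separator, and it lays the blocks out with do ... end, keeping the block parameters on the same line as do, which is the usual layout for multi-line blocks.

```ruby
require 'tempfile'

# Assumed record header: four raw bytes (note the double quotes).
HEADER = "\x80\x81\x82\x83".b

# Hypothetical helper: splits data in which every record is preceded by
# HEADER. The empty string before the first header is dropped.
def split_records(data)
  data.b.split(HEADER).reject(&:empty?)
end

# Hypothetical helper: yields each record of the file at `path` in turn.
def each_record(path)
  File.open(path, 'rb') do |f|
    split_records(f.read).each do |record|
      # Code to process a single record would go here.
      yield record
    end
  end
end

# Demo with made-up data standing in for a real file.
Tempfile.create('demo') do |tmp|
  tmp.binmode
  tmp.write(HEADER + "first\n\n".b + HEADER + "second\n".b)
  tmp.flush
  each_record(tmp.path) do |record|
    p record
  end
end
```

Because the split happens on the header itself, the input files would not need to be edited to move a header from the start to the end. One caveat: reject(&:empty?) would also drop a genuinely empty record sitting between two adjacent headers.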
Thanks!
Randy K.