I have a file with the following format (example):
Save Format v3.0(19990112)
@begin Libraries
"felles.pbl" "";
@end;
@begin Objects
"n_cst_xml_utils.sru" "felles.pbl";
"n_melding.sru" "felles.pbl";
@end;
The data in the two begin/end blocks are lists, which may be longer than
shown.
I’d like to extract an array of the filenames (the first quoted string
on each line) in the @begin Objects … @end; block.
For the example above this should return
["n_cst_xml_utils.sru", "n_melding.sru"]
My initial idea was to treat the whole thing as one long string, extract
the part inside the begin/end block with a regexp, split the result back
into individual lines (split "\n"), and then use Array#map with another
regexp to single out the name in the first quote on each line.
But I keep hitting the wall, especially with the first step in this
approach… :o(
I know this should be easily done in a couple of lines of code, but I
can’t get it right.
filenames = File.read(filename).
  scan(/^@begin Objects\n(.*?)^@end;/m)[0][0].
  split("\n").map { |l| l.scan(/"(.*?)"/)[0][0] }
Probably far from optimal, but it seems to do the trick.
That’s the most important thing
I actually misread your example. If there’s only one @begin Objects
section, then ‘scan’ is overkill; a simple regexp match will do.
res = if File.read(filename) =~ /^@begin Objects$(.*?)^@end;$/m
  $1.scan(/^\s*"(.*?)"/).map { |r| r.first }
end
If the files are large, then a line-based approach is usually more
feasible. In this case you can use the flip-flop operator in an if
condition to select just the lines you want.
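A minimal sketch of what such a flip-flop selection could look like. This is an illustration, not the code originally posted in this reply; the helper name object_filenames and the inline sample text are assumptions made up for the example:

```ruby
# Hypothetical helper: collects the first quoted string of every line
# between "@begin Objects" and the next "@end;".
def object_filenames(lines)
  names = []
  lines.each do |line|
    # Flip-flop: turns true on the "@begin Objects" line and stays true
    # up to and including the next "@end;" line.
    if (line =~ /^@begin Objects/)..(line =~ /^@end;/)
      # The marker lines themselves contain no quotes, so they are skipped.
      names << $1 if line =~ /^\s*"(.*?)"/
    end
  end
  names
end

# Sample data modelled on the example file from the question.
sample = <<~TEXT
  Save Format v3.0(19990112)
  @begin Libraries
  "felles.pbl" "";
  @end;
  @begin Objects
  "n_cst_xml_utils.sru" "felles.pbl";
  "n_melding.sru" "felles.pbl";
  @end;
TEXT

p object_filenames(sample.each_line)
```

For a real file, passing File.foreach(filename) instead of sample.each_line should read the input one line at a time rather than loading it all into memory.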
Robert:
The use of the flip-flop operator opened a new door for me. I didn’t
know about this before…
And new knowledge is the best knowledge! :o)
w_a_x_man:
I can’t believe I didn’t think of the possibility to use a simple split
instead of a scan to extract the filenames between the first two
quotation marks!
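For reference, the split idea can be sketched roughly like this. The helper name first_quoted and the sample lines are assumptions for illustration, not the code from that reply:

```ruby
# Hypothetical helper: returns the text between the first two quotation
# marks on a line, or nil if the line contains no quoted string.
def first_quoted(line)
  # Splitting on '"' leaves whatever precedes the first quote at index 0
  # and the first quoted string at index 1.
  line.split('"')[1]
end

# Lines as they appear inside the @begin Objects block of the example.
block_lines = [
  '"n_cst_xml_utils.sru" "felles.pbl";',
  '"n_melding.sru" "felles.pbl";'
]

p block_lines.map { |l| first_quoted(l) }
```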
Thanks to all for the great input I’ve gotten on this issue!
:o)