I have a regexp, /(^BillHead(.*))(^Bill_End(.*))/m, that's too greedy for
processing a billing extract file containing:
BillHead…<<>\n
<<one or more detail lines here\n>>
Bill_End…<<>\n
BillHead…<<>\n
<<one or more detail lines here\n>>
Bill_End…<<>\n
…etc… to EOF…
…I get the whole file matched, but I just want each invoice.
It will eventually be in a one-liner like
a = File.read("billfile").scan(regexp)
So what is the non-greedy way for the above regexp to properly match
each invoice?
On 11/9/06, Dave R. wrote:
…I get the whole file matched, but I just want each invoice.
It will eventually be in a one-liner like
a = File.read("billfile").scan(regexp)
So what is the non-greedy way for the above regexp to properly match
each invoice?
try:
/(^BillHead(.*?))(^Bill_End(.*?))\n/m
or
/(^BillHead(.*?))(^Bill_End([^\n]*.))\n/m
Notice the .*? instead of .*
.*? has some peculiarities that were discussed here some time ago, so
perhaps you'd want to find them in the archives (search for 'greedy'
or 'regex'; I don't remember now).
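To make the greedy versus non-greedy difference concrete, here is a minimal sketch. The sample text is made up (the invoice numbers and detail lines are hypothetical stand-ins for the real extract), and it relies on the fact that in Ruby the /m flag lets . match newlines, which is why a greedy .* can run across invoice boundaries.

```ruby
# Hypothetical two-invoice sample standing in for the real billing extract.
sample = [
  "BillHead 0001",
  "detail line a",
  "detail line b",
  "Bill_End 0001",
  "BillHead 0002",
  "detail line c",
  "Bill_End 0002"
].join("\n") + "\n"

# Greedy: the first .* runs to the end of the text, then backs up
# only as far as the LAST Bill_End it can find.
greedy = /(^BillHead(.*))(^Bill_End(.*))/m

# Non-greedy: .*? stops at the FIRST Bill_End after each BillHead.
non_greedy = /(^BillHead(.*?))(^Bill_End(.*?))\n/m

greedy_matches     = sample.scan(greedy)
non_greedy_matches = sample.scan(non_greedy)

puts "greedy matches:     #{greedy_matches.size}"
puts "non-greedy matches: #{non_greedy_matches.size}"
```

With this sample the greedy pattern yields a single match spanning both invoices, while the non-greedy one yields one match per invoice.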
Jan S. wrote:
…etc… to EOF…
/(^BillHead(.*?))(^Bill_End(.*?))\n/m
or
/(^BillHead(.*?))(^Bill_End([^\n]*.))\n/m
Notice the .*? instead of .*
.*? has some peculiarities that were discussed here some time ago, so
perhaps you'd want to find them in the archives (search for 'greedy'
or 'regex'; I don't remember now).
I would also remove the last .* because that likely eats up the rest of
the document. So that would be
/^BillHead(.*?)^BillEnd/m
Another approach is to do
s.split(/^(Bill(?:Head|End))/m)
and then go through the array.
irb(main):006:0> "BillHead\nfoo\nbar\nBillEnd".split(/^(Bill(?:Head|End))/m)
=> ["", "BillHead", "\nfoo\nbar\n", "BillEnd"]
Kind regards
robert
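Both suggestions above can be sketched roughly as follows. This is only an illustration: the sample text is hypothetical, the end marker is spelled Bill_End as in the original question (the examples above use BillEnd), and the loop assumes a well-formed file in which every BillHead is followed by its own Bill_End.

```ruby
# Hypothetical two-invoice sample standing in for the real billing extract.
sample = [
  "BillHead 0001",
  "detail line a",
  "detail line b",
  "Bill_End 0001",
  "BillHead 0002",
  "detail line c",
  "Bill_End 0002"
].join("\n") + "\n"

# Approach 1: a single non-greedy capture between the two markers.
bodies = sample.scan(/^BillHead(.*?)^Bill_End/m).map(&:first)

# Approach 2: split on the markers. Because the marker sits in a capture
# group, split keeps it in the result, alternating marker and text.
parts = sample.split(/^(Bill(?:Head|_End))/m)

invoices = []
# parts[0] is the (empty) text before the first marker, so skip it.
parts.drop(1).each_slice(2) do |marker, text|
  if marker == "BillHead"
    invoices << marker + text.to_s       # start a new invoice
  else
    invoices.last << marker + text.to_s  # Bill_End closes the current one
  end
end

puts bodies.size
puts invoices.size
```

The split version keeps the marker lines in each invoice, while the scan version returns only the text between the markers.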
Robert K. wrote:
=> ["", "BillHead", "\nfoo\nbar\n", "BillEnd"]
Kind regards
robert
I played around in irb with a shortened extract file and found that:
b = File.read("drbilp.txt").scan(/(^BillHead(.*?))(^Bill_End(\d)(\sUBPBILP1\n)(.*?))/m)
works, in that it separates each invoice into a sub-array of size 6
in which b[x][0] + b[x][2] completes the task of reading and scanning
correctly
and putting it all in a Ruby 'container' that I can do an each on. Thanks,
dave
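For anyone reading later, the read, scan, and each flow can be sketched like this. It is not the exact pattern used above: the file contents are made up, a temp file stands in for drbilp.txt, and the simpler non-greedy pattern from earlier in the thread replaces the site-specific UBPBILP1 one, so each sub-array has 4 elements rather than 6.

```ruby
require "tempfile"

# Hypothetical two-invoice extract, written to a temp file so that
# File.read has something to load.
file = Tempfile.new("billfile")
file.write("BillHead 0001\ndetail a\nBill_End 0001\n" \
           "BillHead 0002\ndetail b\nBill_End 0002\n")
file.close

# scan with capture groups returns one sub-array of captures per match.
b = File.read(file.path).scan(/(^BillHead(.*?))(^Bill_End(.*?))\n/m)

# Element 0 is the head block (BillHead line plus details),
# element 2 is the Bill_End line; together they form one invoice.
invoices = b.map { |groups| groups[0] + groups[2] }

invoices.each do |invoice|
  puts invoice
  puts "--"
end

file.unlink
```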