A tighter match with regex?

dubstep · June 30, 2011, 7:43am

My regex pattern is too aggressive:

/name="image002.jpg".+Content-Id-: [email protected]/im

I’m trying to isolate a block of text from within a string. I would like
to get this:

name="image002.jpg"
Content-Transfer-Encoding: Base64
Content-Id-: [email protected]

Source string:

-begin

Content-Type: image/jpeg; name="image001.jpg"
Content-Transfer-Encoding: Base64
Content-Id-: [email protected]

Content-Type: image/jpeg; name="image002.jpg"
Content-Transfer-Encoding: Base64
Content-Id-: [email protected]

Content-Type: image/jpeg; name="image001.jpg"
Content-Transfer-Encoding: Base64
Content-Id-: [email protected]

Content-Type: image/jpeg; name="image002.jpg"
Content-Transfer-Encoding: Base64
Content-Id-: [email protected]

Content-Type: image/jpeg; name="image006.jpg"
Content-Transfer-Encoding: Base64
Content-Id-: [email protected]

-end

Any help would be greatly appreciated!

gflamino · June 30, 2011, 7:06pm

Gil Flamino wrote in post #1008364:

Any help would be greatly appreciated!

# gets('') reads in paragraph mode: one blank-line-separated block per call.
while para = DATA.gets('')
  # x: whitespace in the pattern is ignored; m: "." also matches newlines.
  if md = para.match(/
      name="image002.jpg"
      .*?
      666666>
    /xm)
    puts md[0]
  end
end

__END__
Content-Type: image/jpeg; name="image001.jpg"
Content-Transfer-Encoding: Base64
Content-Id-: [email protected]

Content-Type: image/jpeg; name="image002.jpg"
Content-Transfer-Encoding: Base64
Content-Id-: [email protected]

Content-Type: image/jpeg; name="image001.jpg"
Content-Transfer-Encoding: Base64
Content-Id-: [email protected]

Content-Type: image/jpeg; name="image002.jpg"
Content-Transfer-Encoding: Base64
Content-Id-: [email protected]

Content-Type: image/jpeg; name="image006.jpg"
Content-Transfer-Encoding: Base64
Content-Id-: [email protected]

--output:--
name="image002.jpg"
Content-Transfer-Encoding: Base64
Content-Id-: [email protected]
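For comparison, here is a small sketch of why the original pattern grabs too much and why a lazy quantifier does not. The Content-Id values and the constant names are made up for illustration, since the real IDs are not shown in this thread. It also assumes Ruby 2.3 or later for the `<<~` heredoc.

```ruby
# Hypothetical sample: the Content-Id values are invented for this sketch.
SAMPLE = <<~PARTS
  Content-Type: image/jpeg; name="image001.jpg"
  Content-Transfer-Encoding: Base64
  Content-Id-: <id-one@example.invalid>

  Content-Type: image/jpeg; name="image002.jpg"
  Content-Transfer-Encoding: Base64
  Content-Id-: <id-two@example.invalid>

  Content-Type: image/jpeg; name="image002.jpg"
  Content-Transfer-Encoding: Base64
  Content-Id-: <id-two@example.invalid>
PARTS

# With /m, "." also matches newlines, so a greedy ".+" runs on to the
# LAST Content-Id that satisfies the rest of the pattern.
GREEDY = /name="image002\.jpg".+Content-Id-: <id-two@example\.invalid>/im

# ".+?" is lazy: it stops at the FIRST place the rest of the pattern fits.
LAZY = /name="image002\.jpg".+?Content-Id-: <id-two@example\.invalid>/im

GREEDY_MATCH = SAMPLE[GREEDY]
LAZY_MATCH   = SAMPLE[LAZY]

puts GREEDY_MATCH.lines.count  # 7: spills into the following block
puts LAZY_MATCH.lines.count    # 3: just the block that was wanted
```

Reading one paragraph at a time, as in the code above, gives the same protection in a different way: the pattern never sees more than one block, so it cannot run past it.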

gflamino · June 30, 2011, 7:12pm

Or, if it’s more convenient you can use StringIO the same way:

require 'stringio'

str = <<END_OF_DATA
Content-Type: image/jpeg; name="image001.jpg"
Content-Transfer-Encoding: Base64
Content-Id-: [email protected]

Content-Type: image/jpeg; name="image002.jpg"
Content-Transfer-Encoding: Base64
Content-Id-: [email protected]

Content-Type: image/jpeg; name="image001.jpg"
Content-Transfer-Encoding: Base64
Content-Id-: [email protected]

Content-Type: image/jpeg; name="image002.jpg"
Content-Transfer-Encoding: Base64
Content-Id-: [email protected]

Content-Type: image/jpeg; name="image006.jpg"
Content-Transfer-Encoding: Base64
Content-Id-: [email protected]
END_OF_DATA

f = StringIO.new(str)

while para = f.gets('')
…
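The loop body is left out above. One way it might be filled in, assuming the same matching logic as the DATA version, is sketched here. The helper name `matching_blocks`, the constants, and the Content-Id values are all hypothetical, and `<<~` needs Ruby 2.3 or later.

```ruby
require 'stringio'

# Hypothetical sample: the Content-Id values are invented for this sketch.
MAIL_TEXT = <<~PARTS
  Content-Type: image/jpeg; name="image001.jpg"
  Content-Transfer-Encoding: Base64
  Content-Id-: <id-one@example.invalid>

  Content-Type: image/jpeg; name="image002.jpg"
  Content-Transfer-Encoding: Base64
  Content-Id-: <id-two@example.invalid>

  Content-Type: image/jpeg; name="image006.jpg"
  Content-Transfer-Encoding: Base64
  Content-Id-: <id-six@example.invalid>
PARTS

# Hypothetical helper: collects the matched text from each paragraph.
def matching_blocks(io, pattern)
  blocks = []
  # gets('') is paragraph mode: each call returns one chunk up to the
  # next run of blank lines, or nil at end of input.
  while (para = io.gets(''))
    md = para.match(pattern)
    blocks << md[0] if md
  end
  blocks
end

# /m lets "." cross line breaks inside a single paragraph.
WANTED = /name="image002\.jpg".*?<id-two@example\.invalid>/m

BLOCKS = matching_blocks(StringIO.new(MAIL_TEXT), WANTED)
puts BLOCKS
```

Because the helper only needs an object that responds to `gets`, the same method should work with a File or with DATA in place of the StringIO.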

gflamino · July 4, 2011, 6:58am

Thank you Brian. Works great!

gflamino · July 1, 2011, 2:27pm

Gil Flamino wrote in post #1008364:

Rubular.com permalink: Rubular: name=\"image002.jpg\".+Content-Id-: <[email protected]>

Any help would be greatly appreciated!

If you know that the Content-Id is always two lines after the name:

/name="image002.jpg".*\n.*\nContent-Id-: [email protected]/i

This may be reasonable, given that you’re happy to hard-code that the
Content-Type comes before the Content-Id.
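A sketch of the fixed-layout approach, stepping over exactly two line breaks with `.*\n.*\n`. The Content-Id values and constant names are invented for illustration, and `<<~` assumes Ruby 2.3 or later.

```ruby
# Hypothetical sample: the Content-Id values are invented for this sketch.
PART = <<~PARTS
  Content-Type: image/jpeg; name="image001.jpg"
  Content-Transfer-Encoding: Base64
  Content-Id-: <id-one@example.invalid>

  Content-Type: image/jpeg; name="image002.jpg"
  Content-Transfer-Encoding: Base64
  Content-Id-: <id-two@example.invalid>
PARTS

# No /m flag here, so ".*" stays on one line and each "\n" consumes
# exactly one line break: name line, one line in between, then Content-Id.
TWO_LINES_DOWN = /name="image002\.jpg".*\n.*\nContent-Id-: <id-two@example\.invalid>/i

FIXED_MATCH = PART[TWO_LINES_DOWN]
puts FIXED_MATCH

# Same fields in a different order: the fixed-layout pattern finds nothing.
REORDERED = <<~PARTS
  Content-Id-: <id-two@example.invalid>
  Content-Type: image/jpeg; name="image002.jpg"
  Content-Transfer-Encoding: Base64
PARTS

puts REORDERED[TWO_LINES_DOWN].inspect  # nil
```

The second half shows the trade-off: the pattern is tight, but it silently fails as soon as the header order changes.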

Otherwise, just do it in chunks:

res = str.split(/\n\n/).find { |x| x =~ /whatever/ }

This also allows you to handle cases where the fields are not in a fixed
order:

res = str.split(/\n\n/).find { |x|
  x =~ /name="image002.jpg"/i &&
  x =~ /Content-Id-: [email protected]/i
}
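Here is a runnable sketch of the chunked approach with the fields deliberately out of order. `find_part` is a hypothetical helper, the Content-Id values are invented, and `<<~` assumes Ruby 2.3 or later.

```ruby
# Hypothetical sample: the Content-Id values are invented, and the second
# block lists its fields in a different order on purpose.
UNORDERED = <<~PARTS
  Content-Type: image/jpeg; name="image001.jpg"
  Content-Transfer-Encoding: Base64
  Content-Id-: <id-one@example.invalid>

  Content-Id-: <id-two@example.invalid>
  Content-Transfer-Encoding: Base64
  Content-Type: image/jpeg; name="image002.jpg"
PARTS

# Hypothetical helper: returns the first chunk that contains both the
# given name and the given Content-Id, or nil if no chunk has both.
def find_part(text, name, content_id)
  # Regexp.escape keeps "." and other metacharacters literal.
  name_re = /name="#{Regexp.escape(name)}"/i
  id_re   = /Content-Id-: #{Regexp.escape(content_id)}/i
  # Split on one or more blank lines, then test each chunk on its own.
  text.split(/\n{2,}/).find { |chunk| chunk =~ name_re && chunk =~ id_re }
end

PART_FOUND = find_part(UNORDERED, 'image002.jpg', '<id-two@example.invalid>')
puts PART_FOUND
```

Since each condition is checked separately against a single chunk, neither greediness nor field order can cause a match to leak across blocks.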