On Wed, Dec 11, 2013 at 10:58 AM, Xavier N. wrote:
> The regexp says: if you match either " or ', then continue matching as long
> as you do not find the matched quote, and until you find the closing quote
> (needed because you could reach end of file with an unbalanced quote).
> The second group has the string without quotes.
Interesting solution! I also tried
("|')([^\1]*)\1
which looked fine initially
irb(main):025:0> "foo 'bar' \"baz\" buz".scan(/("|')([^\1]*)\1/).map(&:last)
=> ["bar", "baz"]
but broke later:
irb(main):030:0> "foo 'bar' \"baz\" buz \"bongo's kongo\"".scan(/("|')([^\1]*)\1/)
=> [["'", "bar' \"baz\" buz \"bongo"]]
where your solution still works:
irb(main):031:0> "foo 'bar' \"baz\" buz \"bongo's kongo\"".scan(/("|')((?:(?!\1).)*)\1/)
=> [["'", "bar"], ["\"", "baz"], ["\"", "bongo's kongo"]]
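The character class attempt fails because a backreference has no meaning inside brackets: a class is a fixed set of characters, and as far as I know Ruby's engine reads \1 there as an octal escape rather than as "whatever group 1 matched". So [^\1]* also consumes quotes, and only backtracking stops it at the last matching quote. The lookahead (?!\1) is checked at match time instead. A small sketch of both, with the working pattern spelled out in extended mode (the constant name QUOTED is only for illustration):

```ruby
# Inside brackets \1 is not a backreference, so the class does not exclude
# the opening quote and the greedy * runs on to the last matching quote.
p "'a' 'b'".scan(/("|')([^\1]*)\1/)   # => [["'", "a' 'b"]]

QUOTED = /
  ("|')        # group 1: the opening quote, either " or '
  (            # group 2: the text between the quotes
    (?:
      (?!\1)   # the next character must not be the opening quote...
      .        # ...and only then is it consumed
    )*
  )
  \1           # the closing quote, the same character as group 1
/x

text = %q{foo 'bar' "baz" buz "bongo's kongo"}
p text.scan(QUOTED).map(&:last)       # => ["bar", "baz", "bongo's kongo"]
```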
However, we can also use a non-greedy quantifier to achieve the same:
irb(main):032:0> "foo 'bar' \"baz\" buz \"bongo's kongo\"".scan(/("|')(.*?)\1/)
=> [["'", "bar"], ["\"", "baz"], ["\"", "bongo's kongo"]]
irb(main):033:0> "foo 'bar' \"baz\" buz \"bongo's kongo\"".scan(/("|')(.*?)\1/).map(&:last)
=> ["bar", "baz", "bongo's kongo"]
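For what it is worth, here is a quick check that the two patterns agree on the sample text. It also shows one caveat that applies to both: without the /m flag the dot does not match a newline, so a quoted string spanning lines is skipped. The variable names are made up for this sketch.

```ruby
text = %q{foo 'bar' "baz" buz "bongo's kongo"}

lookahead  = /("|')((?:(?!\1).)*)\1/
non_greedy = /("|')(.*?)\1/

# Both patterns return the same pairs for the sample text.
p text.scan(lookahead) == text.scan(non_greedy)   # => true

# A quoted string that contains a newline.
multiline = "say \"two\nlines\" please"
p multiline.scan(non_greedy).map(&:last)          # => []
p multiline.scan(/("|')(.*?)\1/m).map(&:last)     # => ["two\nlines"]
```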
Adding some escaping capabilities we get ("|')((?:\\.|(?!\1).)*)\1
irb(main):038:0> "foo 'bar' \"baz\" buz \"bongo's kongo\" gingo said \"foo \\\" bar\" yes".scan(/("|')((?:\\.|(?!\1).)*)\1/).map(&:last)
=> ["bar", "baz", "bongo's kongo", "foo \\\" bar"]
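Note that the captured text still contains the backslash. Below is a rough sketch of how the escape-aware pattern could be wrapped in a helper that also strips the backslashes afterwards. The unescaping step and the names ESCAPED_QUOTED and quoted_strings are my own additions for illustration, not something from the pattern above.

```ruby
ESCAPED_QUOTED = /
  ("|')          # group 1: the opening quote
  (              # group 2: the contents
    (?:
      \\.        # a backslash plus the character it escapes...
      |
      (?!\1).    # ...or any character that is not the opening quote
    )*
  )
  \1             # the closing quote
/x

# Returns the contents of every quoted string, with escapes resolved.
def quoted_strings(text)
  text.scan(ESCAPED_QUOTED).map do |_quote, body|
    body.gsub(/\\(.)/) { $1 }   # drop the backslash, keep the escaped character
  end
end

text = %q{gingo said "foo \" bar" yes}
puts quoted_strings(text)   # prints: foo " bar
```

Trying the escape alternative first matters: otherwise the lookahead branch would consume the backslash on its own and the escaped quote would end the match.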

Kind regards
robert