Regexp.escape with un-escapes

Hi–

I want to translate a string into a regular expression, but I want to
“un-escape” portions as raw regexp. For example:

"here is a setting 'a' equal to ((\d+))"

So I want to Regexp.escape the string, before I pass it to Regexp.new,
but I want what’s in the (( )) to stay exactly the same with the
double-parens removed.

I know there has to be a fairly concise way to do this, but all I’ve
come up with is some very ugly brute-force code that iterates back and
forth using index('((') and index('))').

Any suggestions?

In reply to Thomas S.:

Why, yes! Use a regexp to find the (( )) bits, and then extract them,
don’t escape them, and paste the whole regexp together. It’s a bit
ironic to me that in a regexp-handling routine, you’re doing brute-force
index searches. :)
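A minimal sketch of that idea, for illustration only: find each (( )) chunk with a regexp, escape the literal text before it, keep the chunk itself verbatim, and join the pieces. The method name `template_to_regexp` is hypothetical and not from this thread, and the sketch assumes (( )) chunks are not nested.

```ruby
# Hypothetical helper (name is an assumption, not from the thread).
# Assumes ((...)) chunks are not nested.
def template_to_regexp(template)
  parts = []
  rest = template
  # Repeatedly peel "literal text, then ((raw regexp))" off the front.
  while (m = rest.match(/\(\((.*?)\)\)/))
    parts << Regexp.escape(m.pre_match) # literal text: escape it
    parts << m[1]                       # raw regexp: keep it verbatim
    rest = m.post_match
  end
  parts << Regexp.escape(rest)          # trailing literal text, if any
  Regexp.new(parts.join)
end

re = template_to_regexp("here is a setting 'a' equal to ((\\d+))")
p re =~ "here is a setting 'a' equal to 42"   # => 0
```

Because nothing is substituted back into the string afterwards, no placeholder is needed and the literal text can contain anything.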

Best,
--
Marnen Laibow-Koser
http://www.marnen.org

In reply to Marnen Laibow-Koser's post of Dec 6, 10:33 am:

You are right, it is ironic! Your suggestion helped some. I came up
with:

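def __when_string_to_regexp(str)
  rexps = []
  # Pull out each ((...)) chunk, leaving a "%s" placeholder behind.
  str = str.gsub(/\(\((.*?)\)\)/) do |m|
    rexps << '(' + $1 + ')'
    "%s"
  end
  str = Regexp.escape(str)
  # Put the raw regexp chunks back in place of the placeholders.
  str = str % rexps
  # Let any run of (escaped) spaces match any whitespace.
  str = str.gsub(/(\\\ )+/, '\s+')
  Regexp.new(str, Regexp::IGNORECASE)
end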

It’s still not perfect b/c it means end users won’t be able to use
"%s" in a string without it messing up. Stitching it back together
seems to be the hard part. But I’ll keep working on it.
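One possible way around the placeholder, sketched here as an assumption rather than a tested fix: do the whole thing in a single gsub whose pattern matches either a (( )) chunk or a run of literal text, so nothing has to be stitched back in. The name `when_string_to_regexp` below is just the method above without the leading underscores; the whitespace handling and IGNORECASE flag are carried over from it.

```ruby
# Sketch only: one pass, no "%s" placeholder.
def when_string_to_regexp(str)
  # Alternative 1 matches a ((raw)) chunk; alternative 2 matches literal
  # text up to the next "((" or the end of the string.
  source = str.gsub(/\(\((.*?)\)\)|(.+?)(?=\(\(|\z)/m) do
    raw, literal = $1, $2
    if raw
      "(#{raw})" # keep the raw regexp, wrapped in one capture group
    else
      # Escape literal text, then let runs of spaces match any whitespace.
      Regexp.escape(literal).gsub(/(?:\\ )+/) { '\s+' }
    end
  end
  Regexp.new(source, Regexp::IGNORECASE)
end

re = when_string_to_regexp("100%s sure   a equals ((\\d+))")
p re.source
```

With this shape a literal "%s" in the input is just ordinary text, and the space-to-`\s+` rewrite only touches the escaped literal parts, not the raw regexp chunks.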

Thanks.

There is a feature in String#split (documented in 1.9, undocumented in
1.8.6 but still present) whereby if the split RE contains capture
groups, those capture groups will be included in the resulting array.

>> s = 'here is (a) setting \'a\' equal to ((\d+))'
=> "here is (a) setting 'a' equal to ((\\d+))"

>> Regexp.new(s.split(/(\(\(.*?\)\))/).map { |x| x =~ /\A\((\(.*\))\)\z/ ? $1 : Regexp.escape(x) }.join)
=> /here\ is\ \(a\)\ setting\ 'a'\ equal\ to\ (\d+)/
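To see the split behaviour on its own, here is a small illustration (the sample strings are made up for the demo): without a capture group the separators are dropped, with one they are kept as their own array elements.

```ruby
# No capture group: separators are discarded.
p "a,b;c".split(/[,;]/)     # => ["a", "b", "c"]

# Capture group: each matched separator is included in the result.
p "a,b;c".split(/([,;])/)   # => ["a", ",", "b", ";", "c"]

# Same idea with a ((...)) chunk as the "separator".
p "x ((\\d+)) y".split(/(\(\(.*?\)\))/)
```

That is what lets the one-liner above tell the (( )) chunks apart from the literal text in a single pass.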

In reply to Brian C.'s post of Dec 6, 12:45 pm:

Weird, but cool. Does the trick. Thanks.