Regexp removing {xxx }

I’m trying to treat LaTeX files - is there a trick to remove occurrences
of {\large … } from strings like this one:

{\large this is a fa\c{c}ade,} this too: {\large façade}

so that it becomes:

this is a fa\c{c}ade, this too: façade

2012/3/2 Wybo D.:

I’m trying to treat LaTeX files - is there a trick to remove occurrences
of {\large … } from strings

I assume that curly braces can be arbitrarily nested within the \large
sections. In that case the general problem can’t be solved with
regular expressions alone: you’d have to actually parse the LaTeX, or
use an exotic flavor such as .NET’s, which actually allows it.
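
To illustrate the "actually parse it" route, here is a minimal sketch in
Python (the thread doesn't name a language, so that choice is an
assumption). It just counts braces instead of parsing real LaTeX. The
function name strip_large is hypothetical, and the sketch assumes that
\large is always followed by a single space and that escaped braces
(\{ and \}) don't occur in the input.

```python
def strip_large(text):
    # Removes "{\large " wrappers and their matching "}", keeping the
    # contents, at any nesting depth. Assumption: no escaped braces.
    marker = "{\\large "
    out = []
    stack = []  # True for an open \large group, False for an ordinary "{"
    i = 0
    while i < len(text):
        if text.startswith(marker, i):
            stack.append(True)       # drop the opening "{\large "
            i += len(marker)
        elif text[i] == "{":
            stack.append(False)      # ordinary brace: keep it
            out.append("{")
            i += 1
        elif text[i] == "}":
            # Drop the brace only if it closes a \large group;
            # unbalanced closing braces are kept as they are.
            if not (stack and stack.pop()):
                out.append("}")
            i += 1
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


print(strip_large(r"{\large this is a fa\c{c}ade,} this too: {\large façade}"))
```

Because the stack remembers which kind of brace was opened, the depth of
nesting doesn't matter here.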

However, if there’s a limit on the nesting (for example, you can tell
that nowhere in the text are curlies within \large nested more than
one deep), there are solutions. Let’s start with the simplest regex:
/{\\large [^{}]+}/. This works for your second example, but fails for
the first, because of the braces in \c{c}. But we can fix that;
consider this regex: /{\\large (?:[^{}]+|{[^}]+})+}/. The
parenthesized part matches either a run of non-brace characters, or a
run of non-brace characters surrounded by one pair of braces, and the
trailing + repeats it, so it works for both of your examples. If you
want to go deeper ;), you can once again stick another alternative
into the alternative with curlies; this works to any fixed depth (I’ll
leave it as an exercise for the reader :) ). Of course, input can
always be constructed on which this method fails, so you have to be
ready either to accept that it sometimes stops working correctly, or
to somehow control your input beforehand.
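
The regex above only matches the whole {\large ...} group; to actually
remove the wrapper and keep the text, the contents need a capturing
group. A minimal sketch in Python (an assumed language choice, with the
hypothetical names LARGE and strip_large_regex), using the same
one-level pattern with the braces escaped:

```python
import re

# One level of nested braces is allowed inside {\large ...}.
# Group 1 captures the contents so they can be kept.
LARGE = re.compile(r"\{\\large ((?:[^{}]+|\{[^}]+\})+)\}")


def strip_large_regex(text):
    # Replace each whole match with just its contents (group 1).
    return LARGE.sub(r"\1", text)


print(strip_large_regex(r"{\large this is a fa\c{c}ade,} this too: {\large façade}"))
```

The same idea should carry over to other flavors: wrap the repeated part
in a capturing group and substitute the match with that group.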

When constructing regexes like this, you have to be careful to avoid
catastrophic backtracking; otherwise matching can take exponential
time in the worst case.
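
One common way to reduce that risk is to avoid a + nested directly
inside another +, so that a run of plain text can only be split up in
one way. A sketch in Python under that assumption (SAFE_LARGE and
strip_large_safe are hypothetical names). Note that it is slightly
looser than the pattern above, since it also accepts empty groups:

```python
import re

# Each loop iteration consumes exactly one non-brace character or one
# whole {...} group, so there is no "+" nested inside another "+".
SAFE_LARGE = re.compile(r"\{\\large ((?:[^{}]|\{[^{}]*\})*)\}")


def strip_large_safe(text):
    # Replace each whole match with just its contents (group 1).
    return SAFE_LARGE.sub(r"\1", text)


# An unterminated group is the kind of input where nested quantifiers
# tend to blow up; here the failed match is abandoned quickly and the
# text is returned unchanged.
unterminated = r"{\large " + "a" * 2000
print(strip_large_safe(unterminated) == unterminated)
print(strip_large_safe(r"{\large this is a fa\c{c}ade,} this too: {\large façade}"))
```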

Here’s a nice tool to try these regexps: http://regexpal.com/

– Matma R.

On 2012-03-02 15:47, Bartosz Dziewoński wrote:

Let’s start with the simplest regex:
/{\\large [^{}]+}/. This works for your second example, but fails for
the first. But we can fix that:
consider this regex: /{\\large (?:[^{}]+|{[^}]+})+}/.

Thanks! This is very useful, albeit a braintwister…
And I like the http://regexpal.com/