I am trying to write a simple script to parse an Apache log file, but it is taking an extremely long time. I ran it under profile, and the time appears to be going into the regular expression matcher.

To check, I made a simple script that runs the match on strings of different lengths, and the regular expression matcher does appear to be very slow. See the attached script.
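
For readers without the attachment, here is a minimal sketch of how a timing run over several lengths could be set up. It is not the attached mklong.rb: the helper name `time_matches`, the pattern `/\d+x/` and the all-digits input are hypothetical stand-ins chosen only so the sketch runs. Whether a given pattern shows superlinear growth depends on the pattern, the input, and the Ruby version.

```ruby
require 'benchmark'

# Time one match attempt of `pattern` against an input of each length.
# The block maps a length to the test string to match against.
# Returns an array of [length, seconds] pairs.
def time_matches(pattern, lengths)
  lengths.map do |n|
    input = yield(n)
    secs = Benchmark.realtime { pattern.match(input) }
    [n, secs]
  end
end

# Hypothetical pattern and input, NOT the ones from the original script.
pattern = /\d+x/
results = time_matches(pattern, [1000, 2000, 3000, 4000]) { |n| "1" * n }

results.each do |n, secs|
  puts format("%5d  %.6fs", n, secs)
end
```

Swapping in the real pattern and a representative log line would show whether the slowdown comes from the pattern itself or from the rest of the script.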

Here are some timings:

~> time ./mklong.rb 1000
real 0m11.930s
~> time ./mklong.rb 2000
real 0m50.400s
~> time ./mklong.rb 3000
real 1m55.693s
~> time ./mklong.rb 4000
real 3m16.004s

So, dividing the length by the elapsed time in seconds (roughly rounded):

1000/11 = 91
2000/50 = 40
3000/120 = 25
4000/200 = 20

So the matching time appears to grow much faster than O(n): doubling the length from 1000 to 2000 roughly quadruples the time. Isn’t the whole point of regular expressions to be fast and O(n)?
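
The same arithmetic can be redone from the unrounded timings. The sketch below uses only the four measurements quoted above (converted to seconds) and compares the observed slowdown with what a quadratic algorithm would predict. The variable names are illustrative, and the quadratic comparison is a reading of the numbers, not a statement about how the matcher is implemented.

```ruby
# Timings copied from the runs above: input length => wall-clock seconds.
timings = {
  1000 => 11.930,
  2000 => 50.400,
  3000 => 115.693, # 1m55.693s
  4000 => 196.004  # 3m16.004s
}

base_n, base_t = timings.first

rows = timings.map do |n, secs|
  {
    n: n,
    rate: n / secs,                  # units of length processed per second
    time_ratio: secs / base_t,       # observed slowdown relative to n = 1000
    quadratic: (n.to_f / base_n)**2  # slowdown a quadratic algorithm would predict
  }
end

rows.each do |r|
  puts format("%5d  %6.1f/s  time x%5.2f  (n^2 predicts x%5.2f)",
              r[:n], r[:rate], r[:time_ratio], r[:quadratic])
end
```

If the observed ratios track the n^2 column closely, the run time is growing roughly quadratically in the input length rather than linearly.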

Whenever my script encounters a long string, it grinds to a halt. Why is this?

Did I write the regular expression correctly? Is there some way to optimize it? Or is there a problem with the matcher?

Any help would be appreciated.
