Nginx-0.8.11

imixaly4 · August 30, 2009, 6:21am

I would recommend (if you have windows) “the regex coach”. In the top
you put your regex, in the bottom your text to match, and it’ll show
you what is going on. My first guess is that nginx regex syntax is
different from perl, but I could also have made a mistake or two.

The regex I wrote, in human-speak, is:

$url =~ s#/[^/].*?([0-9]).*\.html$#/showthread.php?t=$1#i;

$url =~ s

(substitution regex against $url variable)

# #i;

using # as the delimiter since we’re using slash in our regex, case
insensitive matching

/[^/].*?

start with a slash, then one character that is not a slash, then any
further characters, but as few as possible

([0-9]).*

ok that was an error of mine, it should have read:

([0-9].*)

which is: match a group that starts with a digit 0-9 and then takes as
many further characters as it can find; the group is stored in $1.

\.html$

then the period character, then “html”, at the end of the line ($
means end of line)

#/showthread.php?t=$1#

replace what we matched in the first part with what follows:

/showthread.php?t=

and then whatever we matched earlier in ([0-9].*)

So the new regex after fixing my error would be:

$url =~ s#/[^/].*?([0-9].*)\.html$#/showthread.php?t=$1#i;

In perl anyway.

imixaly4 · August 30, 2009, 3:20am

Well… first off, any major change to software like that is going to
change the url patterns and break your links, whether you’re using
friendly looking links or showthread kind of links.

More to your question, it looks like a rewrite or redirect is your
answer.

A regex like

$url =~ s#/[^/].*?([0-9]).*\.html$#/showthread.php?t=$1#i;

would do the trick in perl.

imixaly4 · August 30, 2009, 8:07am

AMP Admin wrote:

Well, I couldn’t get that to work. I’m not too good with regex stuff.

Anyone wanna give me an assist on the following?

I need
http://www.example.com/anyone-doing-late-ridei-t19640.html

to go to
http://www.example.com/showthread.php?t=19640

server {
    server_name example.com;

    location ~ ^.*-t([0-9]+).html$ {
        fastcgi_pass   ...;
        fastcgi_param  SCRIPT_FILENAME /path/to/document/root/showthread.php?t=$1;
    }
}

Is one possibility.

If the document root changes, then you might want to use the
$document_root variable like:

fastcgi_param SCRIPT_FILENAME $document_root/showthread.php?t=$1;

but if you can put it in statically, it’ll be slightly better
performance-wise.

Note, the above regex assumes that all the articles have
-t[thread_number] at the end.

Marcus.

imixaly4 · August 30, 2009, 4:34pm

Thanks, I’ll be checking out “the regex coach” for sure!

imixaly4 · August 30, 2009, 2:54pm

Well… first off, any major change to software like that is going to
change the url patterns and break your links, whether you’re using
friendly looking links or showthread kind of links.

Actually I’d look at SEO-friendly URLs as an advantage here. Whether you
have to rewrite 1 rule to automagically use the new URL pattern or half a
million, it’s all about making it easy for your users/customers. You don’t
want to make it difficult for them because /you/ changed software. If they
go directly to your buy-now page, you want them to get there directly and
not be presented with a search page saying, ‘we changed software, please
search for the buy-now page, which has been relocated due to a change in
software (and we couldn’t be bothered to send you there automatically as
it costs us too much and you will spend your money with us anyway)’.

That is, of course:

- if you’re selling products/services which are also easily bought
  somewhere else. It’s all about the cost/advantage ratio.
- if you’re linked from elsewhere. If everybody enters your site using the
  home page, it’s not a problem either.

Regards,

Martin

imixaly4 · August 30, 2009, 4:34pm

I think something’s missing from that so I’ve tried the following but no
luck. Thoughts?

location ~ ^.*-t([0-9]+).html$ {
    fastcgi_index index.php;
    fastcgi_pass 127.0.0.1:2000;
    include fastcgi_params;
    fastcgi_intercept_errors On;
    fastcgi_ignore_client_abort On;
    fastcgi_buffer_size 128k;
    fastcgi_buffers 4 128k;
    rewrite ~ ^.*-t([0-9]+).html$ /path/to/stuff/showthread.php?t=$1 last;
}

imixaly4 · August 30, 2009, 5:03pm

AMP Admin wrote:

I think something’s missing from that so I’ve tried the following but no
luck. Thoughts?

You will need to add the fastcgi_pass setting relevant for you. The
other settings are optional.

If you’ve used

include fastcgi_params;

then you’ve probably got a setting

fastcgi_param SCRIPT_FILENAME xxx

in there. This will over-ride the one mentioned below, so it won’t
work. You need to remove it.

There’s no need to do a rewrite - setting the script filename from the
regex location is more efficient.

As an extra note, having fastcgi_index here will have no effect - the
regex ends in .html, so would never be used.

Try :

location ~ -t([0-9]+).html$ {
    fastcgi_pass 127.0.0.1:2000;
    fastcgi_param SCRIPT_FILENAME /path/to/document/root/showthread.php?t=$1;
    include fastcgi_params; # no fastcgi_param SCRIPT_FILENAME setting in this file!!!
    fastcgi_intercept_errors On;
    fastcgi_ignore_client_abort On;
    fastcgi_buffer_size 128k;
    fastcgi_buffers 4 128k;
}

Marcus.

imixaly4 · August 30, 2009, 6:03pm

AMP Admin wrote:

I ended up getting it to work with just the following line. No location or
anything.

rewrite ^.*-t([0-9]+).html$ /showthread.php?t=$1 last;

You might want to make that “last” into a “permanent”.

That way a 301 is returned and you retain search engine ranking for the
page.
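
As a sketch of that suggestion, here is the rule from the quoted post with only the flag changed; everything else is copied from the thread as-is:

```nginx
# "permanent" makes nginx answer with a 301 redirect to the new URL,
# whereas "last" rewrites the request internally without telling the client.
rewrite ^.*-t([0-9]+).html$ /showthread.php?t=$1 permanent;
```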

imixaly4 · August 30, 2009, 7:14pm

Oh great idea… thank you!

imixaly4 · August 30, 2009, 5:51pm

I ended up getting it to work with just the following line. No location
or anything.

rewrite ^.*-t([0-9]+).html$ /showthread.php?t=$1 last;

thanks everyone for your help!

imixaly4 · August 30, 2009, 10:47pm

Awesome… thanks

imixaly4 · August 30, 2009, 9:53pm

Glad it worked out.

One last thing of note is that “.” means “any single character”,
whereas in your example I think you want it to match a literal period,
which is done via “\.”. In the example above this shouldn’t cause any
problems that I can foresee, as a period is a single character that is
matched by “.”, but I thought you might want to be aware of this.
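
For reference, a sketch of the same rewrite rule with the period escaped, so that only a literal “.html” suffix matches:

```nginx
# "\." matches only a literal period; the unescaped "." would also
# accept a path such as "/foo-t19640xhtml".
rewrite ^.*-t([0-9]+)\.html$ /showthread.php?t=$1 last;
```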

imixaly4 · August 31, 2009, 4:41am

Shoot guys… I’m back.

I didn’t think about pages 2, 3, and so on. Check out the p2 in the
following. These aren’t getting caught.

guys-your-favorite-dji-t27822p2.html

Regards,

-Team AMP
http://www.ampprod.com

imixaly4 · August 31, 2009, 4:51am

What exactly is google_perftools_profiles and is it something we can benefit
from in your opinion?

Regards,

-Team AMP
http://www.ampprod.com

imixaly4 · August 31, 2009, 6:41am

What’s the intended destination URL format?

imixaly4 · August 31, 2009, 9:02am

On Sun, Aug 30, 2009 at 09:44:02PM -0500, AMP Admin wrote:

What exactly is google_perftools_profiles and is it something we can benefit
from in your opinion?

Google PerfTools is mostly an option for developers.

imixaly4 · August 31, 2009, 5:10am

AMP Admin wrote:

What exactly is google_perftools_profiles and is it something we can benefit
from in your opinion?

Have you tried Google?

http://code.google.com/p/google-perftools/wiki/GooglePerformanceTools

Now as to whether you could benefit from it, that’s for you to decide
(though I’m thinking not…).

Jim

imixaly4 · August 31, 2009, 10:24pm

yupp, that’s right. It’s a bit more complicated of a regex…

probably what you had before, but, with something like this at the end

(p([0-9]*))?.html$

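where the outer set of grouping symbols is there only so we can have the ?
at the end, to make the whole “pN” part optional. Appended to what you had
before, $1 is still the thread number, $2 (the outer group) is not needed,
and $3 is your page value.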
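
Put together with the earlier rule, this could look roughly as follows. It is an untested sketch: the &page= parameter is taken from the target URL given elsewhere in this thread, and whether showthread.php accepts an empty page value is an assumption.

```nginx
# $1 = thread number, $2 = optional "pN" suffix (unused), $3 = page number.
# /guys-your-favorite-dji-t27822p2.html -> /showthread.php?t=27822&page=2
# /anyone-doing-late-ridei-t19640.html  -> /showthread.php?t=19640&page=
rewrite ^.*-t([0-9]+)(p([0-9]*))?\.html$ /showthread.php?t=$1&page=$3 last;
```

If an empty page= turns out to be a problem, two separate rules (one for the “pN” form, listed first, and one for the plain form) would avoid it.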

imixaly4 · August 31, 2009, 10:18pm

Hi again Gabriel,

It would be

guys-your-favorite-dji-t27822p2.html → showthread.php?t=27822&page=2

I guess anytime it finds pX it should go to &page=X ?

Regards,

-Team AMP
http://www.ampprod.com

imixaly4 · August 31, 2009, 11:08pm

On Mon, Aug 31, 2009 at 03:58:28PM -0500, AMP Admin wrote:

domain was entered into the browser.
You need to change

-fastcgi_param SERVER_NAME $server_name;
+fastcgi_param SERVER_NAME $host;

in fastcgi_params.