Problems with fastcgi php migration


#1

Thought I’d rename this thread since the problem has morphed beyond the
original thread “Can fastcgi_index be used with multiple filenames?”.
I’ll reiterate some things first and then pass on a few things I’ve
discovered this afternoon that are new roadblocks.

Again, the site has been humming along for months as a static serving
nginx frontend passing everything else back to Apache handling PHP.

In our apache setup we handle .php, .shtml and a few extensionless files
as PHP. Examples of extensionless filenames include galleries, polls,
reviews, etc.

When I speak of the extensionless files, I mean that:

/galleries/1/123 is actually a PHP script file named galleries that
takes the PATH_INFO (we use cgi.fix_pathinfo = 1) of /1/123 and works
with that to create a page. Other examples would be /reviews/1,
/polls/12…but, see #3 below.

When I tried to take Apache out of the equation I changed the test.conf
to:

location / {
    root /usr/local/apache/htdocs;
    index index.shtml index.php;
    include /usr/local/nginx/conf/fastcgi.conf;
    fastcgi_pass 127.0.0.1:10004;
}

location ~* ^.+\.(jpg|…static file list…)$ {
    root /usr/local/apache/htdocs;
    error_page 403 /dhe403.shtml;
    expires 30d;
    valid_referers none blocked *.example.com example.com;
    if ($invalid_referer) {
        return 403;
    }
}

The idea was that anything that wasn’t a static file was being passed on
to be handled by fastcgi.

When I first posted about this early Saturday morning, I was having no
problems having fastcgi run /galleries/1/123, but it wouldn’t try to
locate either index.shtml or index.php in real directory situations like
example.com/ or example.com/webmail. Requesting
example.com/index.shtml or example.com/webmail/index.php explicitly
would run through fastcgi, but leaving the filename off wouldn’t; I’d
get “No input file specified.”

Igor suggested adding:

location \.(php|shtml)$ {
    fastcgi_pass …
    # fastcgi_index is not needed here at all
}

after the location / { } block, but that didn’t work. We then tried:

location ~ \.(php|shtml)$ {

But that didn’t work either. Still “No input file specified” when it
should be looking for index.shtml or index.php.

A few other things I just noticed:

  1. It looks like some of the extensionless files need:
    $_SERVER['PATH_INFO'] = substr($_SERVER['REQUEST_URI'],
    strlen($_SERVER['SCRIPT_NAME'])); to work with PATH_INFO properly.

  2. I have a custom 404 set up in the server area:

error_page 404 /dhe404.shtml;

It’s a PHP file as well. When I tried to get a nonexistent page, it too
came back with “No input file specified”.

  3. On /testgalleries/123/444 we get:
    _SERVER["PATH_INFO"] /123/444

but on /testgalleries by itself we get:
_SERVER["PATH_INFO"] /testgalleries

Obviously this is a bit of a problem as it should be empty.

Thanks for any advice. Can’t wait until I move the whole site over to
nginx and drop Apache.


#2

Ian,
Not to start a sub-thread or divert the course of the conversation,
but…

It strikes me that you could potentially save yourself a lot of
headaches involving stripping out path information to generate a
filename by having the things like galleries, polls, etc actually be
proxied back to a Mongrel cluster running a small Rails app. URLs like
the ones you cited are handled near-natively by the Rails URL parsing
mechanism. Just judging from the names, it doesn’t sound like it would
be a horribly horrendous conversion process either.

Alternatively, a few rewrite directives might be in order. I’m by no
means an expert and don’t want to steer you the wrong way by giving
flawed examples.

Just a thought.

Philip Ratzsch
Software Engineer Developer I
Information Systems, Rackspace

Opinions expressed are mine and do not necessarily reflect those of my
employer.

Confidentiality Notice: This e-mail message (including any attached or
embedded documents) is intended for the exclusive and confidential use
of the individual or entity to which this message is addressed, and
unless otherwise expressly indicated, is confidential and privileged
information of Rackspace. Any dissemination, distribution or copying of
the enclosed material is prohibited. If you receive this transmission in
error, please notify us immediately by e-mail at
removed_email_address@domain.invalid, and delete the original message.
Your cooperation is appreciated.


#3

Philip Ratzsch wrote:

It strikes me that you could potentially save yourself a lot of headaches involving stripping out path information to generate a filename by having the things like galleries, polls, etc actually be proxied back to a Mongrel cluster running a small Rails app. URLs like the ones you cited are handled near-natively by the Rails URL parsing mechanism. Just judging from the names, it doesn’t sound like it would be a horribly horrendous conversion process either.

That’s one idea, but right now I don’t want to add a new layer in.

Alternatively, a few rewrite directives might be in order. I’m by no means an expert and don’t want to steer you the wrong way by giving flawed examples.

You mean something along the lines of a location specifically for the
extensionless filenames (galleries|polls|reviews|etc) with
fastcgi_params just for them? And another location for php|shtml?

[Note how I don’t even try to write the regex’s…I truly, truly suck at
understanding them, and as Blanche DuBois said in “A Streetcar Named
Desire”, “I rely on the kindness of strangers.”]


#4

Ian & all,

That’s one idea, but right now I don’t want to add a new layer in.

Fair enough. If at some point you decide to go that route, know that
it’s amazingly easy to configure.

You mean something along the lines of a location specifically for the
extensionless filenames (galleries|polls|reviews|etc) with
fastcgi_params just for them? And another location for php|shtml?
[Note how I don’t even try to write the regex’s…(clip)]

I’ve never done anything with FCGI, so rather than risk using the wrong
terminology, I’ll just try my hand at an example.

[Caveat - in my limited experience, the rewrite engine in Nginx uses a
very similar syntax to Apache’s mod_rewrite module, so I’m hoping this
is close enough that someone with more knowledge than I can correct any
mistakes. This does not take proxying into account, and simply serves
as a logic example. These are not the droids you’re looking for.]

If for example, your ‘gallery’ section took two parameters, a
three-digit number and a two-digit number, the logic would be:

A line that matches ^/gallery/(\d{3})/(\d{2})$
(Start matching when you see ‘/gallery/xxx/yy’ and then stop)

…would be proxied to
http://your_upstream_server/file_that_handles_gallery_requests?three-digit-number=$1&two-digit-number=$2

The same thing could be done for polls, again for reviews, et al.

In the above regex, notice that the \d{3} and \d{2} which match three
digits and two digits respectively, are wrapped in parentheses. The
parentheses allow you to use what are called back references ($1 and $2
in this case), which allow you to grab the chunks of the URL that match
that specific chunk of the pattern and use them later. An example might
be better: if you had a URL of http://blah.com/gallery/222/33, $1 would
represent 222 and $2 would be 33.
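
For what it’s worth, the pattern above might translate into an nginx rewrite roughly like this. It is only a sketch: gallery.php and the argument names event and photo are made-up placeholders, not anything from Ian’s site. One nginx-specific wrinkle is that a regex containing curly braces has to be quoted, otherwise the config parser trips over the braces.

```nginx
# Hypothetical: gallery.php, "event" and "photo" are placeholder names.
location /gallery/ {
    # Quoted because the pattern contains { }; $1 and $2 are the back references.
    rewrite "^/gallery/(\d{3})/(\d{2})$" /gallery.php?event=$1&photo=$2 last;
}
```

With that in place a request for /gallery/222/33 would be handled internally as /gallery.php?event=222&photo=33, while the visitor still sees the original URL.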

I hope I haven’t completely obfuscated what I’m trying to communicate.
I think that the link below documents a guy having a similar problem, or
at least one that may be close enough for your purposes.

http://www.ruby-forum.com/topic/144179

I hope this is of some help.

Philip Ratzsch
Software Engineer Developer I
Information Systems, Rackspace



#5

Philip Ratzsch wrote:

[Caveat - in my limited experience, the rewrite engine in Nginx uses a very similar syntax to Apache’s mod_rewrite module, so I’m hoping this is close enough that someone with more knowledge than I can correct any mistakes. This does not take proxying into account, and simply serves as a logic example.
These are not the droids you’re looking for.]

Killed myself on the droids bit…also daylight saving kicked in and 2AM
just became 3AM, so more coffee.

If I get what you’re saying, you’re suggesting creating a new .php file
that, rather than looking for a PATH_INFO that may or may not be mangled
through the FCGI process instead gets translated into ?variable=whatever
in a rewrite from the old extensionless URI. Visitor still sees
/galleries/123 but nginx translates it into galleries.php?event=123

I wouldn’t need numeric length specifics as they can be any length and
can be there or not.

e.g. /galleries gives the user a main menu
/galleries/[num] gives them an event (like a film festival)
/galleries/[num]/[num] gives them a subevent listing (like a premiere
photo listing at a film festival)
and /galleries/[num]/[num]/[num] gives them a specific photo.
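
One way to express those four URL shapes in a single rewrite is to make each numeric segment optional. A rough sketch, assuming a new galleries.php as described above and a separate location that hands .php files to fastcgi; the argument names event, sub and photo are invented for illustration:

```nginx
location /galleries {
    # Each (?:/(\d+))? group is optional, so zero to three numeric segments
    # match; segments that are absent arrive as empty arguments.
    rewrite ^/galleries(?:/(\d+))?(?:/(\d+))?(?:/(\d+))?/?$ /galleries.php?event=$1&sub=$2&photo=$3 last;
}
```

The pattern does not match /galleries.php itself, so the rewritten request is not rewritten a second time.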

Fun and games. As I said people, regex is below a weakness for me so if
there are any suggestions, I’m open.

At least the site is, and has been, fully functional in the Apache
backend format…would love to go fully nginx soon though.


#6

If this should be taken off the list (as it contains PHP), I can be
emailed directly at removed_email_address@domain.invalid . If so, I apologize.

e.g. /galleries gives the user a main menu…
/galleries/[num] gives them an event (like a film festival)
/galleries/[num]/[num] gives them a subevent listing (like a premiere
photo listing at a film festival)
and /galleries/[num]/[num]/[num] gives them a specific photo.

Granted it’s late, so I may be off the mark, but it sounds to me like
something like this may work on the backend:

$a = explode('/', $_SERVER['REQUEST_URI']); // break apart the URI on '/'
array_shift($a); // the first element will be empty, so dump it
$i = array_search('gallery', $a); // find where 'gallery' occurs in the array
$inc = ($i === false) ? -1 : $i; // set up for the loop; skip it if 'gallery' is absent
while ($inc >= 0) // knock off elements until all that's left is the stuff after 'gallery'
{
    array_shift($a);
    $inc--;
}

If you ran ‘/gallery/222/33’ through this, you should be left with an
array containing elements ‘222’ and ‘33’, which you could then use to
assemble the path to the file you want to serve. This might be handy as
it’s immune to variable length URIs like you mentioned.

Is this what you were looking for Ian?

Philip Ratzsch
Software Engineer Developer I
Information Systems, Rackspace



#7

I’m using CodeIgniter (a PHP framework) with “pretty URLs” which comes
close to what you want to achieve. This is what my server section
looks like:

server {
    include conf/php-fcgi.conf;

    location / {
        index index.php index.html;
        if (-f $request_filename) {
            break;
        }
        if (!-f $request_filename) {
            rewrite ^/(.*)$ /index.php/$1 last;
        }
    }
}

and php-fcgi.conf:

    location ~ .*\.php.*$ {
            fastcgi_param  GATEWAY_INTERFACE  CGI/1.1;
            fastcgi_param  SERVER_SOFTWARE    nginx;
            fastcgi_param  QUERY_STRING       $query_string;
            fastcgi_param  REQUEST_METHOD     $request_method;
            fastcgi_param  CONTENT_TYPE       $content_type;
            fastcgi_param  CONTENT_LENGTH     $content_length;
            fastcgi_param  SCRIPT_FILENAME    $document_root$fastcgi_script_name;
            fastcgi_param  PATH_TRANSLATED    $document_root$fastcgi_script_name;
            fastcgi_param  REQUEST_URI        $request_uri;
            fastcgi_param  DOCUMENT_URI       $document_uri;
            fastcgi_param  DOCUMENT_ROOT      $document_root;
            fastcgi_param  SERVER_PROTOCOL    $server_protocol;
            fastcgi_param  REMOTE_ADDR        $remote_addr;
            fastcgi_param  REMOTE_PORT        $remote_port;
            fastcgi_param  SERVER_ADDR        $server_addr;
            fastcgi_param  SERVER_PORT        $server_port;
            fastcgi_param  SERVER_NAME        $server_name;
            fastcgi_pass   127.0.0.1:9001;
            fastcgi_index  index.php;
    }

This allows me to directly refer a php file in any location and get it
executed by fastcgi, and at the same time any non-existing file (like
/galleries/1/2/3 in your example) is matched by “if (!-f
$request_filename)” and gets passed to fastcgi as
“/index.php/galleries/1/2/3”. Static content is handled directly by
nginx. I think you could use a similar setup with several location
entries that reference a different index.(shtml|php) in the second
“if”.


#8

Philip Ratzsch wrote:

If this should be taken off the list (as it contains PHP), I can be emailed directly at removed_email_address@domain.invalid . If so, I apologize.

Well, actually, I already explode the path_info on the Apache backend as
this is working on a production server. The only problem I’m having is
getting the process to work using nginx and php fastcgi…


#9

Carlos wrote:

This allows me to directly refer a php file in any location and get it
executed by fastcgi, and at the same time any non-existing file (like
/galleries/1/2/3 in your example) is matched by "if (!-f

Thanks for the example.

My only problem is that galleries EXISTS, that is, it’s an extensionless
file containing a php script.

The one odd thing that’s happening is madness with the path_info.

When I access galleries/1/123 the path_info is /1/123 which I can easily
explode.

When I access /galleries, the path_info suddenly becomes
/galleries/galleries which is nutty…somehow the path_info gets screwed
up in the fastcgi transition.

Perhaps I’ll take a look at doing it your way.


#10

One possible solution to the path_info issue:

In the fastcgi.conf add:

if ($fastcgi_script_name ~ "^/galleries(/.+)$") {
    set $path_info $1;
}
fastcgi_param PATH_INFO $path_info;

Doing that fixed the path_info issue i.e.

Before fix:
/galleries/123 _SERVER["PATH_INFO"]=/123
/galleries _SERVER["PATH_INFO"]=/galleries (should be blank)

After fix:
/galleries/123 _SERVER["PATH_INFO"]=/123
/galleries _SERVER["PATH_INFO"]=empty

I’ll try adding a few of those for the extensionless files. Still need
to figure out why it’s not finding the index.shtml and index.php files.


#11

On Sun, Mar 09, 2008 at 10:16:14AM +0100, Carlos wrote:

                    if (-f $request_filename) {
                            break;
                    }
                    if (!-f $request_filename) {
                            rewrite ^/(.*)$ /index.php/$1 last;
                    }
            }

NO. NO. NO. Do not use if/rewrite unless you really need it.

   location / {
       index            index.php index.html;
       error_page  404  = /index.php$uri;
       log_not_found    off;
   }
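
Later nginx releases added a try_files directive, which lets the same idea be spelled without error_page. A sketch only, reusing the index.php front controller from Carlos’s example and assuming an nginx version that has try_files:

```nginx
location / {
    index      index.php index.html;
    # Serve the file or directory if it exists; otherwise fall back
    # internally to the front controller with the original path appended.
    try_files  $uri $uri/ /index.php$uri;
}
```

As with the error_page version, no if or rewrite is involved, and the fallback still needs a location that passes /index.php/... requests to fastcgi.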

#12

On Sun, Mar 09, 2008 at 06:42:27PM -0400, Ian M. Evans wrote:

With nginx fronting apache:
“GET /emmy/ HTTP/1.1” 200 3042

Otherwise things seem to be working as expected. So the outstanding
issues are:

  1. Can’t find index
  2. Isn’t parsing custom 404 as PHP.

Thanks to all…I’m getting there. :)

Is nginx document root the same as PHP’s one?


#13

As I wrote earlier:

Still need to figure out why it’s not finding the index.shtml and index.php files.

Not sure why it’s ignoring the index index.shtml index.php

I’ve set up a test server at :8088 that’s nginx/fastcgi

When I try to get it to find the index.shtml at /emmy I get:
“GET /emmy/ HTTP/1.1” 404 36

With nginx fronting apache:
“GET /emmy/ HTTP/1.1” 200 3042

Otherwise things seem to be working as expected. So the outstanding
issues are:

  1. Can’t find index
  2. Isn’t parsing custom 404 as PHP.

Thanks to all…I’m getting there. :)


#14

On Mon, Mar 10, 2008 at 06:03:18AM -0400, Ian M. Evans wrote:

Igor S. wrote:

On Sun, Mar 09, 2008 at 06:42:27PM -0400, Ian M. Evans wrote:

  1. Isn’t parsing custom 404 as PHP.

Found fastcgi_intercept_errors on; so #2 is fixed.

Is nginx document root the same as PHP’s one?

Do you mean a setting in php.ini?

No, I mean

location / {
    root ONE;
}

location ~ \.php$ {
    root ONE;

    fastcgi_param  SCRIPT_FILENAME  /ONE$fastcgi_script_name;
    # or
    fastcgi_param  SCRIPT_FILENAME  $document_root$fastcgi_script_name;
}
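
One way to guarantee that both locations agree is to declare root once at the server level and let the locations inherit it. A sketch using the path and port posted elsewhere in this thread, with the remaining fastcgi_param lines left out for brevity:

```nginx
server {
    # Declared once; every location below inherits it, so $document_root
    # is identical for static files and for SCRIPT_FILENAME.
    root /usr/local/apache/htdocs;

    location / {
        index index.shtml index.php;
    }

    location ~ \.(php|shtml)$ {
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_pass  127.0.0.1:10004;
    }
}
```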


#15

Igor S. wrote:

On Sun, Mar 09, 2008 at 06:42:27PM -0400, Ian M. Evans wrote:

  1. Isn’t parsing custom 404 as PHP.

Found fastcgi_intercept_errors on; so #2 is fixed.

Is nginx document root the same as PHP’s one?

Do you mean a setting in php.ini?
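
For anyone following along, this is roughly where fastcgi_intercept_errors sits. A sketch, assuming the location layout posted elsewhere in this thread:

```nginx
# In the server block: custom 404 page, itself a PHP file.
error_page 404 /dhe404.shtml;

location ~ \.(php|shtml)$ {
    root /usr/local/apache/htdocs;
    include /usr/local/nginx/conf/fastcgi.conf;
    fastcgi_pass 127.0.0.1:10004;
    # Let nginx handle error responses coming back from the FastCGI
    # server through error_page instead of passing them to the client.
    fastcgi_intercept_errors on;
}
```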


#16

Igor S. wrote:

location / {
root ONE;
}

location ~ .php$ {
root ONE;

Here are the locations. / passes on to php as well, because any file
that isn’t a graphic or .js or .css is processed by PHP. .shtml is the
main .php extension:

location / {
    root /usr/local/apache/htdocs;
    index index.shtml index.php;
    include /usr/local/nginx/conf/fastcgi.conf;
    fastcgi_pass 127.0.0.1:10004;
}

location ~ \.(php|shtml)$ {
    root /usr/local/apache/htdocs;
    include /usr/local/nginx/conf/fastcgi.conf;
    fastcgi_pass 127.0.0.1:10004;
}

location ~* ^.+\.(jpg|…other static extensions…)$ {
    root /usr/local/apache/htdocs;
    error_page 403 /dhe403.shtml;
    expires 30d;
    valid_referers none blocked *.example.com example.com;
    if ($invalid_referer) {
        return 403;
    }
}

fastcgi.conf:

fastcgi_param GATEWAY_INTERFACE CGI/1.1;
fastcgi_param SERVER_SOFTWARE nginx;
fastcgi_param QUERY_STRING $query_string;
fastcgi_param REQUEST_METHOD $request_method;
fastcgi_param CONTENT_TYPE $content_type;
fastcgi_param CONTENT_LENGTH $content_length;
fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
fastcgi_param PATH_TRANSLATED $document_root$fastcgi_script_name;
fastcgi_param REQUEST_URI $request_uri;
fastcgi_param DOCUMENT_URI $document_uri;
fastcgi_param DOCUMENT_ROOT $document_root;
fastcgi_param SERVER_PROTOCOL $server_protocol;
fastcgi_param REMOTE_ADDR $remote_addr;
fastcgi_param REMOTE_PORT $remote_port;
fastcgi_param SERVER_ADDR $server_addr;
fastcgi_param SERVER_PORT $server_port;
fastcgi_param SERVER_NAME $server_name;
#fastcgi_param PATH_INFO $fastcgi_script_name;
set $path_info "";
if ($fastcgi_script_name ~ "^/cr(/.+)$") {
    set $path_info $1;
}
if ($fastcgi_script_name ~ "^/evans(/.+)$") {
    set $path_info $1;
}
if ($fastcgi_script_name ~ "^/galleries(/.+)$") {
    set $path_info $1;
}
if ($fastcgi_script_name ~ "^/news(/.+)$") {
    set $path_info $1;
}
if ($fastcgi_script_name ~ "^/poll(/.+)$") {
    set $path_info $1;
}
if ($fastcgi_script_name ~ "^/posters(/.+)$") {
    set $path_info $1;
}
if ($fastcgi_script_name ~ "^/photos(/.+)$") {
    set $path_info $1;
}
if ($fastcgi_script_name ~ "^/profile(/.+)$") {
    set $path_info $1;
}
if ($fastcgi_script_name ~ "^/review(/.+)$") {
    set $path_info $1;
}
if ($fastcgi_script_name ~ "^/shop(/.+)$") {
    set $path_info $1;
}
if ($fastcgi_script_name ~ "^/evansabove(/.+)$") {
    set $path_info $1;
}
fastcgi_param PATH_INFO $path_info;
fastcgi_param REDIRECT_STATUS 200;
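
Since every one of those if blocks does the same thing, they could probably be folded into a single test with an alternation. An untested sketch using the same script names as above:

```nginx
set $path_info "";
# One pattern for all extensionless scripts; (?:...) groups the names
# without capturing, so $1 is still the trailing path.
if ($fastcgi_script_name ~ "^/(?:cr|evans|galleries|news|poll|posters|photos|profile|review|shop|evansabove)(/.+)$") {
    set $path_info $1;
}
fastcgi_param PATH_INFO $path_info;
```

A request for the bare script name (for example /galleries) does not match, so PATH_INFO stays empty, which is the behaviour the separate blocks were after.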


#17

Igor,

Changing the config to:

location / {
    root /usr/local/apache/htdocs;
    index index.shtml index.php;
}

location ~ \.(shtml|php)$ {
    root /usr/local/apache/htdocs;
    include /usr/local/nginx/conf/fastcgi.conf;
    fastcgi_pass 127.0.0.1:10004;
}

fixed the problem of not finding the index.shtml or .php files BUT it
BROKE the passing of the extensionless files like galleries, polls, etc
to the fastcgi server.

In the current Apache backend, I have a bunch of files directives like:

<Files …>
ForceType application/x-httpd-php
</Files>
<Files …>
ForceType application/x-httpd-php
</Files>
<Files …>
ForceType application/x-httpd-php
</Files>

which handle the extensionless php files.

In my nginx.conf, do I need to explicitly state these extensionless PHP
files (there’s 10 of them) in a location directive(s) and have that
passed to the fastcgi?

At least I feel a bit closer now…


#18

Ian M. Evans wrote:

Here are the locations. / passes on to php as well, because any file
that isn’t a graphic or .js or .css is processed by PHP. .shtml is the
main .php extension:

Just wanted to clarify what I meant in this paragraph since this has
been a long thread.

“…because any file that isn’t a graphic or .js or .css is processed by
PHP. That means, .shtml, .php, or any extensionless file like
galleries, cr, polls, etc.
[see list at bottom of fastcgi.conf in my
last message] .shtml is the main .php extension”


#19

On Mon, Mar 10, 2008 at 01:31:49PM -0400, Ian M. Evans wrote:

root /usr/local/apache/htdocs;

In my nginx.conf, do I need to explicitly state these extensionless PHP
files (there’s 10 of them) in a location directive(s) and have that
passed to the fastcgi?

Yes,

location /galleries/ {
    fastcgi stuff
}

or you may use single regex:

location ^/(galleries|poll|news)/ {
    fastcgi stuff
}

#20

On Mon, Mar 10, 2008 at 02:04:23PM -0400, Ian M. Evans wrote:

   fastcgi stuff

}

Unfortunately, I tried both formats and it tosses a 404 when I try and
go to one of the extensionless files.

First, I forgot ‘~’:

- location ^/(galleries|poll|news)/ {
+ location ~ ^/(galleries|poll|news)/ {

Second, what about the root for galleries/etc.?
The order should not matter for these 3 locations.
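
Putting both points together, the location might end up looking something like this. It is a sketch: the root and fastcgi lines are copied from the config posted earlier in the thread, and the (/|$) ending is an assumption added here so that a bare /galleries request matches as well as /galleries/1/123.

```nginx
location ~ ^/(galleries|poll|news)(/|$) {
    root /usr/local/apache/htdocs;
    include /usr/local/nginx/conf/fastcgi.conf;
    fastcgi_pass 127.0.0.1:10004;
}
```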