FastCGI causing 502 errors?

Nuno_Magalhães · September 3, 2009, 9:43pm

I’m running nginx/0.8.10 and PHP 5.2.10 on Debian Sid with a
2.6.30-1-amd64 kernel. This is my desktop machine, mainly for testing
and screwing around with configurations (although screwing up also
comes to mind). I’ve recently upgraded my system through Debian’s
apt-get dist-upgrade, which reset my fastcgi_params.

Nginx spawns two worker processes, php-fastcgi spawns 4, nginx is the
only http server i’m running, i have no proxies or anything.

If i have a URL like http://localhost:8080/site01/file.php?var=param
and hit the reload button constantly, i’ll get a bunch of normal
replies (200), followed by a bunch of 502s, then 200, then 502, …
The interval is about the same, maybe a second or so.

If i use http://localhost:8080/simplefile.html for a simple lorem
ipsum html file i sometimes get the 502 in the first few reloads, but
after that it’s just 200.

While doing this, my CPU load doesn’t go above ~70% and i’ve yet to
fully use the first half of its 4GiB RAM. I’ve stripped my nginx.conf
to the bare minimum, commenting out most stuff (maybe too much), and
i still can’t get around it. The access log reports 200s and 502s (and i
have a 502 page that gets sent), the error log shows this:

2009/09/03 19:45:16 [error] 7402#0: *318 recv() failed (104:
Connection reset by peer) while reading response header from upstream,
client: 127.0.0.1, server: localhost, request: “GET
/site01/file.php?var=value HTTP/1.1”, upstream:
“fastcgi://127.0.0.1:9000”, host: “localhost:8080”, referrer:
“http://localhost:8080/site01/file.php?var=value”

The same error over and over. This issue is similar to this[1] one,
but the latter got corrected in 0.8.8. I got two logs, one for a 200
and another for a 502, attached. The error starts around line 143. My
nginx.conf and server{}s are in the other post [2].

I don’t know how to fix it, any suggestions are most welcome.

TIA,
Nuno Magalhães

[1] Re: upstream split a header line in FastCGI records
[2] alternating 404 and 200

Nuno_Magalhães · September 3, 2009, 9:59pm

2009/9/4 Nuno Magalhães:

and hit the reload button constantly, i’ll get a bunch of normal
replies (200), followed by a bunch of 502s, then 200, then 502, …
The interval is about the same, maybe a second or so.

If i use http://localhost:8080/simplefile.html for a simple lorem
ipsum html file i sometimes get the 502 in the first few reloads, but
after that it’s just 200.

how heavy is your ‘/site01/file.php’ script? (btw, I dont see any reference
to it in your 502 log file)

I would imagine that the cause would be because nginx does not queue
connections. And you have only a limited number of fastcgi processes (or how
are u running your fastcgi?) Therefore when they are busy, and cannot accept
connections, you will get a reset (i dont know how u’re setting up your
fastcgi, but!). The html without php gets cached very easily - therefore
it’s 200s all the way after any initial problems.

-jf


Nuno_Magalhães · September 4, 2009, 12:16am

how heavy is your ‘/site01/file.php’ script? (btw, I dont see any reference
to it in your 502 log file)

The log is from site02. None of the sites is big or complex and none of
the examples i’ve used even accesses databases (although both sites
use a DB).

I would imagine that the cause would be because nginx does not queue
connections. And you have only a limited number of fastcgi processes (or how
are u running your fastcgi?) Therefore when they are busy, and cannot accept
connections, you will get a reset (i dont know how u’re setting up your
fastcgi, but!). The html without php gets cached very easily - therefore
it’s 200s all the way after any initial problems.

FastCGI runs as a daemon, i use the Debian php-cgi[1] package. It
spawns the master process plus 4 others. The start/stop script is the
one that comes with the package. This is in /etc/default/php-fastcgi:
PHP_FCGI_CHILDREN=4
PHP_FCGI_MAX_REQUESTS=1000

What’s the time period for those requests? 1000 per second? Per child
simultaneously? I currently have two server{ }s included in
nginx.conf, and put my fast-cgi daemon running on 8888.

One handles 3 sites as /sitexx/ subfolders:

server {
    server_name localhost;
    listen 127.0.0.1:8080;
    access_log /var/log/nginx/localhost.access.log vpt;
    error_log /var/log/nginx/localhost.error.log;
    charset utf-8;

    location / {
        root   /var/www/nginx-default;
        index  index.php index.html;
    }
    location = /favicon.ico {
        return 204;
    }
    error_page  404  /40x.html;
    location = /40x.html {
        root   /var/www/nginx-default;
    }
    error_page  403  /403.html;
    location = /403.html {
        root   /var/www/nginx-default;
    }
    error_page   500 502 503 504 /50x.html;
    location = /50x.html {
        root   /var/www/nginx-default;
    }
    location ~ \.(php|html)$ {
        fastcgi_pass   localhost:8888;
        fastcgi_index  index.php;
        fastcgi_param  SCRIPT_FILENAME /var/www/nginx-default$fastcgi_script_name;
        fastcgi_param  DOCUMENT_ROOT /var/www/nginx-default;
        fastcgi_intercept_errors on;
        include fastcgi_params;
    }
}

- One is a one-page form. No errors.
- One is the one with a php menu (.php?menu=option then loads content
  accordingly), uses sessions and form-validation, includes
  session-related stuff and some http headers. 502 errors.
- One is a simple site with php-menu.

After testing them all, this last site doesn’t 502 on me despite
having a php-menu like the one that uses sessions…

I ‘forked’ one of the sites (the fourth) to another server{ } running
on a different port, that way it resembles more the production server
(which uses domains, not subdirectories of course):

server {
    server_name localhost;
    listen 127.0.0.1:8081;
    access_log /var/log/nginx/reagentes.access.log vpt;
    error_log /var/log/nginx/reagentes.error.log debug;
    charset utf-8;

    location / {
        root   /var/www/nginx-default/reagentes;
        index  index.html;
        try_files $uri $uri/ $uri/index.html;
    }
    location = /favicon.ico {
        return 204;
    }
    error_page  404  /40x.html;
    location = /40x.html {
        root   /var/www/nginx-default/reagentes;
    }
    error_page  403  /403.html;
    location = /403.html {
        root   /var/www/nginx-default/reagentes;
    }
    error_page   500 502 503 504 /50x.html;
    location = /50x.html {
        root   /var/www/nginx-default/reagentes;
    }
    location ~ \.(php|html)$ {
        fastcgi_pass   localhost:8888;
        fastcgi_index  index.php;
        fastcgi_param  SCRIPT_FILENAME /var/www/nginx-default/reagentes$fastcgi_script_name;
        fastcgi_param  DOCUMENT_ROOT /var/www/nginx-default/reagentes;
        fastcgi_intercept_errors on;
        include fastcgi_params;
    }
}

This fourth site (the site02 the log refers to) at the moment is nothing
more than a simple page with some http headers, session stuff and a
login form with validation. I got it off the other site, and it 502s.
What’s getting on my nerves is that i haven’t changed these sites in
the last months, all i did was add a fourth site and only then i
noticed these issues. All the 3 sites were working great (or maybe i’m
going crazy), so i have… had no reason to doubt the code.

They had other options but i commented them out. They’re very similar,
is there a way i can use something like $site=‘site02’ and then use

    location / {
        root /var/www/nginx-default/$site;
    }

instead? That would be nifty. (And i guess i could include other
locations from file as well, like the error handling or any common
stuff.)
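
A minimal, untested sketch of the idea asked about above (it is not taken from the thread): nginx's root directive accepts variables, and the rewrite module's set directive can define one per server{ }, so the site name only needs to appear once. Shared pieces can be pulled in with include. The variable name $site and the file /etc/nginx/common-errors.conf are assumptions made up for illustration; the port, paths and FastCGI address are the ones from the second server{ } above.

```nginx
server {
    server_name localhost;
    listen 127.0.0.1:8081;
    charset utf-8;

    # $site is an illustrative variable name, not an nginx built-in.
    set $site reagentes;

    # A server-level root is inherited by every location below,
    # so the per-location root lines are no longer needed.
    root /var/www/nginx-default/$site;

    # Hypothetical file holding the shared favicon and error_page
    # lines, so that each server{ } does not have to repeat them.
    include /etc/nginx/common-errors.conf;

    location / {
        index index.html;
    }
    location ~ \.(php|html)$ {
        fastcgi_pass   localhost:8888;
        fastcgi_index  index.php;
        # $document_root expands to the root defined above.
        fastcgi_param  SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_param  DOCUMENT_ROOT $document_root;
        fastcgi_intercept_errors on;
        include fastcgi_params;
    }
}
```

One caveat with this approach: set runs during request processing, so an error raised before that point would see $site empty. Hard-coding the root per server{ } avoids that at the cost of some repetition.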

I also haven’t touched the production server (calling it “production”
is overrated, none of these sites are ready for the public, but i
changed neither the code nor the system), and it’s churning along
fine. I’m even afraid to change anything and surely won’t be upgrading
it anytime soon. It’s running Debian Lenny (so stable) with a
2.6.26-2-686 kernel. nginx is 0.7.59 and PHP (only CGI; no CLI) is
5.2.6.

I do believe i need to somehow change my fastcgi settings, i’m just
not sure how or why. I’d like to avoid messing with the sites’ code.
Any suggestions?

TIA,
Nuno Magalhães

[1] Debian -- Error

Nuno_Magalhães · September 4, 2009, 1:24pm

hm, I would suggest you take out the ‘502’ from your error_page handling.
You want to see whether the 502 is coming from your fastcgi… or nginx.
Take a look at your php error logs as well.

Before i had a 502 error_page i kept having 200 and 404 (my previous
post). It was only when Igor suggested i should look into that error
log thingy that i discovered the 404 was because i had no 502 page
to handle the error. The real error is thus 502 and the page gets
served.

I’ve changed (and tested) my local php settings so that errors get
outputted to a file instead of the browser, but this situation outputs
no errors to the php log.

Setting fastcgi_intercept_errors to off didn’t create output either
(so i left it on).

Nuno_Magalhães · September 7, 2009, 4:21pm

Since this seemed to be php-related i’ve installed apache 2.2.12. It
didn’t give me nearly as many errors as nginx but it did send unparsed
pages from time to time (i.e. with both php and html code…). I had
child processes segfaulting in the log all the time and this error:
[error] [client xx.xx.xx.xx] ALERT - canary mismatch on efree() - heap
overflow detected (attacker ‘xx.xx.xx.xx’, file '/home/xxx\

After digging some more i ended up upgrading to apache 2.2.13 and
setting suhosin.session.encrypt = off in apache’s php/suhosin
configuration. This solved the issue.

However, the general /etc/php5/conf.d/suhosin.ini and
/etc/php5/cgi/conf.d/suhosin.ini both already have
suhosin.session.encrypt = off, so i can’t really test this against
nginx+FastCGI. I’ll try and upgrade to 5.3 later (and to nginx 0.8.14)
and see if it helps.

This issue happens with PHP’s session_start() function, and is being
discussed in a few distros[1][2].

HTH whoever’s facing the same issues.
Nuno Magalhães

[1] Bug #424789 “PHP random segfaults on session_start();” : Bugs : php5 package : Ubuntu
[2] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=542514

Nuno_Magalhães · September 8, 2009, 12:19pm

Got it fixed after upgrading both php5 and php5-cgi to 5.2.10.

HTH the next guy.

Nuno_Magalhães · September 4, 2009, 3:27am

2009/9/4 Nuno Magalhães:

           fastcgi_param DOCUMENT_ROOT /var/www/nginx-default;
           fastcgi_intercept_errors on;
           include fastcgi_params;
   }

}

hm, I would suggest you take out the ‘502’ from your error_page handling.
You want to see whether the 502 is coming from your fastcgi… or nginx.
Take a look at your php error logs as well.

-jf
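
For anyone following along, the suggestion above could look roughly like this when applied to the first server{ } from the thread. It is an illustrative fragment only; everything else in the server{ } is assumed to stay as posted. With 502 dropped from the list, nginx should answer with its own built-in 502 page instead of /50x.html, which makes the failure easier to spot.

```nginx
# 502 removed from the list, so it is no longer mapped to /50x.html
error_page   500 503 504 /50x.html;
location = /50x.html {
    root   /var/www/nginx-default;
}
```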