TCP connection problems (FIN_WAIT2/LAST_ACK) when upstream server is keepalived

We have a strange problem. Our configuration (only the relevant part is
shown below) misbehaves when the upstream server is a load-balanced
server using ipvs/keepalived.

Here is the relevant part of the configuration:

upstream fb_server {
    server foobar:8080;
}
server {
    listen 80;
    server_name f.b.com;
    location /fb/ {
        proxy_pass http://fb_server;
    }
}

The hostname “foobar” resolves to an IP address that is load-balanced
over 2 physical servers (“foo” and “bar”) using ipvs/keepalived.

What we see is that on the 2 upstream servers “foo” and “bar” there are
a lot of connections in FIN_WAIT2 state and on the machine where nginx
is running we see an increasing number of connections in LAST_ACK state.
This would imply that nginx is not managing the upstream connection
shutdown properly.

The problem does appear to be related to the load-balancing: if we point
the upstream server directly at either “foo” or “bar”, the problem goes
away.

Now I realize that we can also use nginx to do the load-balancing and
not use ipvs/keepalived, but I’d like to know why this doesn’t work
properly in this configuration. It is also probably a useful thing for
the nginx authors and users to know as well.
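
For reference, the nginx-only alternative mentioned above might look
roughly like the sketch below. It assumes that “foo” and “bar” both
listen on port 8080 (the port used for “foobar” in our current
configuration) and that both hostnames resolve from the nginx machine;
neither assumption has been verified here.

```nginx
# Sketch only: assumes "foo" and "bar" each accept connections on
# port 8080, the same port used for "foobar" in the config above.
upstream fb_server {
    server foo:8080;
    server bar:8080;
}
server {
    listen 80;
    server_name f.b.com;
    location /fb/ {
        proxy_pass http://fb_server;
    }
}
```

With no balancing method specified, nginx distributes requests across
the listed servers round-robin, so ipvs/keepalived would no longer sit
between nginx and the realservers.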

If anyone has any idea why this is behaving like this I’d love to hear
from you.

Posted at Nginx Forum:

This isn’t a problem with nginx; it is likely a configuration issue with
your ipvs software. Removing ipvs from the configuration shows that. You
might want to look at how your LB is timing out TCP sessions. In your
case it shows that the connection close (FIN) from your nginx server
isn’t making it to your realservers “foo” and “bar”. Capture an HTTP
request at all 3 points in the chain to determine what is happening.

Sridhar

Yes, it turns out that it isn’t a problem with nginx; it is a problem
with the iptables configuration on the load-balancer. It isn’t a timeout
problem, though. The iptables configuration was causing the FIN packets
to be dropped, so they were never passed on to the realservers. This
post explains the problem (and a possible solution) nicely:

http://www.mail-archive.com/[email protected]/msg02103.html

-DWass
