Possible bug in nginx or upstream fair module?

Hi,

I am seeing some strange behavior with nginx 0.6.32, 0.6.35 and 0.6.36
with the upstream fair load balancer and the statistics patches from
[1]. The graph at [2] shows the statistics taken from nginx via the stub
status module. The bug shows up as a sudden increase in the values
(everything above 1k on the graph): at some point the counters just go
up, and I need to restart nginx to get correct values again. I do not
have a way to reproduce it; it just happens at some point (might be a
day, might be a week or more after nginx was started).
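
For anyone who wants to catch the moment the counters jump, here is a
minimal sketch in Python that parses the plain-text stub status output
and flags gauges above a limit. The function names and the limit of 1000
(taken from the "above 1k" observation above) are assumptions for
illustration, and the regex assumes the usual four-line stub status
layout. Fetching the status page is left out.

```python
import re

# Assumed layout of the stub status page (four lines of plain text):
#   Active connections: 291
#   server accepts handled requests
#    16630948 16630948 31070465
#   Reading: 6 Writing: 179 Waiting: 106
STUB_STATUS_RE = re.compile(
    r"Active connections:\s*(?P<active>\d+)\s+"
    r"server accepts handled requests\s+"
    r"(?P<accepts>\d+)\s+(?P<handled>\d+)\s+(?P<requests>\d+)\s+"
    r"Reading:\s*(?P<reading>\d+)\s+"
    r"Writing:\s*(?P<writing>\d+)\s+"
    r"Waiting:\s*(?P<waiting>\d+)"
)


def parse_stub_status(text):
    """Return the stub status numbers as a dict of ints."""
    match = STUB_STATUS_RE.search(text)
    if match is None:
        raise ValueError("text does not look like stub status output")
    return {name: int(value) for name, value in match.groupdict().items()}


def suspicious_gauges(sample, limit=1000):
    """Return the names of connection gauges above the (assumed) limit.

    Only the gauges are checked; accepts/handled/requests are cumulative
    totals and are expected to grow without bound.
    """
    gauges = ("active", "reading", "writing", "waiting")
    return [name for name in gauges if sample[name] > limit]
```

Run from cron against the status URL, something like this would at
least give a timestamp for when the values go wrong, which could then be
matched against the error log.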

I am also seeing jumps in the statistics for connections to upstreams
(from the patches in [1]). The upstream counters can be reset by sending
HUP to nginx, but the stub status counters cannot. The inflated values
result in nginx marking the upstreams as down and logging “no live
upstreams while connecting to upstream” in the error log. My nginx conf
has an upstream with 10 PHP backends using fair in peak mode.

Is this an nginx bug, or does it come from the upstream fair module? Is
anybody else seeing similar issues, with or without the fair module?

1: 'Re: Is it possible to monitor the fair proxy balancer?' - MARC
2: http://bgrouter.xaxo.eu/~space/graph_image.png

Thanks,
Momchil

Momchil I. wrote:

Hi,

I am seeing some strange behavior with nginx 0.6.32, 0.6.35 and 0.6.36
with upstream fair load balancer and the patches from [1] for
statistics. On [2] you can see the statistics taken from nginx via the
stub status module. The bug that I am seeing results in the increase of
the values (everything above 1k), it seems that at some points the
counters just go up and I need to restart nginx in order to get correct
values. I do not have a way to reproduce it, it just happens at some
point (might be a day, might be a week or more after nginx was started).

I am also seeing jumps in the statistics for connections to upstreams
(from the patches in [1]). The counters for upstreams can be reset with
sending HUP to nginx, but the counters from stub status not. The
increase in these values results in nginx marking the upstream as down
and “no live upstreams while connecting to upstream” in the error log.
In the nginx conf I have an upstream with 10 php backends using fair in
peak mode.

I’ve been seeing this same behaviour with nginx 0.6.35 and the latest
github version of the fair balancer patch. I don’t use any of the stats
patches, however.

Turning off ‘fair’ in the upstream block seems to have fixed it, but of
course that means going back to the suboptimal balancing.

Have you guys checked out the ey load-balancer? It seems to show some
promising results:

http://github.com/ry/nginx-ey-balancer/tree/master