True load balancer

i'm kinda interested in implementing true load balancer functionality
in nginx. my idea is to extend the upstream module so that it would be
able to dynamically modify the configuration of servers (such as marking
them down or changing the weight). i see two possible ways:

  • a controlling socket over which you would feed the commands
  • an UPSTREAM method (similar to ncache's PURGE), the advantages of this
    being easier control of the configuration file and no need for a dedicated
    thread to listen on the socket

the idea is to implement just the controlling protocol in nginx; it would
require some controlling daemon/script to actually do anything useful.

thoughts? advice?

also, any chance of this kind of code making it into official nginx?

Search the mailing list archive; there were extensive discussions about
the different possible approaches and their problems. The discussions
revolved around monitoring, event handling and such.

Just a hint, no advice :)

with regards,

__Janko

On 23.07.2008 at 20:48, Almir K. wrote:

for this to happen i believe

a) there needs to be a way to dynamically add/remove upstreams (or at
least mark them active/inactive) via an external call to the nginx daemon

b) monitoring has to be there, which could be built in or could be an
external process. personally i'd be happy with an external monitoring
thing like ldirectord (some sort of health-checking script) that, upon
noticing a server being down, tells nginx "stop this upstream" etc.
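A minimal sketch of what such an external health-checking script could look like. The probe-and-decide split is illustrative, and the control commands it would send correspond to the hypothetical "stop this upstream" interface discussed in this thread, which nginx does not actually have:

```python
# Sketch of an external health checker (ldirectord-style). The control
# interface it would drive is hypothetical -- it is exactly what this
# thread is proposing, not anything nginx supports today.
import socket

def probe(host, port, timeout=2.0):
    """Return True if a TCP connection to the backend succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def decide_actions(statuses, marked_down):
    """Given {name: alive?} and the set of servers already marked down,
    return the (name, action) commands that should be sent to nginx."""
    actions = []
    for name, alive in statuses.items():
        if not alive and name not in marked_down:
            actions.append((name, "down"))
        elif alive and name in marked_down:
            actions.append((name, "up"))
    return actions

# A daemon loop would probe() each backend, call decide_actions(), and
# deliver each action to nginx over the proposed control interface.
```

The point of keeping `decide_actions` pure is that the health-check policy stays testable independently of whatever transport (socket, HTTP command) the control protocol ends up using.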

but if you store the upstreams in memory (a very simple structure) across
an nginx restart, you could remove the dead ones (they might be dead for
some reason)…

monitoring should be configurable: which parameter and which value…

yeah, you could have an

include 'upstreams.conf';

and some external process monitor that file, and then send a HUP or
whatever the appropriate process signal to nginx is, but that seems a
little messy. it would be nice if nginx had built-in support for the
most graceful recovery/best user experience when an upstream dies
(maybe it already does, i am just speculating)

sorry, not "monitor that file" but MANAGE that file.

as in: keep a list of upstreams, update the file, and send a HUP/etc. to
nginx after it's updated
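The manage-and-HUP approach in a few lines of Python. The upstream name, file layout, and pid-file path are assumptions for illustration; SIGHUP really is how you ask the nginx master process to reload its configuration:

```python
# Sketch: regenerate upstreams.conf from a server list, then signal the
# nginx master to reload. File layout and pid path are illustrative.
import os
import signal

def render_upstream(name, servers):
    """servers: list of (host, weight) tuples -> an nginx upstream block."""
    lines = ["upstream %s {" % name]
    for host, weight in servers:
        lines.append("    server %s weight=%d;" % (host, weight))
    lines.append("}")
    return "\n".join(lines) + "\n"

def reload_nginx(pid_file="/var/run/nginx.pid"):
    """Send SIGHUP to the nginx master so it re-reads its configuration."""
    with open(pid_file) as f:
        os.kill(int(f.read().strip()), signal.SIGHUP)

# Usage: write render_upstream(...) to upstreams.conf, then reload_nginx().
```

This is exactly the "messy" part noted above: every change costs a full configuration reload, rather than an in-place state change inside nginx.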

how about storing the backend servers in memcached?
then a 3rd-party tool could update the server status for nginx.
But I don't know whether the performance would be enough.
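A rough sketch of that memcached idea, with a plain dict standing in for a real memcached client. The key names and the selection logic are invented for illustration:

```python
# Sketch of the memcached idea: a 3rd-party tool writes per-backend
# status keys, and the balancer consults them before picking a server.
# A plain dict stands in for a real memcached client; key names are
# made up for this example.
store = {}  # stand-in for memcached

def set_status(server, status):
    """What the external tool would do: write 'up' or 'down' per server."""
    store["upstream:" + server] = status

def pick_server(servers):
    """Return the first server not marked down (round-robin omitted)."""
    for s in servers:
        if store.get("upstream:" + s, "up") != "down":
            return s
    return None
```

The performance worry raised above is real: done naively, this means one memcached lookup per proxied request unless nginx caches the status locally.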

Currently, I'm going with the following assumptions:

1. Have an HTTP command for reporting the load of backend servers
2. Have an HTTP command for moving a backend OFFLINE (for testing of load balancing mostly, but I can see this being useful for maintenance)
3. Have an HTTP command for moving a backend back ONLINE

A command for adding backends dynamically seems illogical (remember, this is all temporary data which is lost with a restart/reload). If you want to add a new backend, add it to the config.

Does anyone (except maybe Google or the like) have such a dynamic pool that you would want to add new servers to at run time? Anyone need this?

4. Have an HTTP command for getting weight statistics (i.e., load on all backend servers, weights on all backend servers, other) for monitoring (munin, ganglia, etc.)

I don't exactly know how we will invoke the real load balancer, but I suppose it will be with a 'real;' keyword, like the fair load balancing:

upstream wwwbackend {
    real;
    server earth weight=20;
    server wind  weight=20;
    server water weight=20;
    server fire  weight=20;
}

Weights take a default value, but will be adjusted depending on the load responses from the backend servers.

Discuss.
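One way the weight adjustment could work, purely as an illustration: scale each configured weight down as the backend's reported load rises. The scaling formula below is an assumption of this sketch, not anything implemented in nginx or specified in this thread:

```python
# Sketch of load-based weight adjustment: weights start at their
# configured value and shrink as reported load grows, so the load
# evens out across servers. The 1/(1+load) formula is an assumption
# for illustration only.
def adjust_weights(configured, loads, min_weight=1):
    """configured: {server: weight}; loads: {server: load average}.
    Returns new integer weights, inversely proportional to load."""
    new = {}
    for server, weight in configured.items():
        load = max(loads.get(server, 0.0), 0.0)
        new[server] = max(min_weight, round(weight / (1.0 + load)))
    return new
```

With equal configured weights, an idle server keeps its full weight while a server reporting load average 3.0 drops to a quarter of it, so new requests drain toward the idle machine.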

On 7/23/08, Tit P. <[email protected]> wrote:

Currently, I'm going with the following assumptions:

  1. Have an HTTP command for reporting the load of backend servers
  2. Have an HTTP command for moving a backend OFFLINE (for testing of load
    balancing mostly, but I can see this being useful for maintenance)
  3. Have an HTTP command for moving a backend back ONLINE

yes, having these be HTTP commands with a configurable user/password
using HTTP auth would be perfect, and it should minimize the amount of
hacking required for that piece, since nginx is already listening on
the HTTP stream…
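Something like the following could reuse nginx's existing access control for a control location. `auth_basic`, `auth_basic_user_file`, `allow` and `deny` are real nginx directives; the `/upstream-control` location and whatever directive would actually handle the up/down commands are hypothetical:

```nginx
# Hypothetical control location, guarded with existing nginx access
# control. Only auth_basic/auth_basic_user_file/allow/deny are real
# directives; the command handling itself does not exist.
location /upstream-control {
    auth_basic           "upstream control";
    auth_basic_user_file /etc/nginx/control.htpasswd;
    allow 127.0.0.1;     # monitoring host(s) only
    deny  all;
    # ... hypothetical directive dispatching up/down commands here
}
```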

A command for adding backends dynamically seems illogical (remember, this is
all temporary data which is lost with a restart/reload). If you want to add
a new backend, add it to the config.

This would be good enough for me.

  4. Have an HTTP command for getting weight statistics (i.e., load on all
    backend servers, weights on all backend servers, other) for monitoring
    (munin, ganglia, etc.)

Reporting would be great… knowing the number of currently active
connections, maybe some average statistics; oh yeah, per-Host:-header
bandwidth stats would be great for me personally (wishlist…)

I don't exactly know how we will invoke the real load balancer, but I
suppose it will be with a 'real;' keyword, like the fair load balancing:

upstream wwwbackend {
    real;
    server earth weight=20;
    server wind  weight=20;
    server water weight=20;
    server fire  weight=20;
}

What is the difference between "real" and the current behavior? Not sure
I get it. Currently nginx will re-proxy if it times out on one upstream,
which is awesome, so it already has more features than normal load
balancers… it just doesn't have the capability to do health checking,
or to work with a health-checking system, right now. Hopefully we can
figure out a solution where people decide how they want to do their
own health checking and simply hit
http://localhost/nginx-config/earth/up or
http://localhost/nginx-config/earth/down or some other very simple
HTTP-request-based thing. Query strings or RESTful style… please,
nothing heavy and stupid like SOAP :)
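Parsing that kind of URL scheme is straightforward; here is a sketch of the server-side dispatch. The `/nginx-config` prefix follows mike's example; everything else is invented for illustration:

```python
# Sketch: parse control URLs like /nginx-config/earth/up and apply them
# to an in-memory state table. The URL prefix is from mike's example;
# the state table is invented for illustration.
def parse_command(path, prefix="/nginx-config"):
    """Return (server, action) for paths like /nginx-config/earth/up,
    or None if the path is not a valid control request."""
    if not path.startswith(prefix + "/"):
        return None
    parts = path[len(prefix) + 1:].split("/")
    if len(parts) != 2 or parts[1] not in ("up", "down"):
        return None
    return parts[0], parts[1]

def apply_command(state, path):
    """Apply a control URL to the state table; return success."""
    cmd = parse_command(path)
    if cmd is None:
        return False
    server, action = cmd
    state[server] = action
    return True
```

Rejecting anything that is not exactly `<server>/up` or `<server>/down` keeps the surface small, which matters if the location is reachable over plain HTTP.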

mike wrote:

yes, having these be HTTP commands with a configurable user/password
using HTTP auth would be perfect, and it should minimize the amount of
hacking required for that piece, since nginx is already listening on
the HTTP stream…

We agree. I wonder if we can use the existing access control (IP address
and HTTP basic authentication) for this. If we can, super!

What is the difference between "real" and current? Not sure I get it.

Current (i.e., normal weighted) doesn't take machine load into account;
it just balances with the weights. Neither does fair, but it tries to
approximate it. Real would collect load data and adjust the weights
accordingly, so the load will be evened out across all servers.

You are correct in assuming or suggesting what you did; list items 2)
and 3) above state just that: a way of taking a backend offline, or back
online after a previous offline command. This is in addition to the
internal offline check (if a server becomes unreachable, nginx handles
that already). We're not considering anything complex, but we will have
a new HTTP method, like PURGE with ncache, so you won't be able to mess
things up with a normal browser (IE, Firefox).

Would you prefer to have GET requests for this? I guess we can set a
location in the config (like /server-status, so each user can choose
their own location for interfacing with the load balancer)? I see some
benefit in this, but mostly for statistics & status; I still think that
load reporting from backend servers, and taking a machine offline &
online, need to be a special HTTP command type.

On 7/24/08, Tit P. <[email protected]> wrote:

Would you prefer to have GET requests for this? I guess we can set a
location in the config (like /server-status, so each user can choose their
own location for interfacing with the load balancer)? I see some benefit in
this, but mostly for statistics & status; I still think that load reporting
from backend servers, and taking a machine offline & online, need to be a
special HTTP command type.

for "security by obscurity" i would make the configuration URL prefix
configurable, as well as the security methods (user/password, or even
just piggyback it on auth_basic for all i care), or use IP addresses…
i guess Igor ultimately could make the call; since he's designed nginx
so well, i consider his opinions to be typically well thought out.

as far as GET vs PURGE etc.… i always like sticking to standard HTTP
verbs (PUT, GET, POST, DELETE, etc.) - i don't care, as long as i can
call it from shell/perl/php/etc. scripts using normal HTTP method calls,
or even just a direct plaintext connection on port 80
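For what it's worth, building such a request by hand really is trivial, which is part of the appeal. A sketch of what a custom-method control request could look like on the wire (the UPSTREAM method and the path are hypothetical):

```python
# Sketch: construct a raw HTTP/1.0 request with a custom method -- the
# "port 80 direct plaintext" style of call described above. The UPSTREAM
# method and the path are hypothetical; any HTTP client or a bare socket
# could send the resulting bytes.
def build_request(method, path, host, headers=None):
    """Return the raw request string for a custom-method HTTP call."""
    lines = ["%s %s HTTP/1.0" % (method, path), "Host: %s" % host]
    for name, value in (headers or {}).items():
        lines.append("%s: %s" % (name, value))
    return "\r\n".join(lines) + "\r\n\r\n"

req = build_request("UPSTREAM", "/wwwbackend/earth/down", "localhost")
# req could be written straight to a TCP connection on port 80.
```

A non-standard method like this is precisely what keeps ordinary browsers from triggering it by accident, which is the PURGE-style argument made earlier in the thread.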