Nginx load balancing Tomcat servers

Hello,

What is the right way to set up true load balancing across a few Tomcat
servers?
By “true” I mean taking into account at least the CPU load of those
nodes, not just round-robin.
I’ve looked at third-party modules, but there doesn’t seem to be one
that does this.

Thank you,
Ruslan


Apologies, mail.ru sometimes mangles my posts, and unfortunately
there is no option to turn HTML off.

On 1 March 2012, 02:39, Ruslan D. wrote:

> What is the right way to set up true load balancing across a few Tomcat
> servers?
> By “true” I mean taking into account at least the CPU load of those
> nodes, not just round-robin.
> I’ve looked at third-party modules, but there doesn’t seem to be one
> that does this.

Hello,

You may want to try my Lua-based load balancer. It picks the least
loaded backend server based on the latest load averages, which it
collects by polling the backend servers using parallel subrequests.
It’s not optimized because I use it mainly for benchmarking, but if
you spend some time tuning the parameters, you should be able to get
better performance than with standard load balancing.

The following frontend server load balancer configuration requires
Agentzh’s Lua module:

#
# NGiNXYZ Load Balancer - Frontend Server
# Copyright (C) Max <nginxyz (at) users dot berlios dot de>
#

http {

# Set up a cache zone for storing backend server load averages.
proxy_cache_path /var/tmp/nginx/load_average_cache
                 levels=1
                 keys_zone=load_average_cache:64k inactive=5s;

server {

    location / {

        set $backend_server "";

        rewrite_by_lua '

            --[[ Only for debugging.
            local start_time = ngx.now()
            --]]

            --[[ Replace these entries with your backend servers. ]]
            local backend_servers = { "192.168.2.1:8080",
                                      "192.168.2.2:8080",
                                      "192.168.2.3:8080"  }

            local fallback_backend_server = "192.168.2.1:8080"

            local subrequests = {}

            for i, server in ipairs(backend_servers) do
                table.insert(subrequests,
                             { "/poll_load_average/"..server,
                               { method = ngx.HTTP_GET } })
            end

            local responses = { ngx.location.capture_multi(subrequests) }

            local min_load_average = 1024
            local min_index = 0
            for i, response in ipairs(responses) do

                local body_length = #response.body
                if (body_length > 0 and body_length < 8) then
                    local load_average = tonumber(response.body)

                    --[[ Only for debugging.
                    ngx.log(ngx.DEBUG, "backend server ",
                                       backend_servers[i],
                                       " load average: ", load_average)
                    --]]

                    if (load_average
                        and load_average < min_load_average) then

                        min_load_average = load_average
                        min_index = i
                    end
                end
            end

            if (min_index > 0) then
                ngx.var.backend_server = backend_servers[min_index]
            else
                ngx.var.backend_server = fallback_backend_server;
            end

            --[[ Only for debugging.
            ngx.log(ngx.DEBUG, "least loaded backend server ",
                               ngx.var.backend_server)

            local total_time = ngx.now() - start_time

            ngx.log(ngx.DEBUG,
                    "poll_load_average subrequest completion time: ",
                    total_time)
            --]]
        ';

        # The least loaded backend server has been determined and
        # assigned to the $backend_server variable inside the
        # rewrite_by_lua block.

        proxy_pass http://$backend_server;
    }


    location ~ ^/poll_load_average/(.+)$ {

        # This is the subrequest location block used by the
        # rewrite_by_lua block to poll backend servers for
        # the load average.

        internal;

        proxy_pass_request_headers off;
        proxy_pass_request_body    off;

        # Prevent subrequests from taking more time than we allow
        # them to. If a backend server is too slow to respond, it
        # will be skipped automatically.
        proxy_connect_timeout 30ms;
        proxy_send_timeout    20ms;
        proxy_read_timeout    60ms;

        # Do not cache load averages and do not read them from cache
        #proxy_no_cache     1;
        #proxy_cache_bypass 1;

        # Cache load averages and read them from cache
        proxy_cache     load_average_cache;
        # $1 = backend_server:port
        proxy_cache_key $1;

        # Poll the backend server for its latest load average.
        proxy_pass http://$1/load_average/;
    }
}

}

You should first replace the entries in the backend_servers array
and the fallback_backend_server string with your backend server
address:port values.

Then you may want to turn debugging on by appending "]]" to every
"--[[ Only for debugging." line, so that you can follow what the
Lua code does:

  1. A subrequest to the /poll_load_average/ location is
    created for each of the backend servers.

  2. All subrequests are issued in parallel, so the total latency
    equals the completion time of the slowest subrequest.
    ngx.location.capture_multi() waits for all the subrequests
    to complete and captures their responses (status, headers
    and body). To prevent a slow backend server from delaying
    the original request, the proxy_*_timeout directives in the
    /poll_load_average/ location cause any backend server that
    is too slow to respond in time to be skipped.

  3. Subrequest response bodies are validated (they must be
    non-empty, shorter than 8 characters, and parse as a number),
    and the backend server with the lowest load average is then
    assigned to the $backend_server nginx variable, which was
    created by the set directive before the rewrite_by_lua block.
    If none of the backend servers returns a valid load average,
    the backend you assigned to fallback_backend_server is used.

  4. The original request is passed on to the least loaded backend
    server by using the proxy_pass directive.
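
The selection rule from step 3 can also be tried outside nginx on sample data. This is only a minimal shell sketch, not part of the configuration above: the function name pick_backend and the "address:port load_average" input format are assumptions made for illustration, and the numeric check is stricter than Lua's tonumber() (plain decimal numbers only).

```shell
#!/bin/sh
# pick_backend FALLBACK   (hypothetical helper, for illustration only)
# Reads lines of the form "address:port load_average" on stdin and prints
# the address with the lowest valid load average, or FALLBACK if no line
# carries a valid value (for example because every poll timed out).
pick_backend() {
    awk -v fallback="$1" '
        # Accept only values that are 1 to 7 characters long and numeric.
        length($2) > 0 && length($2) < 8 && $2 ~ /^[0-9]+(\.[0-9]+)?$/ {
            # Strict comparison: on a tie the first server listed wins.
            if (!found || $2 + 0 < min) {
                min = $2 + 0
                best = $1
                found = 1
            }
        }
        END { print (found ? best : fallback) }
    '
}
```

For example, feeding it the three lines "192.168.2.1:8080 0.80", "192.168.2.2:8080 0.15" and "192.168.2.3:8080" (no value, as after a timeout) selects 192.168.2.2:8080.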

Note that caching can be used inside the /poll_load_average/
subrequest location block to reduce the overhead of constantly
polling all of the backend servers, but load averages should
typically not be cached for more than a few seconds. Cache
expiration time for load averages should be set individually
on each of the backend servers in order to optimize performance.

You should also experiment with the proxy_*_timeout values to
find optimal values for your setup.

The matching backend server configuration for this setup looks like
this:

#
# NGiNXYZ Load Balancer - Backend Server
# Copyright (C) Max <nginxyz (at) users dot berlios dot de>
#

http {

# Set up a limit request zone to prevent too frequent
# load average requests.
limit_req_zone $binary_remote_addr
               zone=load_average_limit_request_zone:256k rate=10r/s;

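# Extract the last minute load average from the /usr/bin/uptime
# output and assign it to $load_average.
#
# The first alarm call sets up a 2 second timer and the 2nd call
# cancels it. If uptime does not complete within 2 seconds, the
# ALRM handler dies, which aborts the eval and makes the subroutine
# return an empty string instead of holding up further processing.
#
# The pattern matches both "load average:" (Linux) and
# "load averages:" (BSD, Mac OS X).
perl_set $load_average '
    sub {
        my $load_average = eval {
            local $SIG{ALRM} = sub { die "timeout" };
            alarm 2;
            my ($value) = `/usr/bin/uptime` =~ /averages?: (\d+\.\d+)/;
            alarm 0;
            $value;
        };
        alarm 0;
        return defined $load_average ? $load_average : "";
    }
';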


server {

    location /load_average/ {

        # Prevent this load average from being cached for more
        # than 2 seconds by the frontend server.
        expires 2s;

        # Prevent too frequent load average requests.
        limit_req_log_level error;
        limit_req zone=load_average_limit_request_zone nodelay;

        # Return $load_average using Agentzh's Echo module
        echo $load_average;

        # Note that using "return 200 $load_average;" instead of
        # "echo $load_average;" to return $load_average makes
        # any limit_req directive useless inside this location
        # block: return is processed in the rewrite phase, before
        # limit_req is checked, so the 503 error triggered by
        # limit_req is never returned.
        #return 200 $load_average;
    }
}

}

The backend servers would require the following additional modules
for this configuration to work:

Embedded Perl module
http://wiki.nginx.org/EmbeddedPerlModule

Agentzh’s Echo module (optional - see the “return 200” comment above)

My comments are extensive, so there’s nothing to add, except that
this could be done using Apache and a CGI script that does something
like this to extract the load average from the uptime output:

uptime | awk -F': |,' '{print $5}'
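
The field position of the load average shifts with the uptime output format (for example when the machine has been up for less than a day), so matching on the label is more robust than counting fields. A hedged sketch: the function name first_load_average is made up for illustration, and it assumes the "load average:" (Linux) or "load averages:" (BSD) label and a "." decimal separator.

```shell
#!/bin/sh
# first_load_average   (hypothetical helper, for illustration only)
# Reads uptime output on stdin and prints the first (1 minute) load average.
# Prints nothing if the expected label is not found.
first_load_average() {
    sed -n 's/.*load averages\{0,1\}: *\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}
```

It would be used as "LC_ALL=C uptime | first_load_average" inside the CGI script.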

You could also set up a shell script based netcat server to answer
load average requests, but you’d have to set up additional firewall
rules to prevent too frequent requests from degrading performance.

You could also extend the load balancer to include other performance
criteria, such as iostat, netstat values etc.

The Stub Status module statistics could also be included:
http://wiki.nginx.org/HttpStubStatusModule
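
As a sketch of how the stub status data could be fed into the selection, the active connection count can be pulled out of the status page text. The function name active_connections is hypothetical, and the input is assumed to be in the usual stub status text format, which starts with an "Active connections: N" line.

```shell
#!/bin/sh
# active_connections   (hypothetical helper, for illustration only)
# Reads a stub status page on stdin and prints the active connection count.
active_connections() {
    awk '/^Active connections:/ { print $3; exit }'
}
```

With the status page exposed at some location of your choosing (the path below is an assumption), this could be run as "curl -s http://127.0.0.1/nginx_status | active_connections".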

Note that if you decide to cache load averages on the frontend
server, you should make sure that your backend servers always
set an appropriate expiration time in the response (by using
the “Cache-Control: max-age=$x” header, for example). Typically
it should be no longer than a few seconds, but that really
depends on your setup.
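
If the load average is served by a CGI script rather than by nginx, the script has to emit that header itself. A minimal sketch, assuming a plain shell CGI; the function name load_average_response and its two arguments are made up for illustration:

```shell
#!/bin/sh
# load_average_response LOAD MAX_AGE   (hypothetical helper)
# Prints a complete CGI response: headers, a blank line, then the body.
# MAX_AGE is the number of seconds the frontend may cache the value.
load_average_response() {
    printf 'Content-Type: text/plain\n'
    printf 'Cache-Control: max-age=%s\n' "$2"
    printf '\n'
    printf '%s\n' "$1"
}
```

Calling "load_average_response 0.15 2" produces a response that the frontend may cache for at most 2 seconds.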

I hope that helps. Any benchmarks and comments will be appreciated.

Max