But I still cannot understand why, in 2013 and with the latest version of
nginx, we would still need haproxy in front of it.
You don’t need it; it is just a matter of preference or needs / that is also why we don’t have a single webserver or database server software.
But to name a few advantages (at least for me) of why I would use (and am using) haproxy as a balancer - it has a more refined backend status/administrative page ( http://demo.1wt.eu/ (without the admin features)).
The nginx upstream module is lacking in this area, and for now (as far as I know) you can only get that info via logging.
You have detailed information on what’s up and what’s down / how many failures there have been. Also you can easily bring down any backend without the need to change configuration (in the case of nginx you would need to rewrite the config and do a reload).
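To make that concrete, here is a minimal sketch (addresses, ports and names are invented for illustration, not taken from this thread) - haproxy can take servers in and out from its stats page at runtime, while nginx’s stock upstream block needs a config edit plus reload:

    # haproxy: stats page with runtime enable/disable of servers
    listen stats
        bind *:8404
        stats enable
        stats uri /stats
        stats admin if TRUE    # allows taking servers up/down from the page

    # nginx: marking a peer down means editing the file and reloading
    upstream app {
        server 10.0.0.1:8080 max_fails=3 fail_timeout=30s;
        server 10.0.0.2:8080 down;    # hand-edited, then: nginx -s reload
    }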
Even Varnish… nginx can cache too.
As to varnish - I prefer the memory-mapped file instead of the nginx approach of creating a file for each cacheable object in the filesystem.
Hi,
actually in our setup we use NGINX for SSL termination in front of HAProxy.
HAProxy has some features that Nginx still doesn’t have, like backend max connections and a frontend queue. So you can do throttling to prevent high load on your backend servers and keep requests from clients waiting in front, so they don’t get HTTP 500.
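A minimal haproxy sketch of that throttling pattern (the numbers and names are made up for illustration) - per-server maxconn caps concurrent requests, and the excess waits in haproxy’s queue instead of failing:

    frontend fe_http
        bind *:80
        maxconn 10000
        default_backend be_app

    backend be_app
        timeout queue 30s    # how long a request may wait for a free slot
        server app1 10.0.0.1:8080 check maxconn 100
        server app2 10.0.0.2:8080 check maxconn 100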
Another feature is the splice system call, which makes HAProxy really fast with low system load.
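Since the setup described here is nginx terminating SSL in front of HAProxy, a minimal sketch of that front end might look like this (paths and ports are assumptions, not from the original post):

    server {
        listen 443 ssl;
        ssl_certificate     /etc/nginx/ssl/site.crt;
        ssl_certificate_key /etc/nginx/ssl/site.key;
        location / {
            proxy_pass http://127.0.0.1:8080;    # local haproxy frontend
        }
    }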
haproxy has a huge list of features for reverse proxying that nginx hasn’t; varnish has the same for caching.
If you can do everything with nginx, go for it. But for more complex scenarios, and if you really need the highest possible performance, you probably want to stick to what the particular product does best.
For example: haproxy does tcp splicing, which means the http payload never even touches user space - the kernel just does a zero copy. Are you able to forward 20Gbps with nginx on a single machine? I doubt that.
Why would you doubt that? Of course, my machines may be bigger than the norm…
Because nginx doesn’t do tcp splicing. Is my assumption wrong - are you able to forward 20Gbps with nginx? Then yes, you probably have huge hardware, which isn’t necessary with haproxy.
Hi,
I actually did some quite in-depth comparison of the splice() sys call (only available on Linux, btw) between nginx and haproxy, and even wrote a small standalone proxy server that uses it.
There was some improvement, but not on the scale that would make it a deciding factor.
The thing that makes the most difference to forwarding is your network card, and whether it supports LRO (large receive offload) - if you’re using a 10G lan card it probably has it; anything less probably doesn’t.
I’ve attached my results. The test was proxying a file a certain number of times, and I would log how much cpu time was used (ab -n 1000 -c 10 192.168.1.101:8001/10MB.zip)
RTL = onboard realtek (they are crap)
INTEL = intel 1000CT ($30 thing)
LIN = Linux (3.6.something)
BSD = FreeBSD 9.0
HA = Haproxy (latest 1.5 dev version at the time)
NGX = Nginx 1.3.something
PS = splice() proxy that I wrote
SPL/BUF/OFF = mode: either splice, buffer, or off/on (nginx proxy_buffering)
Afterwards I got some 10G cards to test, and it was faster (by probably 80-90%) in all tests.
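For reference, the nginx side of such a test would be a plain proxy block along these lines (a sketch - the exact config wasn’t posted, and the upstream address is an assumption):

    server {
        listen 8001;
        location / {
            proxy_buffering off;                 # the BUF vs OFF dimension above
            proxy_pass http://127.0.0.1:8080;    # wherever 10MB.zip is served from
        }
    }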
Just curious, are you referring to “splice-auto” or just “splice-response”?
I’d assume “splice-response” sort of disables response buffering, and it might be useful indeed if you’ve got fast clients and fast servers. I wonder what happens with slow clients/fast servers, though.
Also, to the best of my understanding, both the Linux kernel version and the network card present a lot of specifics with regard to how splice is used.
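For context, those are haproxy configuration options (1.5-dev era); a minimal sketch of enabling them:

    global
        maxpipes 1024             # pipes reserved for splice()

    defaults
        mode http
        option splice-auto        # let haproxy decide per connection
        # option splice-request   # splice only the request direction
        # option splice-response  # splice only the response direction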
Kernel, yes. The first splice was implemented in 2.6.17, but it was buggy, so it is not recommended to use it.
A reimplementation was done in 3.5, and since that version everything works fine.
I’m not sure how much it depends on the NIC. I assume there wouldn’t be much difference; more important is tcp offloading support.
There are backup servers, least_conn and other fancy things. Isn’t it as efficient as Haproxy (open question)?
The simple fact that you are not actually (externally) able to tell if/how many backends are down should answer your question.
You also have to use third-party modules for active health checks - the default upstream considers a backend down only after a (configured) number of actual client requests have failed - both varnish and haproxy allow you to avoid this by having such functionality in the core.
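To illustrate the difference (a sketch with invented addresses) - nginx’s stock passive checks only react to real client requests failing, while haproxy probes the backend on its own:

    # nginx: passive - a server is marked down only after live requests fail
    upstream app {
        server 10.0.0.1:8080 max_fails=3 fail_timeout=30s;
    }

    # haproxy: active - probes out-of-band, independent of client traffic
    backend be_app
        option httpchk GET /health
        server app1 10.0.0.1:8080 check inter 2s fall 3 rise 2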
As for varnish: if you are on a static html page, then it is your browser cache that takes over. If it is semi-static, chances are that you don’t reuse the same part several times among different users, due to personalization. And if you can split this sub-part to serve something general enough, then in the time it takes to call varnish to serve it, nginx alone would already be halfway to serving the file.
You cover only a part of “caching”.
Besides parts of html - where, in my opinion, using nginx with SSI is somewhat more complicated (due to single location/if limitations) than the varnish ESI implementation, though you can probably work around that using agentzh’s openresty module - varnish can also just work as an accelerator for static content.
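For the SSI side, a minimal nginx sketch (paths are illustrative); the varnish ESI equivalent is essentially a do_esi switch in VCL:

    location / {
        ssi on;                              # process <!--#include virtual="..." -->
        proxy_pass http://127.0.0.1:8080;
    }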
While of course nginx can do the same (act as a static accelerator), again you have to use a third-party module for cache invalidation (not saying that’s a bad thing).
Also, the cache residing 1:1 on the filesystem makes it problematic in setups where you have a lot of cacheable objects. At least in my case the nginx cache manager process took way too many resources / too much io when traversing a directory tree with a few million files, versus storing them all in a single mmapped file.
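For reference, this is the nginx layout being described - one file per cached object under a hashed directory tree that the cache manager walks periodically (the values are illustrative):

    proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=static:50m
                     max_size=10g inactive=7d;
    # each object becomes its own file, e.g.
    # /var/cache/nginx/c/29/b7f54b2df7773722d382f4809d65029c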
Here is my scenario: I use nginx for just about everything I have to deal with.
Good for you, but what is the goal of your mail?
Don’t get me wrong, nginx is stellar software and one of the best webservers, but that doesn’t mean it needs to do everything or be stuck everywhere, even if the active community and the increasing amount of modules (would) allow that.
There are backup servers, least_conn and other fancy things. Isn’t it as efficient as Haproxy (open question)?
I read carefully, maybe not enough, what you all said, but I just cannot understand how it comes that nginx cannot perform as well as haproxy at serving a lot of connections.
Tcp splicing is not really usable for everyone running on stable Debian 6.
Here is my scenario: I use nginx for just about everything I have to deal with.
If I don’t want php, I use lua for simple things or tough rewriting.
I use nginx as a routing engine on another server, and I still use it to serve static files on my private cdn. It doesn’t do round robin but least_conn, to share the load evenly. My sessions are accessed through a database backend with memcached activated.
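A sketch of the kind of config described (upstream addresses are invented, and the lua part assumes the third-party lua-nginx-module / openresty):

    upstream cdn {
        least_conn;                # share the load evenly instead of round robin
        server 10.0.0.1:8080;
        server 10.0.0.2:8080;
    }

    server {
        listen 80;
        location /static/ {
            proxy_pass http://cdn;
        }
        location /ping {
            content_by_lua 'ngx.say("pong")';    # lua for simple things
        }
    }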
This setup is soooooo simple and easy to maintain!
So far so good; really easy to set up, and scripts know where to search/replace.
But I don’t want to miss anything.
As for varnish: if you are on a static html page, then it is your browser cache that takes over. If it is semi-static, chances are that you don’t reuse the same part several times among different users, due to personalization. And if you can split this sub-part to serve something general enough, then in the time it takes to call varnish to serve it, nginx alone would already be halfway to serving the file.
If in this scenario Haproxy performs significantly better, then I am thirsting for knowledge.
Yes and no; persistent cache is marked as experimental, and actually we are testing Apache Traffic Server as a cache server.
As I said before, nginx is a great http server and a good proxy, but haproxy has more features.
I hope that nginx will become as good as haproxy in proxy mode. But at this time it is slower and has fewer features, so we used it for SSL termination.
We never really use nginx in straight proxy mode - we always have some munging or something to do to the request or response, along with caching, etc. So we’d wind up using nginx (or varnish) along with haproxy anyway, and that’s just an unneeded layer for us right now. Apache TrafficServer looks interesting for similar use cases.
We get great performance from nginx for our use cases. We continually test other technologies, but haven’t found a reason to switch or augment it right now. In 6 months that may change, of course.
For a straight-up http proxy, I’d agree that haproxy is probably a better fit. Once you start needing to edit headers or bodies in a programmatic fashion, I’d look at something else.
Yes, but only for the fastcgi cache, so the file count isn’t big enough to make an impact. I’ll try the static cache again with the current version and see how it works out now.
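(For reference, a sketch of the fastcgi-cache variant of the same directive; the values are invented:)

    fastcgi_cache_path /var/cache/nginx/fcgi levels=1:2
                       keys_zone=fcgi:32m max_size=1g inactive=1d;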
BTW, do you use Varnish persistent cache?
No, just a huge mmapped file …
Since the instances get restarted very rarely (most now have over a year of uptime), the result is basically the same, without the persistent storage’s bad side effects/bugs.
How much larger is the cache’s size than the host’s physical memory?
Did you previously try the nginx cache on SSD as well, or on a usual hard disk?
Tbh I don’t remember, as it was a while ago (on 0.7.x); it might have been a regular SAS system instead (which of course is not as speedy as ssd, and not objective to compare).
But as I said, I’ll test the current version and see how it goes.