But I still cannot understand why, in 2013 and with the latest version of
nginx, we would still need haproxy in front of it.
You don’t need it; it is just a matter of preference or needs / that is also why we don’t have a single webserver or database server software.
But to name a few advantages (at least for me) of why I would use (and am using) haproxy as a balancer - it has a more refined backend status/administrative page ( http://demo.1wt.eu/ (without the admin features)).
The nginx upstream module is lacking in this area, and for now (as far as I know) you can only get that info via logging.
You have detailed information on what’s up and what’s down / how many failures there have been. Also you can easily bring down any backend without the need to change configuration (in the case of nginx you would need to rewrite the config and do a reload).
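To make that concrete, here is a minimal sketch (addresses, ports and names are invented for illustration, not taken from this thread) - haproxy can take servers in and out from its stats page at runtime, while nginx’s stock upstream block needs a config edit plus reload:

    # haproxy: stats page with runtime enable/disable of servers
    listen stats
        bind *:8404
        stats enable
        stats uri /stats
        stats admin if TRUE    # allows taking servers up/down from the page

    # nginx: marking a peer down means editing the file and reloading
    upstream app {
        server 10.0.0.1:8080 max_fails=3 fail_timeout=30s;
        server 10.0.0.2:8080 down;    # hand-edited, then: nginx -s reload
    }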
Even Varnish… nginx can cache too.
As to varnish - I prefer the memory-mapped file instead of the nginx approach of creating a file for each cacheable object in the filesystem.
Hi,
actually in our setup we use NGINX for SSL termination in front of HAProxy.
HAProxy has some features that Nginx still doesn’t have, like backend max connections and a frontend queue. So you can do throttling to prevent high load on your backend servers and keep requests from clients waiting in front, so they don’t get HTTP 500.
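A minimal haproxy sketch of that throttling pattern (the numbers and names are made up for illustration) - per-server maxconn caps concurrent requests, and the excess waits in haproxy’s queue instead of failing:

    frontend fe_http
        bind *:80
        maxconn 10000
        default_backend be_app

    backend be_app
        timeout queue 30s    # how long a request may wait for a free slot
        server app1 10.0.0.1:8080 check maxconn 100
        server app2 10.0.0.2:8080 check maxconn 100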
Another feature is the splice system call, which makes HAProxy really fast with low system load.
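Since the setup described here is nginx terminating SSL in front of HAProxy, a minimal sketch of that front end might look like this (paths and ports are assumptions, not from the original post):

    server {
        listen 443 ssl;
        ssl_certificate     /etc/nginx/ssl/site.crt;
        ssl_certificate_key /etc/nginx/ssl/site.key;
        location / {
            proxy_pass http://127.0.0.1:8080;    # local haproxy frontend
        }
    }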
haproxy has a huge list of features for reverse proxying that nginx hasn’t; varnish has the same for caching.
If you can do everything with nginx, go for it. But for more complex scenarios, and if you really need the highest possible performance, you probably want to stick to what the particular product does best.
For example: haproxy does tcp splicing, which means the http payload never even touches user space - the kernel just does a zero copy. Are you able to forward 20Gbps with nginx on a single machine? I doubt that.
Why would you doubt that? Of course, my machines may be bigger than the norm…
Because nginx doesn’t do tcp splicing. Is my assumption wrong - are you able to forward 20Gbps with nginx? Then yes, you probably have huge hardware, which isn’t necessary with haproxy.
Hi,
I actually did some quite in-depth comparison of the splice() sys call (only available on Linux, btw) between nginx and haproxy, and even wrote a small standalone proxy server that uses it.
There was some improvement, but not on the scale that would make it a deciding factor.
The thing that makes the most difference to forwarding is your network card, and whether it supports LRO (large receive offload) - if you’re using a 10G lan card it probably has it; anything less probably doesn’t.
I’ve attached my results. The test was proxying a file a certain number of times, and I would log how much cpu time was used (ab -n 1000 -c 10 192.168.1.101:8001/10MB.zip)
RTL = onboard realtek (they are crap)
INTEL = intel 1000CT ($30 thing)
LIN = Linux (3.6.something)
BSD = FreeBSD 9.0
HA = Haproxy (latest 1.5 dev version at the time)
NGX = Nginx 1.3.something
PS = splice() proxy that I wrote
SPL/BUF/OFF = mode: either splice, buffer, or off/on (nginx proxy_buffering)
Afterwards I got some 10G cards to test, and it was faster (by probably 80-90%) in all tests.
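For reference, the nginx side of such a test would be a plain proxy block along these lines (a sketch - the exact config wasn’t posted, and the upstream address is an assumption):

    server {
        listen 8001;
        location / {
            proxy_buffering off;                 # the BUF vs OFF dimension above
            proxy_pass http://127.0.0.1:8080;    # wherever 10MB.zip is served from
        }
    }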
Just curious, are you referring to “splice-auto” or just “splice-response”?
I’d assume “splice-response” sort of disables response buffering, and it might be useful indeed if you’ve got fast clients and fast servers. I wonder what happens with slow clients/fast servers, though.
Also, to the best of my understanding, both the Linux kernel version and the network card present a lot of specifics with regard to how splice is used.
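For context, those are haproxy configuration options (1.5-dev era); a minimal sketch of enabling them:

    global
        maxpipes 1024             # pipes reserved for splice()

    defaults
        mode http
        option splice-auto        # let haproxy decide per connection
        # option splice-request   # splice only the request direction
        # option splice-response  # splice only the response direction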
Kernel, yes. The first splice was implemented in 2.6.17, but it was buggy, so it is not recommended to use it.
A reimplementation was done in 3.5, and since that version everything works fine.
I’m not sure how much it depends on the NIC. I assume there wouldn’t be much difference; more important is tcp offloading support.
There are backup servers, least_conn and other fancy things. Isn’t it as efficient as Haproxy (open question)?
The simple fact that you are not actually (externally) able to tell if/how many backends are down should answer your question.
You also have to use third-party modules for active health checks - the default upstream considers a backend down only after a (configured) number of actual client requests have failed - both varnish and haproxy allow you to avoid this by having such functionality in the core.
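To illustrate the difference (a sketch with invented addresses) - nginx’s stock passive checks only react to real client requests failing, while haproxy probes the backend on its own:

    # nginx: passive - a server is marked down only after live requests fail
    upstream app {
        server 10.0.0.1:8080 max_fails=3 fail_timeout=30s;
    }

    # haproxy: active - probes out-of-band, independent of client traffic
    backend be_app
        option httpchk GET /health
        server app1 10.0.0.1:8080 check inter 2s fall 3 rise 2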
As for varnish: if you are on a static html page, then it is your browser cache that takes over. If it is semi-static, chances are that you don’t reuse the same part several times among different users, due to personalization. And if you can split this sub-part to serve something general enough, then in the time it takes to call varnish to serve it, nginx alone would already be halfway to serving the file.
You cover only a part of “caching”.
Besides parts of html - where, in my opinion, using nginx with SSI is somewhat more complicated (due to single location/if limitations) than the varnish ESI implementation, though you can probably work around that using agentzh’s openresty module - varnish can also just work as an accelerator for static content.
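For the SSI side, a minimal nginx sketch (paths are illustrative); the varnish ESI equivalent is essentially a do_esi switch in VCL:

    location / {
        ssi on;                              # process <!--#include virtual="..." -->
        proxy_pass http://127.0.0.1:8080;
    }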
While of course nginx can do the same (act as a static accelerator), again you have to use a third-party module for cache invalidation (not saying that’s a bad thing).
Also, the cache residing 1:1 on the filesystem makes it problematic in setups where you have a lot of cacheable objects. At least in my case the nginx cache manager process took way too many resources / too much io when traversing a directory tree with a few million files, versus storing them all in a single mmapped file.
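For reference, this is the nginx layout being described - one file per cached object under a hashed directory tree that the cache manager walks periodically (the values are illustrative):

    proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=static:50m
                     max_size=10g inactive=7d;
    # each object becomes its own file, e.g.
    # /var/cache/nginx/c/29/b7f54b2df7773722d382f4809d65029c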
Here is my scenario: I use nginx for just about everything I have to deal with.
Good for you, but what is the goal of your mail?
Don’t get me wrong, nginx is stellar software and one of the best webservers, but that doesn’t mean it needs to do everything or be stuck everywhere, even if the active community and the increasing amount of modules (would) allow that.
There are backup servers, least_conn and other fancy things. Isn’t it as efficient as Haproxy (open question)?
I read carefully, maybe not enough, what you all said, but I just cannot understand how it comes that nginx cannot perform as well as haproxy at serving a lot of connections.
Tcp splicing is not really usable for everyone running on stable Debian 6.
Here is my scenario: I use nginx for just about everything I have to deal with.
If I don’t want php, I use lua for simple things or tough rewriting.
I use nginx as a routing engine on another server, and I still use it to serve static files on my private cdn. It doesn’t do round robin but least_conn, to share the load evenly. My sessions are accessed through a database backend with memcached activated.
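A sketch of the kind of config described (upstream addresses are invented, and the lua part assumes the third-party lua-nginx-module / openresty):

    upstream cdn {
        least_conn;                # share the load evenly instead of round robin
        server 10.0.0.1:8080;
        server 10.0.0.2:8080;
    }

    server {
        listen 80;
        location /static/ {
            proxy_pass http://cdn;
        }
        location /ping {
            content_by_lua 'ngx.say("pong")';    # lua for simple things
        }
    }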
This setup is soooooo simple and easy to maintain!
So far so good; really easy to set up, and scripts know where to search/replace.
But I don’t want to miss anything.
As for varnish: if you are on a static html page, then it is your browser cache that takes over. If it is semi-static, chances are that you don’t reuse the same part several times among different users, due to personalization. And if you can split this sub-part to serve something general enough, then in the time it takes to call varnish to serve it, nginx alone would already be halfway to serving the file.
If in this scenario Haproxy performs significantly better, then I am thirsting for knowledge.
Yes and no; persistent cache is marked as experimental, and actually we are testing Apache Traffic Server as a cache server.
As I said before, nginx is a great http server and a good proxy, but haproxy has more features.
I hope that nginx will become as good as haproxy in proxy mode. But at this time it is slower and has fewer features, so we used it for SSL termination.
We never really use nginx in straight proxy mode - we always have some munging or something to do to the request or response, along with caching, etc. So we’d wind up using nginx (or varnish) along with haproxy anyway, and that’s just an unneeded layer for us right now. Apache TrafficServer looks interesting for similar use cases.
We get great performance from nginx for our use cases. We continually test other technologies, but haven’t found a reason to switch or augment it right now. In 6 months that may change, of course.
For a straight-up http proxy, I’d agree that haproxy is probably a better fit. Once you start needing to edit headers or bodies in a programmatic fashion, I’d look at something else.
Yes, but only for the fastcgi cache, so the file count isn’t big enough to make an impact. I’ll try the static cache again with the current version and see how it works out now.
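(For reference, a sketch of the fastcgi-cache variant of the same directive; the values are invented:)

    fastcgi_cache_path /var/cache/nginx/fcgi levels=1:2
                       keys_zone=fcgi:32m max_size=1g inactive=1d;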
BTW, do you use Varnish persistent cache?
No, just a huge mmapped file …
Since the instances get restarted very rarely (most now have over a year of uptime), the result is basically the same, without the persistent storage’s bad side effects/bugs.
How much larger is the cache’s size than the host’s physical memory?
Did you previously try the nginx cache on SSD as well, or on a usual hard disk?
Tbh I don’t remember, as it was a while ago (on 0.7.x); it might have been a regular SAS system instead (which of course is not as speedy as ssd, and not objective to compare).
But as I said, I’ll test the current version and see how it goes.