I'm working on a project where it's critical to minimize the possibility of a single point of failure, and where there will be quite high traffic. Currently in another version of the system we're using nginx as a remote proxy server for Tomcat, but the current plan is to use a hardware load balancer in front of a Tomcat cluster (or a cluster of nginx+Tomcat instances). I'm wondering, though, given the extraordinary performance and reliability of nginx, whether we might be able to omit the hardware load-balancer and use instead a couple of dedicated minimal nginx servers with failover between them. If anyone has gone down this path and has some good ideas and/or useful experience, I'd be keen to hear from them.
on 2009-09-15 16:53
on 2009-09-15 17:31
Not sure if this is possible (as I haven't tried it), but what about building nginx on Damn Small Linux and booting from a CD, ramdisk, or even flash? You could literally take something like a PowerEdge 1425 or so and have a kicking minimalistic LB appliance running on nginx. Technically, if you were so inclined, you could even write DSL and nginx to a PROM chip so it's 100% automated. I'm betting that if nginx does everything you need, it would be a lot cheaper than the normal hardware route, with the same if not better stability.

Personally, what I would do (assuming you have ESX) is run 2 VMs, both running nginx on dedicated NICs. Then on your switching, set up active/active failover to those NICs (and have the VMs on separate ESX hosts). You would then have a fully redundant LB system, so if nginx on one node crashes, the failover would route all traffic to the other LB. This might require a little coding in nginx to cause an OOPS on errors, so that the node would reboot.

Just my thoughts.

David
on 2009-09-15 22:10
David Murphy wrote:
> Not sure if this is possible (as I haven't tried it) but what about
> building nginx on Damn Small Linux and having a boot cd or ramdisk, or
> even boot flash. You could literally take something like a PowerEdge
> 1425 or so and have a kicking minimalistic LB hardware running on nginx.

Yes, that's a good idea. Is DSL the best distro for such things?

> Personally what I would do is (assuming you have ESX), run 2 VMs both
> running nginx on dedicated NICs. Then on your switching set up an
> active/active failover to those NICs (and have the VMs on separate ESX
> hosts).
> You would then have a fully redundant LB system so if nginx on one node
> crashes the failover would route all traffic to the other LB.

I hadn't thought to explore the possibility of virtualization. That's given me some good food for thought, thanks.

J
on 2009-09-15 22:11
> David Murphy wrote:
>> Not sure if this is possible (as I haven't tried it) but what about
>> building nginx on Damn Small Linux and having a boot cd or ramdisk, or
>> even boot flash. You could literally take something like a PowerEdge
>> 1425 or so and have a kicking minimalistic LB hardware running on nginx.
>
> Yes, that's a good idea. Is DSL the best distro for such things?

Well, not necessarily, but it does have the smallest footprint, thus needing less on-chip memory and lowering the cost of creating such an appliance. Heck, technically speaking you could get a WRT-54G, turn off the wireless after installing DD-WRT ( DSL variant), and compile nginx into it. It would be an interesting proof of concept for sure.

>> Personally what I would do is (assuming you have ESX), run 2 VMs
>> both running nginx on dedicated NICs. Then on your switching set up
>> an active/active failover to those NICs (and have the VMs on
>> separate ESX hosts).
>> You would then have a fully redundant LB system so if nginx on one node
>> crashes the failover would route all traffic to the other LB.
>
> I hadn't thought to explore the possibility of virtualization. That's given me some good food for thought, thanks.
on 2009-09-15 22:33
On Tuesday, September 15, 2009 at 18:19:38, David Murphy wrote:

DM> Not sure if this is possible ( as I haven't tried it)
DM> but what about building nginx on Damn Small Linux and having
DM> a boot cd or ramdisk, or even boot flash. You could literally take
DM> something like a PowerEdge 1425 or so and have a kicking minimalistic
DM> LB hardware running on nginx.

DSL is a desktop OS: a linux distro for i486 with a 2.4.x linux kernel, optimized for minimal RAM usage and old computers. no linux 2.6.x kernel means no "epoll" at all. therefore DSL is totally useless as the base OS for a high traffic load balancer.

DM> Technically if you were so inclined, you could even write DSL and nginx
DM> to a prom chip so its 100% automated, I'm better if nginx does everything
DM> you need it would be a lot cheaper than the hardware normal route with the
DM> same if not better stability.

the question was not about the cheapest "solution", but about a "high traffic LB".

DM> Personally what I would do is (assuming you have ESX), run 2 VM's both
DM> running nginx on dedicated NICs. Then one your switching set up an
DM> active/active fail over to those nice ( and have the VM's on separate ESX
DM> hosts).
DM> You would then have a fully redundant LB system so if nginx on one node
DM> crashes the fail over would route all traffic to the other LB.

if, for example, the mainboard of the esx server with these VMs crashes, both VMs go down. so this is not "a fully redundant LB system". the hardware of the ESX server is a "single point of failure".
on 2009-09-15 22:57
We are currently running Nginx as a front end LB to a single PHP app server. When we need a bit more horsepower, I can divert traffic to the LB itself to process PHP requests with a minor change in the config. We are serving close to 2 million page views a day and growing.

The Nginx box is a dual-CPU, dual-core AMD system, 32 bit with 8GB of RAM. We are also using it for delivery of static assets (everything except PHP). The CPU utilization on the box at peak is below 3%, with nginx processing 3000+ connections at any given time. I also configured the PHP app server as an Nginx server, and it serves as a backup in case the primary nginx server fails.

I'm very happy with the configuration. Adding another PHP app server will be a breeze, although session management is going to be difficult.
on 2009-09-15 23:22
Gena,

Regarding ESX you are completely wrong. As I mentioned, each VM would be on its own host, which means the entire cluster would have to fail for the LB not to switch over.

Also, DSL has been ported to 2.6. To be 100% accurate, DSL = 2.4 while DSL-N = 2.6: http://www.damnsmalllinux.org/dsl-n/

I was mentioning the cost difference between building your own hardware-based solution, using an nginx setup as your LB, and paying for hardware, to show how he could use nginx in an appliance manner. This would yield a better ROI and allow for more failover.

Please read more closely before making assumptions about a single point of failure (SPF), as you could very easily give the wrong impression to someone who is new to the virtualization space. You did not include the full quote. What I had actually said was: "Personally what I would do is (assuming you have ESX), run 2 VM's both running nginx on dedicated NICs. Then one your switching set up an active/active fail over to those nice ( and have the VM's on separate ESX hosts)."

David
on 2009-09-15 23:35
Here is our current setup:

2 ESX hosts
4 quad-core Xeon 55xx series chips
1 SAN with dual controllers and power supplies for each controller, routed to separate switches which interconnect to the ESX hosts

In the LB VMs we use OCFS2 to create a shared session folder that the Apache 2 / PHP 5.2 backends read from. The LBs only do load balancing, not PHP serving (I found Apache using a PHP module worked better for that).

Also, in our LB configuration, if a request to an upstream fails it will be tried again via a RW rule, allowing the bad upstream to be removed and the pool to reprocess the request. Thus an end user will never get a proxy error page.

Does your kernel use PAE? If not, you need to move to PAE or a 64 bit kernel to really utilize your RAM.

I would love to go to a true nginx environment, if only nginx could just make PHP a module and not do spawning of PHP.

Hope this helps some.

BTW, we currently use JeOS 8.04 for this task, but I'm moving to CentOS for better support from Dell and VMware (CentOS is on the approved distro list for both of them).

David

From: owner-nginx@sysoev.ru [mailto:owner-nginx@sysoev.ru] On Behalf Of Ilan Berkner
Sent: Tuesday, September 15, 2009 3:42 PM
To: nginx@sysoev.ru
Subject: Re: Viability of nginx instead of hardware load balancer?

> We are currently running Nginx as a front end LB to a single PHP App Server. [...]

On Tue, Sep 15, 2009 at 4:16 PM, Gena Makhomed <gmm@csdoc.com> wrote:

> On Tuesday, September 15, 2009 at 18:19:38, David Murphy wrote: [...]
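The retry behaviour David describes (a failed upstream request is tried again against the rest of the pool, so the end user never sees a proxy error page) can be sketched in a few lines. This is only an illustration in Python, not his actual rewrite-rule configuration: `proxy_with_retry`, the `send` callback and the upstream names are all hypothetical. In nginx itself, the `proxy_next_upstream` directive covers similar ground.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def proxy_with_retry(
    request: str,
    pool: Sequence[str],
    send: Callable[[str, str], str],
) -> Tuple[Optional[str], List[str]]:
    """Try each upstream in order until one answers.

    Returns (response, failed), where failed lists the upstreams that
    raised ConnectionError and should be dropped from the pool.
    """
    failed: List[str] = []
    for upstream in pool:
        try:
            return send(upstream, request), failed
        except ConnectionError:
            # Remember the bad upstream and fall through to the next one.
            failed.append(upstream)
    # Every upstream failed; only now would the client see an error page.
    return None, failed
```

One caveat with any scheme like this: replaying a request that is not idempotent (a form POST, say) can apply it twice if the first upstream failed after doing part of the work.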
on 2009-09-15 23:35
On Tue, 2009-09-15 at 15:41 +0100, John Moore wrote:
> I'm working on a project where it's critical to minimize the possibility
> of a single point of failure, and where there will be quite high
> traffic. Currently in another version of the system we're using nginx as
> a remote proxy server for Tomcat, but the current plan is to use a
> hardware load balancer in front

...which is a single point of failure too. And the "hardware" in "hardware load balancer" isn't as "hardware" as people tend to think. It's much closer to "common hardware, without redundant power supplies, without RAID, with the base system on a flash card and with an SSL accelerator card". You can make one yourself at 1/10th of the price of the "hardware" one, maybe 1/5th if you purchase the same SSL accelerator card, which is supported by the OS you would use to make your own hardware load balancer.

> of a Tomcat cluster (or a cluster of
> nginx+Tomcat instances).
> I'm wondering, though, given the extraordinary
> performance and reliability of nginx, whether we might be able to omit
> the hardware load-balancer and use instead a couple of dedicated minimal
> nginx servers

If you really want only load balancing (no proxying/caching/SSL acceleration) you can make damn fast and easy lvl2 load balancing using the BSD packet filter pf and relayd (with backend monitoring); if you want a failover setup you can also have it with pf's pfsync and carp (a VRRP-like protocol). You can also give up on "hardware firewalls" in front of it and use the same pf to protect the whole environment behind it. Put that on the frontends and put nginx/tomcat on the backends as you planned.
on 2009-09-16 10:19
On Wednesday, September 16, 2009 at 0:09:50, David Murphy wrote:

DM> Regarding ESX you are completely wrong, as I mentioned each VM
DM> would be on their own HOST which means a the entire cluster
DM> would have to fail to cause the LB to not switch over.

yes, this is my mistake, sorry.

DM> Also DSL has been ported to 2.6 also.
DM> To be 100% accurate DSL = 2.4 while DSL-N = 2.6
DM> http://www.damnsmalllinux.org/dsl-n/

are you joking? or do you seriously think that "DSL-N version 0.1 RC3" is now ready for production use as a load balancer base OS?

DM> I was mentioning the cost difference in regards to building your own
DM> hardware based solution using a nginx setup as your LB versus paying for
DM> hardware. To show how he could use nginx in an appliance manner. This
DM> would yield a better ROI and allow for more fail over.

building such a hardware based solution and supporting it continuously can be "a lot cheaper" only if your time and work cost nothing.

also, what about the requested "no single point of failure" in this case?
on 2009-09-16 16:14
Gena,

DSL was only an example of one distro (good for testing to prove a concept). Also, for no SPF, you would do the same thing I suggested with VMs but with physical 1U boxes, where your network could provide the failover, and you would simply have 2 very cheap nginx-based load balancers, so failover is quick if one node has an issue.

The question was about viability, not application, which is why I was outlining the different methodologies to accomplish this.

Regarding cost, that all depends on your setup. You could have multiple failing-over hardware nodes that pull a config from a repo somewhere on boot, and have an internal system to reconfigure the conf and commit it to the repo with changes. (If using SVN, you could then have a post-commit hook to tell nginx to get the new file and then reload.) So you could actually keep your admin time very low, as low as if not lower than learning and administrating a purchased hardware system.

It's the age-old rule: plan for growth and you can feed an army; fail to plan and you starve the army.

David
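As a rough sketch of the "pull the new config, then reload" step David mentions, here is what could run on each load balancer node once a commit has landed. The paths and function names are hypothetical, and how the SVN post-commit hook notifies the nodes (ssh, a small HTTP endpoint, cron polling) is left open. The commands themselves are the standard `svn update`, `nginx -t` (test the configuration) and `nginx -s reload`.

```python
import subprocess
from typing import Callable, List, Sequence

# Hypothetical locations; adjust for the real working copy and binary.
CONF_DIR = "/etc/nginx"
NGINX = "nginx"


def reload_commands(conf_dir: str = CONF_DIR, nginx: str = NGINX) -> List[List[str]]:
    """Commands to run, in order, after a config commit."""
    return [
        ["svn", "update", conf_dir],  # fetch the committed config
        [nginx, "-t"],                # refuse to continue if it does not parse
        [nginx, "-s", "reload"],      # graceful reload of the workers
    ]


def deploy(run: Callable[[Sequence[str]], int] = subprocess.call) -> bool:
    """Run each step; stop at the first non-zero exit status."""
    for cmd in reload_commands():
        if run(cmd) != 0:
            return False
    return True
```

Testing the config before reloading means a bad commit leaves the running nginx untouched on every node instead of taking both load balancers down at once.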
on 2009-09-16 22:45
On Wednesday, September 16, 2009 at 17:03:58, David Murphy wrote:
DM> DSL was only an example of one distro
DM> ( good for testing to prove a concept).
isn't this just a waste of time?
build and test on a legacy 2.4.x kernel, and after doing that,
build and test again on a 2.6.x kernel before production use.
making a production high traffic load-balancer
on a legacy 2.4.x kernel is not a good idea.
DM> Also for no SPF, you would do the same thing I suggested with VM but with
DM> physical 1U boxes, were your network could provide the fail over, and you
DM> would simply have 2 very cheap nginx based load balancer
DM> so quick fail over if one node had an issue.
if you use "very cheap hardware" for a "high traffic load-balancer",
very cheap hardware may have low reliability and very low performance.
absence of failures and absence of lags/overloads
IMHO has higher priority than a low hardware price,
especially if in the future the load balancers
come under very high load during DDoS attacks.
===================================================================
On Tuesday, September 15, 2009 at 23:01:04, David Murphy wrote:
>>Yes, that's a good idea. Is DSL the best distro for such things?
DM> Well not necessarily but it does have the smallest foot print, thus needing
DM> less on chip memory and lowering cost of creating such an appliance, heck
DM> technically speaking you could get a WRT-54G, turn off the wireless after
DM> installing DD-WRT ( DSL variant), and compile nginx into it. Would be an
DM> interesting proof of concept for sure.
===================================================================
WRT-54G has a slow CPU and little RAM, and it is a bad candidate for load balancer
hardware.
recommending DSL (or DSL-N) and WRT-54G as the software and hardware
for a high traffic load-balancer is useless and harmful, IMHO.
but using modern hardware and a modern OS for the frontends
at least allows using them for nginx caching as well, in order to reduce
backend load and the number of backends, and so on...
on 2009-09-17 07:17
On Sep 15, 2009, at 9:41 AM, John Moore wrote:
> ideas and/or useful experience, I'd be keen to hear from them.
We are using nginx as a reverse proxy (load balancer) serving tens of
thousands of requests per second across various large sites
(WordPress.com, Gravatar.com, etc). We deploy our nginx reverse
proxies in active-active pairs using Wackamole and Spread to control
the floating IPs for high availability. Our busiest load balancers
(req/sec) are serving about 7000 req/sec and the most traffic per
machine is in the 600Mbit/sec range. We could push each machine more,
they aren't maxed out, but we like to leave some room for growth, DoS
attacks, hardware/network failures, etc. The bottleneck for us seems
to be the large number of software interrupts on the network
interfaces, which cause the boxes to become CPU bound at some point. I am
not sure how to reduce this, it seems like a necessary evil of running
something like this in user space. I have wanted to try FreeBSD 7 to
see if it performs better in this area, but haven't had a chance yet
(we are running Debian Lenny mostly).
We are using "cheap" commodity hardware.
2 x Quad-core AMD or Intel CPUs
2-4GB of RAM
Single SATA drive
2 x 1000Mbit NICs
Since it is so easy to deploy more servers, it's super easy to scale,
and this configuration has been ultra-reliable for us. Most of the
failures we have had are from human error.
Hope this helps,
Barry
on 2009-09-17 13:01
Barry Abrahamson wrote:
>> be able to omit the hardware load-balancer and use instead a couple
> machine is in the 600Mbit/sec range. We could push each machine more,

It certainly does, thanks! Could I trouble you to explain a little more about your use of Wackamole and Spread? I've not used either of them before. Also, is there any reason why a hosting company would have problems with such a setup (i.e., this won't be running on our hardware on our premises, but we have full control of the Linux servers)?
on 2009-09-17 16:14
If your load balancer is not doing anything but being a load balancer, really you only need high-quality network devices, a minimalistic kernel (to prevent security holes) and a RAM-based OS or fast HD. I would agree you need better hardware if you are doing more, but for a pure LB, hardware requirements are not as strict as you are implying.

Furthermore, you need to get past this concept of "failures", because if you configured things properly you would have a hot spare device to prevent any such lag. I find that buying a single piece of hardware vs building out a redundant infrastructure a) costs more money and b) actually has a higher chance of failure due to a Single Point of Failure.

Also, you have the ability via networking to run a truly balanced share-out of the load, so you could have 3 LBs all getting 1/3 of all requests and hitting the same backends. Then if one drops off, the switch just downs the port and 1/2 goes to each of the remaining LB nodes.

Your belief that you must buy hardware is just a waste of capital investment, when you can build it yourself for much cheaper with the same or better hardware than buying something from a vendor.
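The arithmetic behind the "1/3 each, then 1/2 each" share-out is worth making explicit, because it also says how much headroom each node needs. A small sketch, with hypothetical node names and capacity figures:

```python
from fractions import Fraction
from typing import Dict, Iterable


def traffic_shares(live_nodes: Iterable[str]) -> Dict[str, Fraction]:
    """Equal share of all requests for every load balancer still up."""
    live = list(live_nodes)
    if not live:
        raise ValueError("no load balancers left")
    return {name: Fraction(1, len(live)) for name in live}


def survives_one_failure(total_req_per_sec: float,
                         node_capacity_req_per_sec: float,
                         node_count: int) -> bool:
    """True if the remaining nodes can carry the full load after one fails."""
    if node_count < 2:
        return False  # a single node has nothing to fail over to
    per_node_after_failure = total_req_per_sec / (node_count - 1)
    return per_node_after_failure <= node_capacity_req_per_sec
```

With three nodes each one normally carries 1/3 of the traffic, but each must be sized for 1/2 of it, so in steady state no node should run above roughly two thirds of its capacity.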
on 2009-09-17 16:17
Well, if you are planning on not hosting it yourself: some hosting providers will set you up with the networking to do HA, some will not. You should get a list of candidates and ask them directly. For example, for a fee The Planet will install cross connects to the LBs and web nodes and, if you're in a cabinet, let you have your own switch to set up port failover, also known as floating IPs (bonding).
on 2009-09-21 13:03
On Thursday, September 17, 2009 at 17:05:28, David Murphy wrote:

DM> If your load balancer is not doing anything but being a load balancer
DM> really you only needs high quality network devices, a minimalistic kernel
DM> (to prevent security holes) and ram based os or fast HD.

WRT-54G has a very limited amount of memory: 8 MB, 16 MB or 32 MB. this is not proper hardware for a high traffic http load balancer. you can test it independently if you don't believe my humble opinion.

DM> I would agree you need better hardware if you are doing more
DM> but as a pure LB, hardware requirements and not a strict as you are implying.

even a pure http LB needs quite a lot of RAM for tcp buffers and connection states.

DM> Furthermore you need to get past this concept of "failures" because if you
DM> configured things properly you would have a hot spare device to prevent
DM> any such lag. I find that buying a single piece of hardware vs building
DM> out a redundant infrastructure a) costs more money and b) actually have a
DM> higher chance of failure due to a Single Point of Failure

lags/overloads: because of the low performance of very cheap hardware (slow CPU/not enough RAM).

failures: because of the low reliability of very cheap hardware (obsolete equipment).

DM> Also you have the ability via networking to run a true balanced share out
DM> of the load so you could have 3 LB's all getting one 1/3 of all requests
DM> and hitting the same backends. Then if one drops off the switch just downs
DM> the port and 1/2 goes to each of the remaining LB nodes.

in case of a persistent failure, yes: the node drops all tcp connections active at that moment and goes down. even a persistent failure does not have zero cost: it generates a temporary denial of service for all clients of those connections.

but, for example, in case of broken memory chips, the failed LB does not go down and continues to "work", generating broken ip packets or reboots/kernel panics.

DM> Your belief you must buy hardware is just a waste of capital investment,
DM> when you can build it yourself for much cheaper with the same of better
DM> hardware than buying something from a vendor.

I believe that WRT-54G is not appropriate hardware for a load balancer, and that DSL (2.4 kernel) or DSL-N (development version) is not an appropriate base OS for a load balancer, even when using the high quality nginx server.

I believe that what needs to be minimized is not the cost of some piece of hardware but the TCO of the solution, provided all QoS requirements are satisfied and scaling of any part of the system is possible.
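Gena's point about RAM for TCP buffers can be put into rough numbers. The sketch below is a back-of-the-envelope upper bound only: the 16 KiB send and receive buffer sizes are assumptions chosen for illustration (real kernels size buffers dynamically), and it counts two sockets per proxied connection, one to the client and one to the backend. The 3000 connections figure is the one reported earlier in the thread.

```python
def buffer_memory_mib(connections: int,
                      rcv_buf_kib: int = 16,   # assumed per-socket receive buffer
                      snd_buf_kib: int = 16,   # assumed per-socket send buffer
                      sockets_per_connection: int = 2) -> float:
    """Rough upper bound on socket buffer memory, in MiB."""
    total_kib = connections * sockets_per_connection * (rcv_buf_kib + snd_buf_kib)
    return total_kib / 1024


if __name__ == "__main__":
    needed = buffer_memory_mib(3000)
    print(f"3000 proxied connections: up to {needed} MiB of socket buffers")
```

Under these assumptions 3000 concurrent connections could need up to 187.5 MiB for buffers alone, several times the 32 MB of the largest WRT-54G variant, before nginx and the kernel are counted.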
on 2009-09-22 17:49
Once again, Gena, you missed my point by a landslide. WRT/DSL were to show proof that it could be done on very low-end hardware; you could always get better hardware. Furthermore, your failover logic is flawed: if done properly, you can prevent failure pages from reaching the end user.

However, the point was not what the best solution in a given case would be, but the overall viability. I was simply showing a very basic approach; improvements could be made. Your thought process of expecting a fully planned-out system is like asking Alexander Graham Bell how a cell phone tower would work, rather than whether it could be possible in the future to have phones without wires, if such a technology could be made.

However, I digress. This topic is dead at this point: the user has been shown several different ways in which nginx could be used instead of a hardware platform. It's not his job to plan and test it for his needs. Theory over practice and all that.
on 2009-09-22 19:55
David Murphy wrote:
> However I digress this topic is dead at this point the user has been shown
> several different ways where using nginx vs a hardware platform could be
> done. Its not his job to plan and test it for his needs. Theory over
> practice and all that.

I'm not quite sure what you mean by this (unless you meant to type 'now' instead of 'not'). I'm very grateful for the help given here, which has made me more confident that we can use nginx for the task. It's now time to do some testing.

JM
on 2009-09-23 19:32
On Monday, September 21, 2009 at 23:00:33, David Murphy wrote:

DM> Once again Gena you missed my point by a landslide. WRT/DSL were to show
DM> proof it could be done and very low end hardware, you could always get
DM> better hardware.

a high traffic load balancer can't be built on WRT/DSL. the original question was about a high traffic load balancer. this is the reason why games with WRT/DSL are just a waste of his time and money. experience with WRT/DSL is not needed in order to use nginx on better hardware and a better OS.

WRT and DSL have limited capabilities. just a warning from me, nothing personal.

DM> Furthermore you fail-over logic is flawed, if done proper
DM> you can prevent the failure pages to the end user.

for future new connections, yes, you can prevent them; for active tcp sessions/transmissions, you can't, because tcp connection states and http states are not shared between the independent nodes of an nginx-based load balancer cluster.

each hardware node of an nginx-based load balancer cluster may have several hundred or thousand active tcp connections. this is the reason why a persistent failure of one node does not have zero cost, even if the other nodes are alive.
on 2009-09-24 17:11
On Sep 17, 2009, at 5:49 AM, John Moore wrote:
> It certainly does, thanks! Could I trouble you to explain a little
> more about your use of Wackamole and Spread? I've not used either of
> them before.

There is a How-to here:

http://www.howtoforge.com/setting-up-a-high-availability-load-balancer-with-haproxy-wackamole-spread-on-debian-etch-p2

You are just using nginx instead of HAProxy, but the Wackamole and Spread portion still applies. Scalable Internet Architectures ( http://www.amazon.com/Scalable-Internet-Architectures-Theo-Schlossnagle/dp/067232699X ) also has a section on how this works.

> Also, is there any reason why a hosting company would have problems
> with such a setup (i.e., this won't be running in our hardware on
> our premises, but we have full control of Linux servers).

Yes, you have to be a little careful here and ask questions up front. A lot of hosting companies segment their switches such that each port is its own VLAN, which means you can't "float" IPs between ports, which is what you need for this to work. If you tell your hosting company what you are trying to do, and tell them that you need to be able to have IPs which are programmatically moved between switch ports, they should be able to tell you if this is possible or not. Some hosts may require you to have some sort of "private rack" or other upgrade to make this possible.

Barry
on 2009-09-24 18:00
My experiences with spread were less than stellar, but instead of going into that, I'll just give a piece of advice. Spread first tries to communicate using multicast, and then falls back to broadcasting. At my hosting provider, since their equipment didn't support multicast, this meant that, even though communications were only going between two computers and did not need to be broadcast to everyone, all communications were being broadcast to everyone on the subnet. It didn't take long before my hosting provider null routed my server. You can override this behaviour by telling spread to communicate using unicast, but this only works if there is only one destination for each source piece of information. Just something to keep in mind -Gabe
on 2009-09-24 19:14
On Thu, Sep 24, 2009 at 8:46 AM, Gabriel Ramuglia <gabe@vtunnel.com> wrote:
> source piece of information.
>> if this is possible or not. Some hosts may require you have some sort of

Why not just ask for your own private VLAN? A private VLAN will not only create a boundary around your unicast/broadcast traffic, it will also allow you to have your own unshared IP space (as opposed to a shared VLAN/shared IP space). Also, a private VLAN will give you the framework for moving your IP space anywhere you want inside the network.

In regards to floating IPs, just have them provision you on a layer 2 segment. That will allow you to have multiple ports on their network, in the same private VLAN, in different locations.
on 2009-09-24 19:39
Another problem with the floating ip is locking arp. The routers on my host lock the arp for a given ip to whichever mac address it first hears claiming to have that ip, so I can't switch ips on the same segment between machines without talking to them first (or presumably letting the arp entry expire)
on 2009-09-24 22:03
For that you would likely want the DC to set up HSRP so you would have port failover, which would allow for a re-ARP while preventing an "ARP storm".

David
on 2009-10-03 10:47
On Sep 24, Barry Abrahamson wrote:
> You are just using nginx instead of HAProxy, but the Wackamole and Spread
> portion still applies.

How about using one of the LVS solutions? The problem I find with wackamole is that it assumes the host is "ok" if the network is reachable. I'd rather have the heartbeat check work off something more concrete, like nginx being up and being able to serve a predefined static page.
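A check of the kind suggested here (nginx is up and actually serving a known static page, not merely reachable on the network) could look like the sketch below. The URL, the expected body and the function name are hypothetical, and wiring the result into whatever moves the floating IP (LVS, a heartbeat daemon, or a script that withdraws the address) is left out.

```python
import http.client
import urllib.error
import urllib.request

# Talk to nginx directly, ignoring any proxy settings in the environment.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def nginx_is_healthy(url: str, expected_body: bytes, timeout: float = 2.0) -> bool:
    """True only if the page comes back with status 200 and the expected content."""
    try:
        with _OPENER.open(url, timeout=timeout) as resp:
            return resp.status == 200 and resp.read() == expected_body
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Refused, timed out, reset, or an HTTP error status: treat as down.
        return False
```

A monitor would typically call something like `nginx_is_healthy("http://127.0.0.1/health.html", b"OK\n")` every second or two and only fail over after a few consecutive failures, so that a single slow response does not bounce the IP.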