Hi,

Following Ezra's advice in his book "Rails Deployment", I decided to go through all the pain of setting up a server virtualized with Xen, separating each layer of the application into its own VM. So I have 4 VMs: Nginx, Thin, PostgreSQL and one for email.

The advantages are:
- if a VM crashes, the others are still alive
- I can set up a testing VM, mess things up, compile, break stuff without any fear
- I can upgrade the server quickly by moving the VMs onto another box

The drawbacks:
- Xen is an absolute pain to set up
- setting up each VM is tedious
- Nginx can't talk to Thin over Unix sockets, and moreover you need to set up NFS to share static files between them
- you must set up the DB to be remotely accessible, and cannot use a Unix socket for communication
- you have to keep each VM updated
- actually, upgrading the server is not as easy as moving the VMs onto another box; there will still be considerable downtime

Every task takes more steps than on a single server with all the software running directly on one OS. All in all, I find that separating each service into its own VM the way I did is a bad idea: everything is just more painful.

How do you set up your app? Single server vs. multiple?
on 2008-10-22 02:44
on 2008-10-22 09:28
Hi Fernando,

I've been using Xen for almost a year now and we've been really happy with it. We have >50 virtual servers running on three physical servers. Isolating the servers from one another provides numerous benefits:

# One server consuming too much CPU or RAM will not adversely affect the others

We don't want an app that chews up RAM to affect other services (like the db). We allocate a certain amount of RAM to each slice and can change the allocation when required. Nightly backups chew up CPU, but our backup slice doesn't affect the other slices running on the same server as CPU is shared fairly*.

# We can manage disk more easily

Isolating disk between virtual servers is a benefit as it reduces the damage a full disk would cause (our monitoring alerts us well before that happens, though!).

# Security

We expose nginx to the world, but our app servers, database, etc. are on private IPs. This reduces our exposure to attack. Also, putting different services on different (virtual) servers means that if one service has a vulnerability, it doesn't necessarily mean that the others will be vulnerable. If an exploit is found for your mailserver, you don't want that to mean your database gets owned.

There are a few things that I've still not achieved:

# Migrating a live server from one physical server to another

This requires a shared filesystem. I couldn't get GFS working on Ubuntu. I'm wary of NFS as I hear it can be unstable under high load. I'd love to have it, but I'm still waiting for the HOWTO to be written.

# Sharing a filesystem between slices

Same as above. This would be needed to allow separate app servers to handle file uploads.

In response to your drawbacks:

- Xen is an absolute pain to setup

Once you know what you're doing it's pretty easy, but it takes some learning. Possibly not worth the time if you only have one app and don't need it to scale.

- Setting up each VM is tedious

Check out xen-tools for creating the VMs. Check out deprec for automating your setup.
- Nginx can't talk to Thin over Unix sockets, and moreover you need to setup NFS to share static files between both

nginx talks to mongrel over http. I don't know anything about thin.

- you must setup the DB to be remotely accessible, and cannot use Unix socket for communication

You can make it remotely accessible on a private subnet that nginx can access. I can't see any problem with this.

- you have to manage keeping each VM updated

When you automate your processes, updating 10 servers is no more work than updating 1. Plus, you have a record of what you've done and it's repeatable. Putting the processes in place requires time and thought, though.

- actually upgrading the server is not easy as moving the VMs on another box, there will still be a good downtime.

Yep, putting more RAM into the physical server requires me to schedule downtime for all the slices. I was sold on "live migration" but it's never become a reality for me (yet). Still, it's no worse than without virtualization. I've learnt to fill the server with RAM when we buy it. :-)

Xen is cool. It can be useful. It may be that your needs are not currently great enough to justify the investment of time getting to know it, though.

I'd love to hear from anyone who's got live migration working with Xen on Ubuntu or Debian.

- Mike

* Yes, you can use 'nice' on a single server to manage CPU usage.
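For readers who haven't used xen-tools: creating a slice with it looks roughly like the commands below. The hostname, addresses and sizes are invented for illustration; check xen-create-image(8) for the options your version actually supports.

```shell
# Build a Debian domU image (run as root on the dom0); all values illustrative
xen-create-image --hostname=app1.example.com \
  --ip=10.0.0.11 --netmask=255.255.255.0 --gateway=10.0.0.1 \
  --dist=etch --memory=256mb --size=4gb --swap=512mb

# Boot the new domU and attach to its console
xm create /etc/xen/app1.example.com.cfg -c
```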
on 2008-10-23 02:18
Mike raises some very good points (and a few that I don't agree with as well :) ) about using the virtualized setup. One important thing to keep in mind is that being able to scale shouldn't cause you to "over-scale" at the beginning. Start small and work your way up.

Since both messages are included here, I've got replies going to two different people in some cases. Sorry for any confusion.

> I've been using Xen for almost a year now and we've been really happy with
> it. We have >50 virtual servers running on three physical servers.

Unless you have a lot of CPU cores and/or a REALLY fast SAN, you're overloading your hardware. While you *can* put a virtually unlimited number of domUs on a dom0, it doesn't necessarily mean that you *should*. :) I suggest keeping it to no more than one domU per CPU core unless you have a good reason to do otherwise.

> # One server consuming too much CPU or RAM will not adversely affect the
> others
>
> We don't want an app that chews up RAM to affect other services (like db).
> We allocate a certain amount of RAM to each slice and can change the
> allocation when required.

See the statement above. Depending on how many CPU cores you have, one server consuming too much CPU time has the potential to adversely affect the others. If it's financially feasible, it is much preferable to dedicate *at least* one CPU core to each domU.

> Nightly backups chew up CPU but our backup slice doesn't affect the other
> slices running on the same server as CPU is shared fairly*.

This is definitely an example of a *good* setup. We have a few applications that run some extremely CPU-intensive rake tasks, and those are run on separate domUs so that the primary application won't get bogged down. Note that you could accomplish the same thing by just tying processes on one domU to a particular processor, but that takes additional work.

As a side note, though, if your dom0 has only one or two cores, the above wouldn't help you.
The dom0 would still need to divide the processing time between machines, which would end up slowing things down.

> # We can manage disk more easily
>
> Isolation of disk between virtual servers is a benefit as it reduces the
> damage a full disk would cause (our monitoring alerts us well before that
> happens though!).

This is largely a moot point. On a standalone server, you would have at *least* a RAID 1 setup to handle disk failures. If a disk did fail, you'd get a new one in place and off you go. The "disk" on a domU *will* *not* *fail*. Period. You might get a failed disk on your dom0, but the dom0 should either be talking to a SAN or be running a RAID 10 array, either of which would give you very good protection.

Keep in mind that your filesystem and your disks are two distinct entities. While the disk on a domU won't fail, you could get a corrupt filesystem. But FS corruption is usually caused by a combination of physical problems with the disk and uncontrolled system shutdowns. The former is completely eliminated in a domU, and the latter is made extremely unlikely since your domU won't just "lose power".

> # Security
>
> We expose nginx to the world, but our app servers, database, etc are on
> private IPs. This reduces our exposure to attack. Also, putting different
> services on different (virtual) servers means that if one service has a
> vulnerability it doesn't necessarily mean that the others will be
> vulnerable. If an exploit is found for your mailserver you don't want that
> to mean your database gets owned.

Specifically in regards to a mail server: unless your application needs to *receive* email directly, you have no business exposing it to the outside world in the first place. Every domU you set up should have an MTA installed (to allow things like ActionMailer to work seamlessly using :sendmail, for example), *but* that MTA should *never* accept connections from the outside world. It should only send email *out*.
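As a sketch of that send-only MTA idea, assuming Postfix (the exact parameters depend on which MTA and version you run):

```
# /etc/postfix/main.cf -- outbound-only MTA for a domU (illustrative)
inet_interfaces = loopback-only   # never accept connections from outside
mynetworks = 127.0.0.0/8          # relay mail for local processes only
myorigin = $mydomain
```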
If you want to receive email from customers, go with a dedicated email setup where things like security are handled for you. Google Apps for domains is an excellent solution for this. If your application actually needs to receive emails, then roll up your sleeves and be ready to spend a lot of time. Configuring a secure mailserver with a proper filtering setup takes a *lot* of time. That's why it should be avoided whenever possible.

But to address the overall statement: once it is time to scale out to multiple domUs, you should divide them based on their role, not necessarily the software that they're running. For example, it's perfectly valid to have nginx *and* a mongrel cluster running on one machine, since working together they provide a full site (nginx for static content, and the mongrels for dynamic content). You could, optionally, have another standalone machine running just nginx as a load balancer between the webservers.

> There are a few things that I've still not achieved:
>
> # Migrating a live server from one physical server to another
>
> This requires a shared filesystem. I couldn't get GFS working on Ubuntu. I'm
> wary of NFS as I hear it can be unstable under high load. I'd love to have
> it but I'm still waiting for the HOWTO to be written.

I would never run the filesystem for a domU off of NFS or any other network-based filesystem. What we do is share the filesystem by providing shared block devices over iSCSI. You can export individual LVs as iSCSI targets and connect to them via an initiator on your remote machine. Using a dedicated SAN is obviously a preferable solution, but if that's not an option this gives you a viable workaround.

> # Sharing a filesystem between slices
>
> Same as above. This would be needed to allow separate app servers to handle
> file uploads.

Considering what Fernando was actually trying to do here, my suggestion would be to not separate things out too much.
In the case described, if you really *need* to have everything on a separate domU, I'd do a setup similar to what I outlined above: set up one domU as your load balancer running nginx, and then set up each of your application domUs with nginx so they can serve the application's static content while still passing requests through to Thin. (Or, optionally, design your application so that the static content isn't even part of your application, but a separate component that just gets deployed to your web server.)

> In response to your drawbacks:
>
> - Xen is an absolute pain to setup
>
> Once you know what you're doing it's pretty easy but it takes some learning.
> Possibly not worth the time if you only have one app and don't need it to
> scale.

I agree with Mike here. I do this stuff for a living, so setting up a new dom0 takes me about an hour including the full OS install. But if it's your first time doing this, and if it's likely to be your only time, it would be a good idea to look at a Rails-specific managed hosting setup. We have a number of customers that need more in the way of resources than our standard hosting products provide, so we do managed hosting for them. They buy the hardware and ship it to us, and we handle the rest.

The Rails community in general has been in a position where a developer needs to be a sysadmin as well, and this just isn't right. We have enough smart people out there that developers should be able to focus on developing applications, and admins should keep those applications running. With that said, if you're going to be the admin supporting a team of developers, every minute spent getting familiar with Xen is a minute well spent.

> - Setting up each VM is tedious
>
> Check out xen-tools for creating the VMs. Check out deprec for automating your
> setup.

I couldn't agree more! :) xen-tools is definitely the way to go.
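The load-balancer-plus-app-domU layout described above might look roughly like this in nginx terms. The private IPs, port and paths are invented; the `if (-f $request_filename)` check was the common idiom in nginx configs of this era for serving static files before proxying.

```nginx
# On the load-balancer domU
upstream app_servers {
    server 10.0.0.11;
    server 10.0.0.12;
}
server {
    listen 80;
    location / {
        proxy_set_header Host $host;
        proxy_pass http://app_servers;
    }
}

# On each application domU: serve static files locally, proxy the rest to Thin
server {
    listen 10.0.0.11:80;
    root /var/www/app/current/public;
    location / {
        if (-f $request_filename) { break; }   # static file exists: serve it
        proxy_pass http://127.0.0.1:3000;      # otherwise hand off to Thin
    }
}
```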
I've spent quite a bit of time tweaking our setup to get to this point, but it currently takes me about 2 minutes to get a new domU fully set up and running. About 30 seconds of that is me typing in a few parameters, and about 1:30 of that is the dom0 actually building the image. If you're going to be setting up a lot of domUs, this is definitely the way to go.

> - Nginx can't talk to Thin over Unix sockets, and moreover you need to
> setup NFS to share static files between both

See the above recommendation on how to handle this by using multiple nginx instances.

> - you must setup the DB to be remotely accessible, and cannot use Unix
> socket for communication
>
> You can make it remotely accessible on a private subnet that nginx can
> access. I can't see any problem with this.

Again, if you don't yet *need* to have your application split out to multiple domUs, don't. Just let it run on the same machine and let it use sockets. However, once it comes time to move these responsibilities out to individual domUs, Mike's suggestion is perfectly valid. You should always have a private subnet for your VMs to communicate over.

> - actually upgrading the server is not easy as moving the VMs on another
> box, there will still be a good downtime.
>
> Yep, putting more RAM into the physical server requires me to schedule
> downtime for all the slices. I was sold on the "live migration" but it's
> never become a reality for me (yet). Still, it's no worse than without
> virtualization. I've learnt to fill it with RAM when we buy it. :-)

Without a SAN, you're going to have downtime. One solution for minimizing downtime is using an rsync program that can handle block devices (the stock rsync can't do this). What you do is copy the disk while the machine is running (this copy will obviously be inconsistent). Then you take the machine offline, do a quick final sync (usually < 5 minutes) and bring the domU back up on the new machine.
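Sketched out, that procedure looks something like the following. The `--copy-devices` option comes from the rsync patches tree rather than stock rsync, and the host and LV names here are made up:

```shell
# Run on the old dom0. Stock rsync skips block devices; this assumes a
# patched rsync providing --copy-devices. Names are illustrative.

# 1. First pass while the domU is still running (the copy is inconsistent)
rsync --copy-devices /dev/vg0/app1-disk root@new-dom0:/dev/vg0/app1-disk

# 2. Take the domU down cleanly
xm shutdown -w app1

# 3. Quick final pass over the now-quiescent device
rsync --copy-devices /dev/vg0/app1-disk root@new-dom0:/dev/vg0/app1-disk

# 4. Start the domU on the new dom0
ssh root@new-dom0 xm create /etc/xen/app1.cfg
```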
If the machines you're moving between have an identical hardware/software setup (CPU and Xen version must be absolutely identical), you can shave off a few seconds by pausing the domU and copying it over to the new server during the sync (the image will only be as big as the amount of RAM on the domU). This way you avoid the shutdown/startup delay. But unless you have processes that absolutely *must not* be stopped, you're much better off doing the regular shutdown/startup sequence.

> Xen is cool. It can be useful. It may be that your needs are not currently
> great enough to justify the investment of time getting to know it though.
>
> I'd love to hear from anyone who's got live migration working with Xen on
> Ubuntu or Debian.

We do live migrations on a regular basis to balance out the load on our dom0s. All of our dom0s and domUs are running a (mostly) standard Debian install. The only real trick is setting up shared storage. Ideally, you'll have a real SAN to work with, but failing that, setting up your dom0s as iSCSI initiators and targets works quite well. You can offload the processor and RAM usage to another machine with no downtime.

--
Alex Malinovich
Director of Deployment Services
PLANET ARGON, LLC
design // development // hosting

http://www.planetargon.com
http://www.the-love-shack.net [blog]

+1 503 445 2457
+1 877 55 ARGON [toll free]
+1 815 642 4068 [fax]
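For reference, exporting an LV as an iSCSI target can be sketched like this, assuming the Linux SCSI target framework (tgtd) on the storage side and open-iscsi on the dom0s. The IQNs, device paths and addresses are invented:

```shell
# On the storage host: export one LV per domU via tgtd
tgtadm --lld iscsi --op new --mode target --tid 1 \
       -T iqn.2008-10.com.example:storage.app1-disk
tgtadm --lld iscsi --op new --mode logicalunit --tid 1 --lun 1 \
       -b /dev/vg0/app1-disk
tgtadm --lld iscsi --op bind --mode target --tid 1 -I 10.0.0.0/24

# On a dom0: discover the target and log in; the LUN then appears as a
# normal /dev/sdX that the domU config can point at
iscsiadm -m discovery -t sendtargets -p 10.0.0.5
iscsiadm -m node -T iqn.2008-10.com.example:storage.app1-disk \
         -p 10.0.0.5 --login
```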
on 2008-10-24 13:30
Hi Alex and Mike,

The day I posted this message I was pretty pissed off at Xen constantly putting sticks in my spokes. You are right: as with everything, the first time you encounter a problem with Xen it takes ages to solve; afterward, solving it again is a matter of minutes.

I'd like to hear success/horror stories about live migrating a database.
on 2008-10-24 20:49
On Oct 24, 2:30 am, Fernando P. <firstname.lastname@example.org> wrote:
--snip--
> I'd like to hear success/horror stories about live migrating a database.

Live migrating a domU running as a database server, or doing a live cutover of a replicated database?
on 2008-10-28 20:46
Yes, for instance. I would be very interested in your experience.
on 2008-10-29 15:57
Thanks Alex,

Your post got me thinking. I found some time (and a couple of dev servers!) to have another attempt at live migration of slices between hosts. Success!

<For those unfamiliar with Xen: 'domU' means the same as 'slice'>

I did a fresh install of Debian Etch (stable) on two servers and installed the Xen version that came with it (xen-3.0). I made the disk and swap images for the domU available via AoE on a third server. The live migration went smoothly. I was able to move the running domU from one host to the other without a noticeable delay in an active file download or ssh session.

One slight problem I noticed was when I migrated a domU without any sessions (download or otherwise) open to it. The Xen version that ships with Etch doesn't send a gratuitous ARP, which is required to notify other servers on the network that the domU has moved. When I migrated without any active sessions open to the host and pinged it from another host, the pings stopped during the migration. They started again when I pinged out from the migrated domU (via console). This is a known problem with the Xen version shipped with Etch (stable). Amazon EC2 seems to be using xen-3.1.0. What version are you using?

The biggest question in my mind is how to protect against the same domU being manually started on two hosts. Mounting the same network block device (AoE) from two hosts will most likely corrupt the filesystem. Are you using some form of fencing?

I also wonder whether it's safe to use lvm2 on a shared block device. I could create a volume group for domUs on a block device and have logical volumes for each domU. I imagine there would be a risk of corruption in the volume management if more than one host were to create logical volumes. Perhaps it's possible to restrict volume creation to a single host and ensure that vgscan is run on all other hosts after each new volume is created. I've tried in the past to get clvm working on Ubuntu and had no success.
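To flesh out the AoE setup (and a possible workaround for the missing gratuitous ARP), assuming vblade on the storage server and aoetools plus iputils arping on the Xen hosts; the shelf/slot numbers, interface names and address are invented:

```shell
# On the storage server: export an LV as AoE shelf 0, slot 1 on eth0
vblade 0 1 eth0 /dev/vg0/app1-disk &

# On each dom0: load the AoE driver and find the export
modprobe aoe
aoe-discover
ls /dev/etherd/          # the export shows up as e0.1

# After migrating, send unsolicited ARP replies from inside the domU so
# the rest of the network learns its new location
arping -U -I eth0 -c 3 10.0.0.11
```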
> I suggest keeping it to no more than one domU per CPU
> core unless you have a good reason to do otherwise.

Interesting. I've not heard that suggestion before but I'm sure it comes from hard-earned experience! :-) A number of our slices use very little CPU (mail, web redirectors) and we have some high-availability hot standbys, so we've found 12-16 slices on an 8-core server have been running OK. I'd like to investigate pinning slices to VCPUs to provide some protection for services that need it, though.

> Depending on how many CPU cores you have, one
> server consuming too much CPU time has the potential to adversely
> affect other servers.

I found this to be true when each slice ran Ubuntu's cron.daily tasks. Urgh! The 'nice' value means nothing to a server running within Xen. I dithered the times that cron.daily runs on the slices, but would love to know a better solution. (One slice per core would be one answer.)

> > # We can manage disk more easily
> >
> > Isolation of disk between virtual servers is a benefit as it reduces the
> > damage a full disk would cause (our monitoring alerts us well before that
> > happens though!).
>
> This is largely a moot point. On a standalone server, you would have
> at *least* a RAID 1 setup to handle disk failures.

I was referring to "no space left on device" errors. We got one the other day when Starling filled a disk. Only that slice felt it. I agree with you about the RAID, though. I've never had a disk failure, but I've never regretted the money my clients invest in RAID cards.

Thanks again for sharing your experiences, Alex. Xen experts seem to be a bit scarce. I've been hearing that KVM is 'the next big thing', but discovering that Amazon EC2 runs on Xen has strengthened my view that Xen is still a great choice for virtualization.

- Mike
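One way to dither cron.daily without hand-editing every slice is to derive a stable offset from each slice's hostname. This is just a sketch, not something from the thread; the crontab line it emits is illustrative (modeled on Debian's stock /etc/crontab entry):

```shell
# Compute a per-host start time from the hostname's checksum so each
# slice runs cron.daily at a different, but stable, minute and hour.
sum=$(hostname | cksum | cut -d' ' -f1)
minute=$(( sum % 60 ))          # 0-59
hour=$(( sum / 60 % 5 + 1 ))    # 1-5, spreading runs across 01:00-05:59

# Emit the /etc/crontab line that would replace the stock cron.daily entry
echo "$minute $hour * * * root test -x /usr/sbin/anacron || run-parts --report /etc/cron.daily"
```

The same snippet could run from a deprec/Capistrano task at provisioning time, so every new slice picks a distinct slot automatically.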