Who uses a virtualized server with separated VMs?

fipa · October 22, 2008, 12:44am

Hi,

Following Ezra’s advice in his book “rails deployment”, I have decided
to go through all the pain of setting up a server virtualized with Xen,
and separate each layer of the application in its own VM.

So I have 4 VMs: Nginx, Thin, PostgreSQL and one for email

The advantages are:

if a VM crashes, the others are still alive
I can setup a testing VM, mess things up, compile, break stuff without
any fear
I can upgrade the server quickly by moving the VMs on another box

The drawbacks:

Xen is an absolute pain to setup
Setting up each VM is tedious
Nginx can’t talk to Thin over Unix sockets, and moreover you need to
setup NFS to share static files between both
you must setup the DB to be remotely accessible, and cannot use Unix
socket for communication
you have to manage keeping each VM updated
actually upgrading the server is not as easy as moving the VMs to another
box; there will still be a good amount of downtime.

Each time you have more steps to do compared to a single server with all
the software running directly on a single OS.

Well, separating each layer into its own VM like I did is, I find, a bad
idea: everything is just more painful.

How do you set up your app? Single server vs. multiple?

fipa · October 22, 2008, 7:28am

Hi Fernando,

I’ve been using Xen for almost a year now and we’ve been really happy
with it. We have >50 virtual servers running on three physical servers.

Isolating the servers from one another provides numerous benefits:

One server consuming too much CPU or RAM will not adversely affect the
others

We don’t want an app that chews up RAM to affect other services (like
db). We allocate a certain amount of RAM to each slice and can change
the allocation when required.

Nightly backups chew up CPU but our backup slice doesn’t affect the
other slices running on the same server as CPU is shared fairly*.

We can manage disk more easily

Isolation of disk between virtual servers is a benefit as it reduces the
damage a full disk would cause (our monitoring alerts us well before
that happens though!).

Security

We expose nginx to the world, but our app servers, database, etc. are on
private IPs. This reduces our exposure to attack. Also, putting
different services on different (virtual) servers means that if one
service has a vulnerability it doesn’t necessarily mean that the others
will be vulnerable. If an exploit is found for your mailserver you don’t
want that to mean your database gets owned.

There are a few things that I’ve still not achieved:

Migrating a live server from one physical server to another

This requires a shared filesystem. I couldn’t get GFS working on Ubuntu.
I’m wary of NFS as I hear it can be unstable under high load. I’d love
to have it but I’m still waiting for the HOWTO to be written.

Sharing a filesystem between slices

Same as above. This would be needed to allow separate app servers to
handle file uploads.

In response to your drawbacks:

Xen is an absolute pain to setup

Once you know what you’re doing it’s pretty easy but it takes some
learning. Possibly not worth the time if you only have one app and don’t
need it to scale.

Setting up each VM is tedious

Check out xen-tools for creating the VMs. Check out deprec for
automating your setup.

Nginx can’t talk to Thin over Unix sockets, and moreover you need to
setup NFS to share static files between both

nginx talks to mongrel over http. I don’t know anything about thin.

you must setup the DB to be remotely accessible, and cannot use Unix
socket for communication

You can make it remotely accessible on a private subnet that nginx can
access. I can’t see any problem with this.

you have to manage keeping each VM updated

When you automate your processes, updating 10 servers is no more work
than updating 1. Plus you have a record of what you’ve done and it’s
repeatable. Putting the processes in place requires time and thought
though.

actually upgrading the server is not easy as moving the VMs on another
box, there will still be a good downtime.

Yep, putting more RAM into the physical server requires me to schedule
downtime for all the slices. I was sold on the “live migration” but it’s
never become a reality for me (yet). Still, it’s no worse than without
virtualization. I’ve learnt to fill it with RAM when we buy it.

Xen is cool. It can be useful. It may be that your needs are not
currently great enough to justify the investment of time getting to know
it though.

I’d love to hear from anyone who’s got live migration working with Xen
on Ubuntu or Debian.

Mike

* Yes, you can use ‘nice’ on a single server to manage CPU usage.

fipa · October 23, 2008, 12:18am

Mike raises some very good points (and a few that I don’t agree with
as well) about using the virtualized setup. One important thing
to keep in mind is that being able to scale shouldn’t cause you to
“over-scale” at the beginning. Start small and work your way up. Since
both messages are included here, I’ve got replies going to two
different people in some cases. Sorry for any confusion.

I’ve been using Xen for almost a year now and we’ve been really happy with
it. We have >50 virtual servers running on three physical servers.

Unless you have a lot of CPU cores and/or a REALLY fast SAN, you’re
overloading your hardware. While you can put a virtually unlimited
number of domU’s on a dom0, it doesn’t necessarily mean that you
should. I suggest keeping it to no more than one domU per CPU
core unless you have a good reason to do otherwise.

One server consuming too much CPU or RAM will not adversely affect the
others

We don’t want an app that chews up RAM to affect other services (like db).
We allocate a certain amount of RAM to each slice and can change the
allocation when required.

See the statement above. Depending on how many CPU cores you have, one
server consuming too much CPU time has the potential to adversely
affect other servers. If it’s financially feasible, it is much
preferable to have at least one dedicated CPU core to each domU.

Nightly backups chew up CPU but our backup slice doesn’t affect the other
slices running on the same server as CPU is shared fairly*.

This is definitely an example of a good setup. We have a few
applications that run some extremely CPU-intensive rake tasks, and
those are run on separate domU’s so that the primary application won’t
get bogged down. Note that you could accomplish the same thing by just
tying processes on one domU to a particular processor, but that takes
additional work. As a side note, though, if your dom0 has only one or
two cores, the above wouldn’t help you. The dom0 would still need to
divide the processing time between machines which would end up slowing
things down.
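
For reference, tying a process to a particular processor can be sketched in a few lines. This is only a minimal Python illustration, not the setup described above: it assumes Linux (`os.sched_setaffinity` is not available on every platform), and the helper name `pin_to_one_cpu` is made up for the example.

```python
import os


def pin_to_one_cpu(pid: int = 0) -> int:
    """Restrict `pid` (0 means this process) to one CPU and return its index."""
    # Choose a CPU the process is already allowed to run on, so the call
    # does not fail because of an existing cpuset restriction.
    cpu = min(os.sched_getaffinity(pid))
    os.sched_setaffinity(pid, {cpu})
    return cpu


if __name__ == "__main__":
    chosen = pin_to_one_cpu()
    print(f"restricted to CPU {chosen}: {sorted(os.sched_getaffinity(0))}")
```

Child processes started afterwards inherit the restriction, so a CPU-heavy rake task launched from such a wrapper would stay on that one core.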

We can manage disk more easily

Isolation of disk between virtual servers is a benefit as it reduces the
damage a full disk would cause (our monitoring alerts us well before that
happens though!).

This is largely a moot point. On a standalone server, you would have
at least a RAID 1 setup to handle disk failures. If a disk did fail,
you’d get a new one in place and off you go. The “disk” on a domU
will not fail. Period. You might get a failed disk on your dom0,
but the dom0 should either be talking to a SAN or be running a RAID 10
array, either of which would give you very good protection.

Keep in mind that your filesystem and your disks are two distinct
entities. While the disk on a domU won’t fail, you could get a corrupt
filesystem. But FS corruption is usually caused by a combination of
physical problems with the disk and uncontrolled system shutdowns. The
former is completely eliminated in a domU, and the latter is made
extremely unlikely since your domU won’t just “lose power”.

Security

We expose nginx to the world, but our app servers, database, etc. are on
private IPs. This reduces our exposure to attack. Also, putting different
services on different (virtual) servers means that if one service has a
vulnerability it doesn’t necessarily mean that the others will be
vulnerable. If an exploit is found for your mailserver you don’t want that
to mean your database gets owned.

Specifically in regards to a mail server, unless your application
needs to receive email directly, you have no business exposing it to
the outside world in the first place. Every domU you set up should
have an MTA installed (to allow things like ActionMailer to work
seamlessly using :sendmail, for example), but that MTA should
never accept connections from the outside world. It should only send
email out. If you want to receive email from customers, go with a
dedicated email setup where things like security are handled for you.
Google Apps for domains is an excellent solution for this. If your
application actually needs to receive emails then roll up your sleeves
and be ready to waste a lot of time. Configuring a secure mailserver
with a proper filtering setup takes a lot of time. That’s why it
should be avoided whenever possible.

But to address the overall statement, once it is time to scale out to
multiple domUs, you should divide them based on their role, not
necessarily the software that they’re running. For example, it’s
perfectly valid to have nginx and a mongrel cluster running on one
machine, since working together they provide a full site. (nginx for
static content, and the mongrels for dynamic content) You could,
optionally, have another standalone machine running just nginx as a
load balancer between the webservers.

There are a few things that I’ve still not achieved:

Migrating a live server from one physical server to another

This requires a shared filesystem. I couldn’t get GFS working on Ubuntu. I’m
wary of NFS as I hear it can be unstable under high load. I’d love to have
it but I’m still waiting for the HOWTO to be written.

I would never run the filesystem for a domU off of NFS or any other
network-based filesystem. What we do is share the filesystem by
providing shared block devices over iSCSI. You can export individual
LVs as iSCSI targets and connect to them via an initiator on your
remote machine. Using a dedicated SAN is obviously a preferable
solution, but if that’s not an option this gives you a viable
workaround.

Sharing a filesystem between slices

Same as above. This would be needed to allow separate app servers to handle
fileuploads.

Considering what Fernando was actually trying to do here, my
suggestion would be to not separate things out too much. In the case
described, if you really need to have everything on a separate domU,
I’d do a setup similar to what I outlined above, and set up one domU
as your load balancer running nginx, and then set up each of your
application domUs with nginx to allow them to serve static content
from the application while still passing requests through to Thin.
(or, optionally, design your application so that the static content
isn’t even part of your application, but a separate component that
just gets deployed to your web server).

In response to your drawbacks:

Xen is an absolute pain to setup

Once you know what you’re doing it’s pretty easy but it takes some learning.
Possibly not worth the time if you only have one app and don’t need it to
scale.

I agree with Mike here. I do this stuff for a living, so setting up a
new dom0 takes me about an hour including the full OS install. But if
it’s your first time doing this, and if it’s likely to be your only
time, it would be a good idea to look at a Rails-specific managed
hosting setup. We have a number of customers that need more in the way
of resources than our standard hosting products provide, so we do
managed hosting for them. They buy the hardware and ship it to us, and
we handle the rest.

The Rails community in general has been in a position where a
developer needs to be a sysadmin as well, and this just isn’t right.
We have enough smart people out there that developers should be able
to focus on developing applications, and admins should keep those
applications running. With that said, if you’re going to be the admin
supporting a team of developers, every minute spent with getting
familiar with Xen is a minute well spent.

Setting up each VM is tedious

Check out xen-tools for creating the VMs. Check out deprec automating your
setup.

I couldn’t agree more! xen-tools is definitely the way to go. I’ve
spent quite a bit of time tweaking our setup to get to this point, but
it currently takes me about 2 minutes to get a new domU fully set up
and running. About 30 seconds of that is me typing in a few
parameters, and about 1:30 of that is the dom0 actually building the
image. If you’re going to be setting up a lot of domU’s, this is
definitely the way to go.

Nginx can’t talk to Thin over Unix sockets, and moreover you need to
setup NFS to share static files between both

See the above recommendation on how to handle this by using multiple
nginx instances.

you must setup the DB to be remotely accessible, and cannot use Unix
socket for communication

You can make it remotely accessible on a private subnet that nginx can
access. I can’t see any problem with this.

Again, if you don’t yet need to have your application split out to
multiple domUs, don’t. Just let it run on the same machine and let it
use sockets. However, once it comes time to move these
responsibilities out to individual domUs, Mike’s suggestion is
perfectly valid. You should always have a private subnet for your VM’s
to communicate over.
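
One way to sanity-check such a layout is to confirm that every backend address really sits inside the private subnet. The sketch below is a hedged Python illustration: the subnet `10.0.0.0/24`, the service names and all addresses are hypothetical placeholders, not values from this thread.

```python
import ipaddress

# Hypothetical private subnet for VM-to-VM traffic; substitute your own.
PRIVATE_NET = ipaddress.ip_network("10.0.0.0/24")


def outside_private_net(addresses, net=PRIVATE_NET):
    """Return the addresses that do NOT fall inside `net`."""
    return [a for a in addresses if ipaddress.ip_address(a) not in net]


# Hypothetical listen addresses for the backend domUs.
backends = {
    "postgresql": "10.0.0.20",
    "thin": "10.0.0.11",
    "mail": "203.0.113.7",  # documentation-range address standing in for a public IP
}

if __name__ == "__main__":
    print(outside_private_net(backends.values()))  # prints ['203.0.113.7']
```

Anything the check reports is a service bound to an address outside the private network and is worth a second look.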

actually upgrading the server is not easy as moving the VMs on another
box, there will still be a good downtime.

Yep, putting more RAM into the physical server requires me to schedule
downtime for all the slices. I was sold on the “live migration” but it’s
never become a reality for me (yet). Still, it’s no worse than without
virtualization. I’ve learnt to fill it with RAM when we buy it.

Without a SAN, you’re going to have downtime. One solution for
minimizing downtime is using an rsync program that can handle block
devices (the stock rsync can’t do this). First you copy the disk while
the machine is running (this copy will obviously be inconsistent).
Then you take the machine offline, do a quick sync (usually < 5
minutes) and bring the domU back up on the new machine.
If the machines you’re moving between have an identical hardware/
software setup (CPU and Xen version must be absolutely identical) you
can shave off a few seconds by pausing the domU and copying it over to
the new server during the sync. (the image will be only as big as the
amount of RAM on the domU) This way you avoid the shutdown/startup
delay. But unless you have processes that absolutely must not be
stopped, you’re much better off doing the regular shutdown/startup
sequence.
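
The two-pass idea (bulk copy while running, short catch-up pass after shutdown) can be illustrated with a small Python sketch. It is not rsync's algorithm and not the tool referred to above: it simply compares fixed-size blocks of two locally readable files and rewrites the ones that differ. The block size, file names and the `sync_blocks` helper are all assumptions made for the example, and the demo uses ordinary files rather than real block devices.

```python
import os
import tempfile

BLOCK_SIZE = 4096  # illustrative; a real tool would likely use larger blocks


def sync_blocks(src_path, dst_path, block_size=BLOCK_SIZE):
    """Copy only the blocks of src that differ from dst; return how many were written."""
    if not os.path.exists(dst_path):
        open(dst_path, "wb").close()  # start from an empty destination
    written = 0
    with open(src_path, "rb") as src, open(dst_path, "r+b") as dst:
        offset = 0
        while True:
            block = src.read(block_size)
            if not block:
                break
            dst.seek(offset)
            if dst.read(len(block)) != block:
                dst.seek(offset)
                dst.write(block)
                written += 1
            offset += len(block)
        dst.truncate(offset)  # keep the copy the same size as the source
    return written


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "domu-disk.img")
        dst = os.path.join(tmp, "domu-disk-copy.img")
        with open(src, "wb") as f:
            f.write(b"\0" * (BLOCK_SIZE * 64))
        first = sync_blocks(src, dst)    # pass 1: domU still running
        with open(src, "r+b") as f:      # the domU keeps writing meanwhile
            f.seek(BLOCK_SIZE * 3)
            f.write(b"x" * 10)
            f.seek(BLOCK_SIZE * 40)
            f.write(b"y" * 10)
        second = sync_blocks(src, dst)   # pass 2: after shutdown
        with open(src, "rb") as a, open(dst, "rb") as b:
            assert a.read() == b.read()
        return first, second


if __name__ == "__main__":
    print(demo())  # prints (64, 2)
```

The second pass touches only the blocks that changed in between, which is why the offline window can be much shorter than a full copy.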

Xen is cool. It can be useful. It may be that your needs are not currently
great enough to justify the investment of time getting to know it though.

I’d love to hear from anyone who’s got live migration working with Xen on
Ubuntu or Debian.

We do them on a regular basis to balance out the load on our dom0s.
All of our dom0s and domUs are running a (mostly) standard Debian
install. The only real trick is setting up shared storage. Ideally,
you’ll have a real SAN to work with, but failing that setting up your
dom0s as iSCSI initiators and targets works quite well. You can
offload the processor and RAM usage to another machine with no
downtime.

–
Alex Malinovich
Director of Deployment Services

PLANET ARGON, LLC
design // development // hosting

http://www.the-love-shack.net [blog]

+1 503 445 2457
+1 877 55 ARGON [toll free]
+1 815 642 4068 [fax]

fipa · October 24, 2008, 6:49pm

On Oct 24, 2:30 am, Fernando P. wrote:
–snip–

I’d like to hear success/horror stories about live migrating a database.

Live migrating a domU running as a database server, or doing a live
cutover of a replicated database?

fipa · October 28, 2008, 7:46pm

Yes for instance, I would be very interested in your experience.

fipa · October 24, 2008, 11:30am

Hi Alex and Mike,

The day I posted this message I was pretty pissed off at Xen constantly
putting sticks in my spokes.

You are right, as for everything the first time you encounter a problem
with Xen it takes ages to solve, afterward it’s a matter of minutes to
solve again.

I’d like to hear success/horror stories about live migrating a database.

fipa · October 29, 2008, 2:57pm

Thanks Alex,

Your post got me thinking. I found some time (and a couple of dev
servers!) to have another attempt at live migration of slices between
hosts. Success!

I did a fresh install of Debian Etch (stable) on two servers and
installed the Xen version that came with it (xen-3.0). I made the disk
and swap images for the domU available via AoE on a third server. The
live migration went smoothly. I was able to move the running domU from
one host to the other without a noticeable delay in an active file
download or ssh session.

One slight problem I noticed was when I migrated a domU without any
sessions (download or otherwise) open to it. The Xen version that ships
with Etch doesn’t send a gratuitous ARP, which is required to notify
other servers on the network that the domU has moved. When I migrated
without any active sessions open to the host and pinged it from another
host, the pings stopped during the migration. They started again when I
pinged out from the migrated domU (via console). There’s a known problem
with the Xen version shipped with Etch (stable). Amazon EC2 seems to be
using xen-3.1.0. What version are you using?
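
For anyone unfamiliar with the term, a gratuitous ARP is an ARP packet in which the sender and target IP are the same address, broadcast so that neighbours refresh their caches. The Python sketch below only builds such a frame to show its layout; it is not what Xen does internally, the MAC and IP are hypothetical, and actually sending the frame (which needs a raw socket and root privileges) is left out.

```python
import socket
import struct


def gratuitous_arp_frame(mac: str, ip: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP request for `ip`."""
    hw = bytes.fromhex(mac.replace(":", ""))
    addr = socket.inet_aton(ip)
    # Ethernet header: broadcast destination, our MAC, EtherType 0x0806 (ARP).
    ethernet = b"\xff" * 6 + hw + struct.pack("!H", 0x0806)
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,            # hardware type: Ethernet
        0x0800,       # protocol type: IPv4
        6,            # hardware address length
        4,            # protocol address length
        1,            # opcode: request
        hw,           # sender MAC
        addr,         # sender IP
        b"\x00" * 6,  # target MAC left unset
        addr,         # target IP equals sender IP: this makes it gratuitous
    )
    return ethernet + arp


if __name__ == "__main__":
    # Hypothetical domU MAC and private IP, for illustration only.
    frame = gratuitous_arp_frame("00:16:3e:00:00:01", "10.0.0.20")
    print(len(frame), frame.hex())
```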

The biggest question in my mind is how to protect against the same domU
being manually started on two hosts. Mounting the same network block
device (AoE) from two hosts will most likely corrupt the filesystem. Are
you using some form of fencing?

I also wonder whether it’s safe to use lvm2 on a shared block device. I
could create a volume group for domUs on a block device and have logical
volumes for each domU. I imagine there would be a risk of corruption in
the volume management if more than one host were to create logical
volumes. Perhaps it’s possible to restrict volume creation to a single
host and ensure that vgscan is run on all other hosts after each new
volume is created. I’ve tried in the past to get clvm working on Ubuntu
and had no success.

I suggest keeping it to no more than one domU per CPU
core unless you have a good reason to do otherwise.

Interesting. I’ve not heard that suggestion before but I’m sure it comes
from hard-earned experience!

A number of our slices use very little CPU (mail, web redirectors) and
we have some high availability hot standbys so we’ve found 12-16 slices
on an 8 core server have been running OK. I’d like to investigate
pinning slices to VCPUs to provide some protection for services that
need it though.
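
As a quick check of those numbers against the one-domU-per-core guideline, here is the arithmetic as a small Python sketch (the function names are illustrative only):

```python
def domus_per_core(domus: int, cores: int) -> float:
    """Average number of domUs sharing each physical core."""
    return domus / cores


def extra_cores_needed(domus: int, cores: int) -> int:
    """Cores to add before every domU could have a core of its own."""
    return max(0, domus - cores)


if __name__ == "__main__":
    for slices in (12, 16):
        print(slices, domus_per_core(slices, 8), extra_cores_needed(slices, 8))
    # prints:
    # 12 1.5 4
    # 16 2.0 8
```

So 12-16 slices on 8 cores is 1.5 to 2 domUs per core, which is plausible when several of them sit mostly idle.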

Depending on how many CPU cores you have, one
server consuming too much CPU time has the potential to adversely
affect other servers.

I found this to be true when each slice ran Ubuntu’s cron.daily tasks.
Urgh! The ‘nice’ value means nothing to a server running within Xen. I
dithered the times that cron.daily runs on slices but would love to know
a better solution. (One slice per core would be one answer.)
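
One way to do that dithering without hand-picking times is to derive a stable offset from each slice's hostname. The following is a minimal Python sketch under stated assumptions: the two-hour window, the 6am start, the hostnames and both helper names are invented for the example, and the generated line imitates the usual Debian/Ubuntu `/etc/crontab` entry for cron.daily, so compare it with your own file before using it.

```python
import hashlib


def cron_daily_offset(hostname: str, window_minutes: int = 120) -> int:
    """Map a hostname to a stable minute offset in [0, window_minutes)."""
    digest = hashlib.sha256(hostname.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % window_minutes


def crontab_line(hostname: str, start_hour: int = 6, window_minutes: int = 120) -> str:
    """Return an /etc/crontab style line that runs cron.daily at the staggered time."""
    offset = cron_daily_offset(hostname, window_minutes)
    hour, minute = start_hour + offset // 60, offset % 60
    return (
        f"{minute} {hour} * * * root test -x /usr/sbin/anacron || "
        "( cd / && run-parts --report /etc/cron.daily )"
    )


if __name__ == "__main__":
    # Hypothetical slice hostnames.
    for name in ("slice-01", "slice-02", "slice-03"):
        print(name, crontab_line(name))
```

Two hostnames can still hash to the same minute, so this spreads the load rather than guaranteeing that no two slices ever overlap.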

We can manage disk more easily

Isolation of disk between virtual servers is a benefit as it reduces the
damage a full disk would cause (our monitoring alerts us well before that
happens though!).

This is largely a moot point. On a standalone server, you would have
at least a RAID 1 setup to handle disk failures.

I was referring to “no space left on device” errors. We got one the
other day when Starling filled a disk. Only that slice felt it.

I agree with you about the RAID though. I’ve never had a disk failure
but I’ve never regretted the money my clients invest in RAID cards.

Thanks again for sharing your experiences Alex. Xen experts seem to be a
bit scarce. I’ve been hearing that kvm is ‘the next big thing’ but
discovering Amazon EC2 runs on Xen has strengthened my view that Xen is
still a great choice for virtualization.

Mike