Joe,
As Ezra said, you can go a long way with two desktop-class systems:
one running Web+app and the other running the database.
Once you have that setup in place, there’s typically a lot you can do
to squeeze extra performance out of it. Both MySQL and (particularly)
Postgres allow for lots of performance enhancements over and above
their default configs; I’ve regularly squeezed 10x performance
improvements out of Postgres simply by following the tuning guidelines
you can find at the Postgres Web site. I think the default Postgres
config was put in place in the days when 300MHz, 64MB machines were
leading-edge gear, and hasn’t been updated for years.
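For a flavour of what that tuning looks like (illustrative values only —
the right numbers depend on your RAM, workload and Postgres version;
the parameter names below are from the stock postgresql.conf):

```conf
# postgresql.conf — illustrative values for a dedicated DB box with
# 1-2GB of RAM; these are NOT drop-in numbers, tune to your hardware.
shared_buffers = 128MB        # the default is tiny on older releases
effective_cache_size = 512MB  # tell the planner how much OS cache exists
work_mem = 8MB                # memory per sort/hash operation
checkpoint_segments = 16      # spread checkpoint I/O out (pre-9.5 setting)
```

Even just the first two of those, set sensibly for the machine, account
for a lot of the easy wins.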
Beyond that, and still on the database, you can optimise your
MySQL/Postgres indexes to cover the most common data searches; you can
select your table types in MySQL to give you the appropriate tradeoff
between speed and data integrity; you can implement partial indexes in
Postgres that dramatically improve the performance of specific, common
searches, particularly if you’re doing AJAX-y type progressive matches
on text data; you can split your data into separate tablespaces under
Postgres to improve performance. There’s a whole lot of stuff you
can do to make these two databases perform faster. Beyond that, you
can re-implement some of your Rails SQL as stored procedures or views,
which can give you a significant performance improvement in specific
cases.
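To make the partial-index and view points concrete, here’s roughly what
they look like in Postgres (the users/orders tables and all column
names are invented purely for illustration):

```sql
-- Suppose most lookups are progressive prefix matches on the names of
-- active users only. A partial index covers just those rows, so it
-- stays small and cheap to search:
CREATE INDEX idx_active_user_names
    ON users (lower(name) text_pattern_ops)
    WHERE active = true;

-- Queries of this shape can then use the index:
--   SELECT * FROM users
--    WHERE active = true AND lower(name) LIKE 'smi%';

-- Moving a common, expensive join behind a view keeps the SQL out of
-- the app and lets you optimise it in one place:
CREATE VIEW recent_orders AS
    SELECT u.name, o.id, o.total
      FROM users u JOIN orders o ON o.user_id = u.id
     WHERE o.created_at > now() - interval '7 days';
```

The `text_pattern_ops` opclass is what lets a b-tree index serve
`LIKE 'prefix%'` queries, which is exactly the AJAX-y progressive-match
case.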
On the app side, you can generally cache a lot of static content. One
thing to watch for is that Web server threads can get locked up
trickle-feeding data to Web browsers over slow links; if you can have
the Web server dump that data to e.g. Squid, and let Squid
trickle-feed it to the Web browser, your Web server threads will be
freed up much sooner, and each thread will be able to process new incoming
requests. If you do the maths about how long it takes to send a Web
page full of data to a Web browser over a 56k dialup link, you’ll
realise that a Web server thread could be locked up for >10 seconds
quite easily.
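A quick back-of-envelope version of that maths (the page size and link
efficiency are assumptions, picked just to show the order of magnitude):

```python
# Back-of-envelope: how long does a dialup client tie up a server thread?
page_size_bytes = 80 * 1024          # assume an 80KB page of HTML + data
link_rate_bps = 56_000               # 56k modem line rate
effective_bps = link_rate_bps * 0.8  # assume ~80% of line rate in practice

seconds = page_size_bytes * 8 / effective_bps
print(round(seconds, 1))  # 14.6 — comfortably over the 10s figure above
```

Shrink the page or fatten the link and the numbers change, but the
conclusion doesn’t: slow clients hold threads for a long time.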
This stuff is all documented if you search for it, and generally not
hard to implement. As others on this list have said repeatedly,
scalability on LAMP is a solved problem these days. The key is to get
a working environment in place, profile it, hunt down the bottlenecks
and start to address them. You need to define what your workload will
look like, then have a way of simulating it, plus you need to have a
way of measuring performance and identifying bottlenecks in your
architecture. Once you’ve got the bottlenecks identified, you can
start to knock them on the head until your performance reaches an
acceptable level; it will reach a point of diminishing returns, where
you’re eventually spending lots of effort to squeeze out that last
1-2% improvement, so you need to be able to define a cutoff point
where things are “fast enough”.
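You don’t need fancy tooling to start measuring, either; even a crude
sampler like this (a sketch only — the URL is a placeholder for your
own app, and p95 here is a naive percentile, nothing rigorous) will
surface the worst offenders:

```python
# Minimal latency sampler: time n sequential GETs of one URL,
# then look at the mean and the tail.
import time
import urllib.request

def sample_latencies(url, n=50):
    """Return an ascending list of per-request wall-clock seconds."""
    times = []
    for _ in range(n):
        start = time.monotonic()
        urllib.request.urlopen(url).read()
        times.append(time.monotonic() - start)
    return sorted(times)

def p95(sorted_times):
    """Naive 95th-percentile value from an ascending list of samples."""
    idx = min(len(sorted_times) - 1, int(0.95 * len(sorted_times)))
    return sorted_times[idx]

# e.g. times = sample_latencies("http://localhost:3000/")
#      print(sum(times) / len(times), p95(times))
# Sanity check of the percentile helper on synthetic data:
print(p95([0.1] * 10 + [1.0] * 10))  # 1.0 — the slow half dominates the tail
```

The mean will lie to you; it’s the tail latencies that your users
actually notice, which is why you watch p95 rather than the average.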
You can get a long way with a simple 2-system setup, but you need to
do your homework to get the most out of it.
…And yes, I do this stuff for a living ;->
Regards
Dave M.