Apache-Mongrel-Rails-SQLite-Multithreading

Hi.

I have read that using SQLite for a high volume web site is not advised
because of the single-threaded, one-at-a-time nature of SQLite and
Rails. They say that it is better to use a multithreaded database server
based database such as MySQL, PostgreSQL, etc. I assume that this is to
be able to “concurrently” handle large number of database access
requests.

However there is a new configuration that uses Apache 2.x (with
mod_load_balancer) to forward requests to a cluster of Mongrels (a Ruby
web server for rails), which then accesses the database through the
Rails application. My understanding is that Apache 2.x is multithreaded.

My question is…if Apache is going to handle the multithreadedness of
the applications, why does the database also have to be multithreaded
for high-volume applications? For example, suppose that I have an
application with Apache 2.x, mod_load_balancer, Mongrel, Rails and my
appilcation and using an SQLite database, what is the compelling reason
to go for this versus MySQL, given that Apache 2.x can handle multiple
web requests at the same time?

For the question above, if there is no good multithreading-related
reason to go to a server-based database, then will I still get
multthreadedness in some degree if I choose to have Apache forward
requests to only one Mongrel/Rails/myApp/SQLite cluster?

Curiously, Jay

Jay,

If any component of the entire application stack is single threaded,
it will become a bottleneck. In the architecture you’ve proposed,
Apache could handle multiple requests simultaneously & then would
pass them onto mod_load_balancer, which in turn passes the request
onto a given instance of Mongrel. When Mongrel then begins to
execute it’s request, it asks SQLite. If SQLite is singlethreaded,
then it must complete it’s query before another mongrel instance can
query it. This means the multi-threaded nature of Apache &
mod_load_balancer is effectively negated as you are database bound.
Although your Apache aspect is multithreaded & you may see some
improvements in capacity for any static content (served by Apache)
and any rails requests which don’t hit the database, it won’t
magically improve the performance of database-bound requests.

Keep in mind that Mongrel is also single threaded, thus the reason
for mod_load_balancer, which hands off requests to multiple instances
of Mongrel (sort of like how Apache 1.3 worked with multiple
processes). If you use a single instance of Mongrel behind
mod_load_balancer, it is providing zero benefit & perhaps due to the
load-balancer overhead might reduce your capacity. Your final
question sounded like that was what you intended.

Niels

Niels Meersschaert wrote the following on 26.02.2007 16:52 :

Keep in mind that Mongrel is also single threaded,

Just to be sure. I believed that Mongrel was multi-threaded, but that
the rails interface was protected by a mutex or the equivalent? Which
means that you could serve multiple requests by a single Mongrel process
as long as they don’t involve Rails (static files, Merb, the Mongrel
upload plugin and so on…).
In fact multiple uploads handled by a single Mongrel process could
matter to me.

Lionel

Hi Niels.

Your argument makes sense. However, let me extend your argument. I
know that MySQL stores data in one file per table, as opposed to
SQLite which uses one file for the whole database. With MySQL, even
though the database server is multithreaded, access to the database
file is “single-threaded” just as with SQLite…i.e. In both cases, we
are file-bound (assuming that the query has multiple requests that
amount to doing a fopen/fread/fclose on the same file), although I
would expect better concurrent performance from MySQL for multiple
queries on different tables.

I guess the core of my argument is that given that we are file-bound
in both cases, what does two cascading multithreading systems (Apache
and MySQL database server) buy us versus only one (Apache)?

You guessed correctly at the end which is that I would like to use
Apache to front-end my single threaded Mongrel/Rails/SQLite
application. The main motivation for this is to use its free
multithreading from Apache.

Cheers, Jay

Hi Lionel.

I think you have best described the core of the problem, and that is
the one regarding the granularity of locking. It seems to me that both
MySQL and SQLite use file-level locking, but MySQL uses one file per
table instead of one file for the whole database as SQLite does.

However, the core question remains (slightly modified). If I have a
web application that needs to be multithreaded to support concurrent
access to the database file (whether in MySQL or SQLite), what is the
advantage of having two cascading multithreaded entities (Apache and
the MySQL database server) in front of the lock-bound database file,
versus having only one multithreaded entity (Apache) fronting the
SQLite database file?

Curiously, Jay

P.S. My understanding is that SQLite supports one concurrent writer
and multiple concurrent readers.

On Feb 26, 11:41 am, Lionel B. [email protected]

Jay Godse wrote the following on 26.02.2007 17:17 :

queries on different tables.

I guess the core of my argument is that given that we are file-bound
in both cases, what does two cascading multithreading systems (Apache
and MySQL database server) buy us versus only one (Apache)?

The problem is not exactly the threading in itself but the amount of
locking done on the database. The more fine-grained locking your
database system can do, the less it will get in the way of multiple
clients accessing the database for different needs (which aren’t
actually slowed down by disks but by locks waiting for clients to finish
a transaction which might imply heavy processing on the clients to be
released). SQLite was originally designed with a single client in mind,
no concurrent access whatsoever. It introduced multiple simultaneous
clients along the way but it probably as a long catch-up to do if it
wants to rival full-fledged RDBMS like MySQL or PostgreSQL. I don’t
think SQLite tries to address multiple clients performance needs so I
wouldn’t hold my breath…

I believe PostgreSQL to have theoritically the best concurrent access
involving writes and transactions (if compared to InnoDB, MyISAM is
another matter). Unfortunately benchmarks are lacking to verify it, if
someone knows of transactional benchmarks comparing MySQL with InnoDB
and PostgreSQL, I’d like to know.

Lionel.

At least with InnoDB, Mysql has row level locking. All of your your
actual data is in just a handful of files, not split by table (in
mysql/data/ibdata1, mysql/data/ibdata2 etc…). No file level locking
going around there.

SQL lite can have multiple concurrent readers or 1 concurrent writer,
not both. If anyone, anywhere is writing a change to the database, then
absolutely everyone else has to wait for that update to complete before
being able to do anything at all.

For most uses that is not an acceptable situation.

Fred

On 2/26/07, Lionel B. [email protected] wrote:

I believe PostgreSQL to have theoritically the best concurrent access
involving writes and transactions (if compared to InnoDB, MyISAM is
another matter). Unfortunately benchmarks are lacking to verify it, if
someone knows of transactional benchmarks comparing MySQL with InnoDB
and PostgreSQL, I’d like to know.

Not sure how/what these guys are testing, but their graphs do seem to
back your view:

But don’t pick postgres just because it’s faster. Use it because it’s
standards-compliant and well-behaved, unlike mysql.

Isak

Isak H. wrote the following on 27.02.2007 09:48 :

back your view:
Database test: dual Intel Xeon 5160 - Overview - Engelstalig - Tweakers
Database test: Sun UltraSparc T1 vs. AMD Opteron - MySQL vs. PostgreSQL - Engelstalig - Tweakers

Good, real-case scenarios with recent versions! They even say that their
SQL code is initially tuned for MySQL and there’s probably still room
for improvements…

But don’t pick postgres just because it’s faster. Use it because it’s
standards-compliant and well-behaved, unlike mysql.

That was my initial motivation for choosing PostgreSQL, trust me I know
a few of MySQL’s liberties with SQL and limitations from experience :slight_smile:
But I wanted to have an early peak (before performing benchmarks myself
on final code) at the concurrent access behaviour.

Thank you very much.

Lionel.