Rafael S. wrote:
I am in the design process of porting a fairly large client/server app
to Rails. The average data set is about 200 MB per database/server.
Altogether there are about 100 tables and about 500 stored procedures.
Obviously this is a complete rewrite of the app, but while I am at it
I might as well solve some outstanding issues. Anyhow here are the
Before you go farther, think about how you will write unit tests for all
For example, in some circumstances, I would write tests on the
original system, and then port the tests as I port the code. I’m aware
that myriad cultural and technical issues might conflict with that
goal. But your job now is to extract as many latent business rules
from the existing system as you can, so don’t go overlooking any.
Next, why are you rewriting? A noted process guru, either Ward
Cunningham or Martin F., recommends a “strangler fig” strategy. If
you look up the lifecycle of a strangler fig you’ll understand
immediately. You should ask the customer what’s the most important
feature to add to the existing system, and you add it using Rails.
Then you also add a few easy features that replace parts of the existing
system, “for free”. Repeat like this, ostensibly adding requested features
with replacement features (and releasing them) until your fig
completely covers and strangles this obsolete tree.
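Here is a rough sketch of how that routing might look, assuming you can put one front door in front of both systems. Every name in it (LegacySystem, NewSystem, FrontDoor, the feature symbols) is invented for illustration; it is not anybody's published implementation of the strategy.

```ruby
# Hypothetical sketch of a strangler-fig front door. All class and
# feature names are invented for illustration.
class LegacySystem
  def handle(feature)
    "legacy:#{feature}"
  end
end

class NewSystem
  def handle(feature)
    "rails:#{feature}"
  end
end

class FrontDoor
  def initialize(legacy, replacement)
    @legacy = legacy
    @replacement = replacement
    @ported = []   # features the new code has taken over so far
  end

  # Call this at each release, as another feature moves to the new code.
  def port(feature)
    @ported << feature unless @ported.include?(feature)
  end

  # Send ported features to the new system, everything else to the old one.
  def handle(feature)
    target = @ported.include?(feature) ? @replacement : @legacy
    target.handle(feature)
  end

  # True once nothing is left for the old system to do.
  def strangled?(all_features)
    (all_features - @ported).empty?
  end
end
```

Each release calls port for the requested feature plus a few easy replacements, and users keep working against one system the whole time.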
Your current plan expects to work a very long time without any
releases - that is a super-bad strategy because the longer you go, the
higher the risk of a mismatch between what you write and what users
- Has anybody had any experience with very large databases
Yes. Databases are screwy specifically because they enable huge data
sets, so I know that if I use ActiveRecord correctly, and index the
foreign-key columns to speed up queries, I should be safe.
If this is important, some of your early unit tests should deal with
ten billion records, to see what happens. (I would give such a test
batch a “fast mode”, to only deal with ten records for most runs. I
would run the “slow mode” only after upgrading the database.)
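A minimal sketch of that fast/slow switch, assuming an environment variable picks the mode. The SLOW_TESTS variable, both counts, and the helper names are made up for this example, and the slow count is only a stand-in you would raise toward your real worst case.

```ruby
# Sketch only: SLOW_TESTS and both counts are invented for illustration.
FAST_RECORD_COUNT = 10
SLOW_RECORD_COUNT = 1_000_000   # stand-in; raise it toward your real worst case

# Pick the batch size from the environment; default to fast mode.
def record_count(env = ENV)
  env["SLOW_TESTS"] == "1" ? SLOW_RECORD_COUNT : FAST_RECORD_COUNT
end

# Build hypothetical attribute hashes. A real test would hand each one
# to a model's create call, then assert on the queries under test.
def sample_customers(count)
  (1..count).map { |i| { :name => "customer#{i}", :client_id => i % 3 } }
end
```

Most runs then call sample_customers(record_count) and get ten rows; setting SLOW_TESTS=1 before the run switches the same tests to the big batch.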
You need to learn “migrations”, because in your case you might find
yourself actually migrating data out of the old database and into the
new one. Alternatively, you could simply point your database.yml at the
old database, and set its table names in ActiveRecord explicitly.
Next, always test that your models have a working destroy option. (Note
that some databases should never destroy records, and should use
acts_as_versioned instead.) Either way, destroy will fail if a foreign
key would break, so add lots of unit tests that destroy things, as you
add tests and features that construct things.
Did you use one database per client or multiple “client data
sets” in one database?
I would do the simplest thing that could possibly work here. Databases
are designed to index and cross-ref arbitrarily huge data blocks, so
if the engine doesn’t care about multiple customer databases, then I
- If I decide to go with multiple client datasets stored in one
database, is there an easy way for ActiveRecord to limit result sets
by login id or client id?
Read /Rails Recipes/ (it covers some of your other questions), and
then use with_scope(:find) to run a set of find(:all) commands across
one subset of the database, not all of it.
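To show the idea without needing Rails installed, here is a plain-Ruby sketch of a block inside which every find is restricted to one client. ScopedTable and its methods are invented for this example and are not ActiveRecord's API; as far as I recall, the real call in Rails of this vintage looks roughly like Customer.with_scope(:find => { :conditions => [...] }) { ... }, so check the docs for your version.

```ruby
# Plain-Ruby illustration only. ScopedTable is invented for this sketch
# and is NOT ActiveRecord; it just mimics the shape of a scoped find.
class ScopedTable
  def initialize(rows)
    @rows = rows
    @scope = {}   # conditions applied to every find inside a block
  end

  # Restrict all finds inside the block, then restore the old scope.
  def with_scope(conditions)
    previous = @scope
    @scope = previous.merge(conditions)
    yield
  ensure
    @scope = previous
  end

  # Return rows matching the caller's conditions and the active scope.
  # The scope wins on conflict, so a caller cannot escape its client.
  def find_all(conditions = {})
    wanted = conditions.merge(@scope)
    @rows.select { |row| wanted.all? { |key, value| row[key] == value } }
  end
end

customers = ScopedTable.new([
  { :name => "Acme", :client_id => 1 },
  { :name => "Bolt", :client_id => 2 },
  { :name => "Acme", :client_id => 2 }
])

# Inside the block, a find by name only sees client 2's rows.
scoped = customers.with_scope(:client_id => 2) do
  customers.find_all(:name => "Acme")
end
```

The point is that the per-client restriction lives in one place (say, a filter that wraps each request), so the individual finds stay as short as the unscoped ones.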
Obviously it would make a large difference in development time if you
could use something like Customer.find_by_name vs Customer.find
(search by name and loginid or some other restriction to the dataset).
Maybe. I have found that unit testing has a much greater influence on
development time. For example, if I want to write a cheap simple
command, but don’t know if it will work as well as a long crufty
command, I can run the tests and see.
I know this example is trivial but there will be a lot of queries to
the database that will have to restrict the data based on who is
logged in and this will all add up.
That is /Rails Recipes’s/ exact scenario for with_scope.
- And finally a question about REST. Does anybody know if the
original philosophy of developing apps in rails (controller/action/id)
will be deprecated and eliminated in future versions of Rails in favor
Balderdash. controller/action/id IS the simplest form of REST. It’s
here to stay.
I have dabbled with some restful controllers and to be
truthful I prefer the original way. There are some scenarios where I
will use REST but I would hate to be forced to do the complete app in
a restful way.
What URLs would your customers like?
http://c2.com/cgi/wiki?ZeekLand <-- NOT a blog!!