Question on application/database design for an application po

maunzinha · March 13, 2007, 7:34pm

Hi all

I am hoping that the experience and knowledge of this ‘list’ will be
able to help me out with some design decisions I have to make while
porting a desktop app to rails.

Here is the scoop:
I am in the design process of porting a fairly large client/server app
to rails. Average data set is about 200 MB per database/server.
Altogether there are about 100 tables and about 500 stored procedures.
Obviously this is a complete rewrite of the app, but while I am at it
I might as well solve some outstanding issues. Anyhow here are the
questions:

Has anybody had any experience with very large databases using
rails? Did you use one database per client or multiple “client data
sets” in one database?
If I decide to go with multiple client datasets stored in one
database is there an easy way for active record to limit result sets
by login id or client id?

Obviously it would make a large difference in development time if you
could use something like Customer.find_by_name vs Customer.find
(search by name and loginid or some other restriction to the dataset).

I know this example is trivial but there will be a lot of queries to
the database that will have to restrict the data based on who is
logged in and this will all add up.

And finally a question about REST. Does anybody know if the
original philosophy of developing apps in rails (controller/action/id)
will be deprecated and eliminated in future versions of Rails in favor
of REST? I have dabbled with some restful controllers and to be
truthful I prefer the original way. There are some scenarios where I
will use REST but I would hate to be forced to do the complete app in
a restful way.

Thanks, I would really appreciate any input.

Rafael

–
http://www.bdcsoftware.com - Automotive CRM

maunzinha · March 13, 2007, 7:58pm

Rafael S. wrote:

I am in the design process of porting a fairly large client/server app
to rails. Average data set is about 200 MB per database/server.
Altogether there are about 100 tables and about 500 stored procedures.
Obviously this is a complete rewrite of the app, but while I am at it
I might as well solve some outstanding issues. Anyhow here are the
questions:

Before you go farther, think about how you will write unit tests for all
that.

For example, in some circumstances, I would write tests on the
original system, and then port the tests as I port the code. I’m aware
that myriad cultural and technical issues might conflict with that
goal. But your job now is to extract as many latent business rules
from the existing system as you can, so don’t go overlooking any.

Next, why are you rewriting? A noted process guru, either Ward
Cunningham or Martin F., recommends a “strangler fig” strategy. If
you look up the lifecycle of a strangler fig you’ll understand
immediately. You should ask the customer what’s the most important
feature to add to the existing system, and you add it using Rails.
Then you also add a few easy features that replace the existing system
“for free”. Repeat like this, ostensibly adding requested features
with replacement features (and releasing them) until your fig
completely covers and strangles this obsolete tree.
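
A minimal sketch of the strangler-fig idea, in plain Ruby rather than Rails. StranglerRouter, the path prefixes, and the two lambdas are hypothetical names for illustration only: each feature that has been ported is registered with the new code, and everything else keeps falling through to the legacy system.

```ruby
# Hypothetical strangler-fig dispatcher: plain Ruby, not a Rails API.
# Ported paths go to the new code; everything else still reaches
# the legacy system.
class StranglerRouter
  def initialize(legacy_app)
    @legacy_app = legacy_app
    @ported = {} # path prefix => new handler
  end

  # Register a feature that the new app has taken over.
  def port(prefix, handler)
    @ported[prefix] = handler
  end

  def call(path)
    # Prefer the longest matching prefix so specific routes win.
    prefix = @ported.keys.select { |p| path.start_with?(p) }.max_by(&:length)
    handler = prefix ? @ported[prefix] : @legacy_app
    handler.call(path)
  end
end

legacy = ->(path) { "legacy:#{path}" }
router = StranglerRouter.new(legacy)
router.port("/reports", ->(path) { "rails:#{path}" })

router.call("/reports/monthly") # handled by the new code
router.call("/customers/42")    # still handled by the legacy app
```

Each release moves one more prefix into the new app, so the legacy handler shrinks without a big-bang cutover.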

Your current plan expects to work a very long time without any
releases - that is a super-bad strategy because the longer you go, the
higher the risk of a mismatch between what you write and what users
need.

Has anybody had any experience with very large databases

Yes.

using rails?

Yes. Databases are screwy specifically because they enable huge data
sets, so I know if I use ActiveRecord correctly, and add a few foreign
keys to speed up queries, I should be safe.

If this is important, some of your early unit tests should deal with
ten billion records, to see what happens. (I would give such a test
batch a “fast mode”, to only deal with ten records for most runs. I
would run the “slow mode” only after upgrading the database.)
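
One way to sketch the fast/slow switch, assuming an environment variable named TEST_MODE (an invented name, not a Rails convention) and a scaled-down "slow" size that you would tune for your own database:

```ruby
# Hypothetical helper: pick the data-set size for a volume test.
# TEST_MODE is an assumed environment variable, not a Rails convention.
FAST_RECORDS = 10
SLOW_RECORDS = 1_000_000 # stand-in for "very large"; tune for your database

def volume_test_size(env = ENV)
  env["TEST_MODE"] == "slow" ? SLOW_RECORDS : FAST_RECORDS
end

# Build n fake rows so the same test body can run in either mode.
def build_rows(n)
  (1..n).map { |i| { id: i, name: "customer-#{i}" } }
end

rows = build_rows(volume_test_size({})) # empty env => fast mode
```

Everyday runs stay quick, and the same test body runs against the big data set when TEST_MODE=slow is set.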

You need to learn “migrations”, because in your case you might find
yourself actually migrating data out of the old database and into the
new one. Alternately, you could simply point your database.yml at the
old database, and set its table names in ActiveRecord explicitly.

Next, always test that your models have a working destroy option. (Note
that some databases should never destroy records, and should use
acts_as_versioned instead.) Either way, destroy will fail if a foreign
key would break, so add lots of unit tests that destroy things, as you
add tests and features that construct things.

Did you use one database per client or multiple “client data
sets” in one database?

I would do the simplest thing that could possibly work here. Databases
are designed to index and cross-ref arbitrarily huge data blocks, so
if the engine doesn’t care about multiple customer databases, then I
don’t.

If I decide to go with multiple client datasets stored in one
database is there an easy way for active record to limit result sets
by login id or client id?

Read /Rails Recipes/ (it covers some of your other questions), and
then use with_scope(:find) to run a set of find(:all) commands across
one database subset, not :all.
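
To show the idea behind with_scope without depending on a Rails install, here is a plain-Ruby stand-in. It is not ActiveRecord: ScopedTable, find_all, and the row hashes are hypothetical. The point is only that every find inside the block is limited by the scope as well as by its own conditions.

```ruby
# Plain-Ruby stand-in for the idea behind with_scope; NOT ActiveRecord.
# ScopedTable and its methods are hypothetical names.
class ScopedTable
  def initialize(rows)
    @rows = rows
    @scopes = [] # stack of condition hashes currently in effect
  end

  # Every find inside the block is also limited by `conditions`.
  def with_scope(conditions)
    @scopes.push(conditions)
    yield
  ensure
    @scopes.pop
  end

  # Return rows matching the caller's conditions AND all active scopes.
  def find_all(conditions = {})
    filters = @scopes + [conditions]
    @rows.select do |row|
      filters.all? { |f| f.all? { |key, value| row[key] == value } }
    end
  end
end

customers = ScopedTable.new([
  { client_id: 1, name: "Ann" },
  { client_id: 2, name: "Ann" },
  { client_id: 1, name: "Bob" }
])

scoped = customers.with_scope({ client_id: 1 }) do
  customers.find_all({ name: "Ann" })
end
```

Because the scope is ANDed with the caller's conditions, a find inside the block cannot widen the result set by passing a different client_id.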

Obviously it would make a large difference in development time if you
could use something like Customer.find_by_name vs Customer.find
(search by name and loginid or some other restriction to the dataset).

Maybe. I have found that unit testing has a much greater influence on
development time. For example, if I want to write a cheap simple
command, but don’t know if it will work as well as a long crufty
command, I can run the tests and see.

I know this example is trivial but there will be a lot of queries to
the database that will have to restrict the data based on who is
logged in and this will all add up.

That is /Rails Recipes’s/ exact scenario for with_scope.

And finally a question about REST. Does anybody know if the
original philosophy of developing apps in rails (controller/action/id)
will be deprecated and eliminated in future versions of Rails in favor
of REST?

Balderdash. controller/action/id IS the simplest form of REST. It’s
here to stay.

I have dabbled with some restful controllers and to be
truthful I prefer the original way. There are some scenarios where I
will use REST but I would hate to be forced to do the complete app in
a restful way.

What URLs would your customers like?

–
Phlip
http://c2.com/cgi/wiki?ZeekLand ← NOT a blog!!

maunzinha · March 13, 2007, 8:05pm

Has anybody had any experience with very large databases using
rails? Did you use one database per client or multiple “client data
sets” in one database?

This one isn’t for me.

If I decide to go with multiple client datasets stored in one
database is there an easy way for active record to limit result sets
by login id or client id?

Obviously it would make a large difference in development time if you
could use something like Customer.find_by_name vs Customer.find
(search by name and loginid or some other restriction to the dataset).

I know this example is trivial but there will be a lot of queries to
the database that will have to restrict the data based on who is
logged in and this will all add up.

You said it: Customer.find_by_name
Also:
    Customer.find_by_name_and_id
    Customer.find(:all, :conditions => ['name = ? and id = ?', name, id])
Also, the associations that you make in Rails allow you to restrict
stuff by default.
So, firstly, if you want to find all the (just making this up) locations
belonging to a single user, you can do:
    class User < ActiveRecord::Base
      has_many :locations
    end
    class Location < ActiveRecord::Base
      belongs_to :user
    end
Then, when you type ‘user.locations’, it will automatically perform the
search as Location.find(:all, :conditions => ['user_id = ?', user.id])

And finally a question about REST. Does anybody know if the
original philosophy of developing apps in rails (controller/action/id)
will be deprecated and eliminated in future versions of Rails in favor
of REST? I have dabbled with some restful controllers and to be
truthful I prefer the original way. There are some scenarios where I
will use REST but I would hate to be forced to do the complete app in
a restful way.

The chance of this being completely deprecated in favor of REST is very,
very small.
REST seems, at least to me, to be a good design principle, and it is
extremely good for machine-to-machine communication, but there are times
when violating REST, or just programming an app with a different design,
can be desired or even required. That being said, as Rails is a
“convention over configuration” type of framework, expect that they will
be adding more and more shortcuts for developing RESTful apps, if that’s
the most common usage… just don’t expect them to remove the ability to
do anything other than REST.

maunzinha · March 13, 2007, 10:17pm

Phlip,

thanks for your reply. To clarify a few things:

The reason for a port is that my customers are demanding a
web/online version of my application.
I have always written everything in chunks and released in chunks,
so no plans to change there. In addition, KISS is my mantra.
Of course I will use migrations for the db, although I might stick
with ar_fixtures for data migration.

Before you go farther, think about how you will write unit tests for all that.

Well, going with one database per client would simplify writing tests,
but I don’t think it will be the deciding factor. But I will have to
take a closer look at it anyhow since in the past I have neglected
some of the testing…

Yes. Databases are screwy specifically because they enable huge data
sets, so I know if I use ActiveRecord correctly, and add a few foreign
keys to speed up queries, I should be safe.

Yes they are, aren’t they? The current back-end for my desktop app is
Firebird, which is a very fine database but also a very unforgiving one.
If you write sloppy SQL it will kill your performance. I thought about
using it with AR, but the Firebird adapter is still fairly new and I
don’t know if it will do the job. Also some of the auto-created joins
by AR will never perform well with Firebird so it would require quite
extensive use of stored procs. Well, that leaves PostgreSQL or MSSQL.
(MySQL is not an option for me)

Read /Rails Recipes/ (it covers some of your other questions), and
then use with_scope(:find) to run a set of find(:all) command across
one database subset, not :all.

I looked up nested_with_scope plugin and it looks very promising. I
will have to do some performance tests and see the SQL output

Balderdash. controller/action/id IS the simplest form of REST. It’s
here to stay.

Let’s hope.

maunzinha · March 14, 2007, 9:36pm

This is just an fyi so that anybody searching the list archives gets
to see a solution to this problem. Anyhow, the key was with_scope
(thanks Phlip).

I have decided to store all data sets in one database and just add
an account_id column to each table. This will allow for searching with
account.customers.find or account.orders.find etc.

In addition, each model (except account) will have the following two
methods added:

    # The hard-coded id is a placeholder; see the note below the code.
    def self.find(*args)
      with_scope :find => {:conditions =>
          ['account_id=?', '123456789012345678901234567890XX']} do
        super
      end
    end

    # Dynamic finders (find_by_...) go through method_missing, so scope them too.
    def self.method_missing(method, *args, &block)
      with_scope :find => {:conditions =>
          ['account_id=?', '123456789012345678901234567890XX']} do
        super(method, *args, &block)
      end
    end

where the account id will be dynamic based on the user currently
logged in. These two methods will allow for searching by
Customer.find(:all) or Customer.find_by_last_name and still have the
result set filtered by account id.
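
For anyone wondering how the account id could be made dynamic, here is one possible sketch in plain Ruby. It is not ActiveRecord and not necessarily what I will end up with: CurrentAccount and AccountScopedStore are hypothetical names, and keeping the id in a thread-local is an assumption about how the logged-in user would be tracked per request.

```ruby
# Hypothetical sketch: plain Ruby, not ActiveRecord. CurrentAccount and
# AccountScopedStore are illustrative names.
module CurrentAccount
  # Keep the id per thread so concurrent requests do not leak into each other.
  def self.id
    Thread.current[:current_account_id]
  end

  # Set the account for the duration of the block, then restore the old one.
  def self.use(account_id)
    previous = Thread.current[:current_account_id]
    Thread.current[:current_account_id] = account_id
    yield
  ensure
    Thread.current[:current_account_id] = previous
  end
end

class AccountScopedStore
  def initialize(rows)
    @rows = rows
  end

  # Every lookup is filtered by the current account; callers never pass it.
  def find_all(conditions = {})
    account_id = CurrentAccount.id
    raise "no account set" if account_id.nil?

    @rows.select do |row|
      row[:account_id] == account_id &&
        conditions.all? { |key, value| row[key] == value }
    end
  end
end

orders = AccountScopedStore.new([
  { account_id: "A", total: 10 },
  { account_id: "B", total: 10 }
])

result = CurrentAccount.use("A") { orders.find_all({ total: 10 }) }
```

Raising when no account is set is a deliberate choice in this sketch: an unscoped query fails loudly instead of returning every client's rows.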

I have not implemented the :create option for with_scope because each
of my models already has a before_create method implemented in a
helper that sets the id to a GUID, so it was a no-brainer to add setting
of account id.
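
Roughly what that hook does, as a plain-Ruby sketch rather than my actual helper. stamp_new_record and the :id / :account_id keys are hypothetical names, and SecureRandom.uuid is just one way to produce a GUID:

```ruby
require "securerandom"

# Hypothetical stand-in for a before_create hook; not ActiveRecord.
# `stamp_new_record` and the :id / :account_id keys are assumed names.
def stamp_new_record(attributes, account_id)
  attributes.merge(
    id: SecureRandom.uuid,  # GUID primary key
    account_id: account_id  # tie the row to the logged-in account
  )
end

record = stamp_new_record({ last_name: "Smith" }, "A")
```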

Anyhow, thanks for the help

Rafael

–
http://www.bdcsoftware.com - Automotive CRM

maunzinha · March 13, 2007, 8:49pm

Rafael,

see inline

Rafael S. wrote:

Obviously this is a complete rewrite of the app, but while I am at it

Obviously it would make a large difference in development time if you
could use something like Customer.find_by_name vs Customer.find
(search by name and loginid or some other restriction to the dataset).

I haven’t seen anything in RoR that will automatically restrict the
select set based on some criteria…so you will probably need to do it
yourself.

will use REST but I would hate to be forced to do the complete app in
a restful way.

You are able to configure how your URLs are routed to your controllers
and actions by modifying the application’s config/routes.rb file. I don’t
believe that RoR will dictate that you should use RESTful URIs in your
application.
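
To make the controller/action/id convention concrete, here is a small plain-Ruby matcher. It only illustrates the URL shape; it is not how the Rails router is implemented, and ROUTE_PATTERN, recognize, and the "index" default are assumptions for the example.

```ruby
# Hypothetical matcher for a ":controller/:action/:id" style URL.
# Plain Ruby for illustration; the real Rails router is configured in
# config/routes.rb and works differently.
ROUTE_PATTERN = %r{\A/(?<controller>\w+)(?:/(?<action>\w+))?(?:/(?<id>\w+))?/?\z}

def recognize(path)
  match = ROUTE_PATTERN.match(path)
  return nil unless match

  {
    controller: match[:controller],
    action: match[:action] || "index", # assumed default action
    id: match[:id]
  }
end

recognize("/customers/show/42") # controller, action and id all present
recognize("/customers")         # falls back to the assumed default action
```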

cheers