Putting it all in one place with Schemas

I’m still relatively new to RoR, but I like what I see with the
database-neutral approach of schemas. In the spirit of keeping things
simple and minimizing the number of files and locations of information,
does ActiveState currently allow me to place other low-level model
information, such as validation requirements and associations,
inside the schema rather than in a separate file which specifies the
“model”? In this schema-as-model scheme, ActiveState could import this
information into the model along with the other schema information, and
ActiveState models could automagically appear without having to create
separate model files. Of course, additional model functionality that
isn’t so close to the DB could still be kept in a separate model file.

Extending the example from the migrations movie:

ActiveRecord::Schema.define do

  create_table "comments" do |t|
    t.column "body", :text
    t.belongs_to :post, :counter_cache => true
  end

  create_table "posts" do |t|
    t.column "title", :string
    t.column "body", :text
    t.column "created_at", :datetime
    t.column "author_name", :string
    t.column "comments_count", :integer, :default => 0
    t.has_many :comments
    t.has_and_belongs_to_many :topics
  end

  create_table "topics" do |t|
    t.column "title", :string
    t.has_and_belongs_to_many :posts
  end
end

Since tables already map directly to models, ActiveState could simply
import this additional low-level information into model classes, which
could spring forth into existence without requiring a separate model
file. “t.belongs_to :post” would automatically create the appropriate
foreign key (post_id). “t.has_and_belongs_to_many :topics” would
automagically create the association table (a tedious task that should
not merit explicit keyboard time) if not already present. Any thoughts?

Any thoughts?

I think you mean Active Record, but no, we don’t have any plans for
putting association definitions into the schema. The definitions
serve a grander purpose than just defining a foreign key. They’re
also a spot for adding query details, like the order or the
conditions, for specifying eager loading with :include, and for
allowing multiple associations on the same foreign keys.

In other words, they allow for a ton of specialization and even logic
that’s a bad fit for inclusion in a schema file.
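
A minimal sketch of that kind of specialization, with hypothetical Post
and Comment models: the same post_id foreign key backs two differently
scoped associations, each carrying its own conditions and ordering, and
eager loading is requested separately with :include at query time.

class Post < ActiveRecord::Base
  # comments ordered newest-first
  has_many :comments, :order => "created_at DESC"

  # a second association over the same post_id foreign key,
  # narrowed by a condition
  has_many :approved_comments,
           :class_name  => "Comment",
           :foreign_key => "post_id",
           :conditions  => "approved = 1",
           :order       => "created_at DESC"
end

# eager loading is then asked for where the query is made
posts = Post.find(:all, :include => :comments)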

David Heinemeier H.
http://www.loudthinking.com – Broadcasting Brain
http://www.basecamphq.com – Online project management
http://www.backpackit.com – Personal information manager
http://www.rubyonrails.com – Web-application framework

On Jan 8, 2006, at 9:55 PM, David Heinemeier H. wrote:

In other words, they allow for a ton of specialization and even logic
that’s a bad fit for inclusion in a schema file.

How about not having a schema file as the default convention?

Watching the “Snakes & Rubies” video, I liked the Django approach
of having the model generate the schema (at least that’s what I think
I saw :)).

Right now Rails seems un-pretty to me in this one way, since it
violates DRY. I have to define the attribute in the schema, and then
define the attribute again in the class to put validation logic on
it. I’m constantly clicking back and forth between my schema.sql
file and my various classes while setting up my model; this tells me
something is amiss :)
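
For illustration, the duplication I mean looks roughly like this
(hypothetical users table; validates_length_of and validates_format_of
are the standard Rails validation macros):

# the column and its length live in the schema/migration...
create_table "users" do |t|
  t.column "email", :string, :limit => 100
end

# ...and the same attribute is repeated in the model just to hang
# validation logic on it
class User < ActiveRecord::Base
  validates_length_of :email, :maximum => 100
  validates_format_of :email, :with => /@/
end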

For the case of non-legacy schemas, this would be very nice. It
would let the Rails db driver pick the ‘right’ type for columns
(including solving the bool problem, since Rails wouldn’t have to
guess whether an int column was a bool or not).

-tim

For the case of non-legacy schemas, this would be very nice. It
would let the Rails db driver pick the ‘right’ type for columns
(including solving the bool problem, since Rails wouldn’t have to
guess whether an int column was a bool or not).

Dig into migrations[1] and you’ll see why the generate-once-from-model
approach is not all that appealing. Basically, it only works until you
start caring about your data, which should be not long after you’ve
started your project if you’re using an agile approach of delivering
running, valuable software early and often.

[1] http://media.rubyonrails.org/video/migrations.mov
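
For reference, the incremental style the movie shows expresses each
change as a small reversible delta applied to a database that already
holds data, roughly like this (table and column names are just
illustrative):

class AddAuthorEmailToPosts < ActiveRecord::Migration
  def self.up
    add_column :posts, :author_email, :string
  end

  def self.down
    remove_column :posts, :author_email
  end
end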

David Heinemeier H.
http://www.loudthinking.com – Broadcasting Brain
http://www.basecamphq.com – Online project management
http://www.backpackit.com – Personal information manager
http://www.rubyonrails.com – Web-application framework

Thanks for all of the replies. I do wish it were possible to somehow
fold associations and validation code into the schema, as these are
relatively low-level concerns, even if some of them are implemented
within the database and others are implemented within the model class.
In fact, depending on the database involved and the options specified,
the schema processor could intelligently assign some “model” duties to
the database (using stored procedures or triggers, etc.) for increased
efficiency with no other apparent change in model behavior. It would be
nice to be able to just glance at one concise data structure and get
a bird’s-eye view instead of having to look across multiple files to see
what makes up the data in your model. Furthermore, it seems it may
be desirable to allow configuration regarding associations and
validation to “migrate” along with the table schemas, though I suppose
the model files could be kept in CVS in lock step with each schema migration.

Maybe there can be improvements here in creating a UML sort of interface,
which seems the fastest way to design tables (classes); in fact, the
Agile Rails book’s initial “paper sketches” on pg. 51 are like this.
Create Table formats are simple enough, though one could even think of
the nice table editors from the dBASE or Access days. The program
initially setting up the db tables (in this fashion) should generate all
the necessary models (and scaffolds if desired); then, as the initial
table creates are changed, the program should be smart enough to do the
necessary Alter Tables and have the changes cascade through (which
scaffolds already seem to do). Migrations are a really powerful start in
this direction; just a few extras and some ease-of-use polishing could
make them outstanding.

Likewise, this same sort of approach should allow new databases to be
created and switched within a Rails program on the fly, as an earlier
poster was seeking (I guess multiple databases are a tradeoff: one mega
db keeps all the data in one place, vs. multiple databases allowing more
performance, decentralized security risk, etc.).

Just some thoughts.

On 1/12/06, Sorenson [email protected] wrote:

what makes up the data in your model. Furthermore, it seems it may
be desirable to allow configuration regarding associations and
validation to “migrate” along with the table schemas, though I suppose
the model files could be kept in CVS in lock step with each schema migration.

Would it be possible to define a meta-language in Ruby that matches
the capabilities of a SQL stored procedure language? My idea is that you
would write your schema and stored procedures in this meta-language;
then you could do several things with it. One would be to run a tool
on it that turns it into SQL and then loads the SQL into your db.
Second, the Rails app could load it and use it to describe the
database. It could then check whether the stored procedures were
there and use them if found. If they aren’t there (SQLite), the SP
language would be emulated in Rails using basic CRUD statements.

Note that this meta-language could look a lot more like Ruby than SQL.
It just needs to restrict what you can do so that it can be mapped
back into SQL. Another approach to this concept would be to use SQL as
the meta-language. This is already happening at a low level when
Rails queries the table definitions.
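
That low-level reflection is already available through the connection;
for example (assuming a posts table exists):

# Rails reads the table definition back out of the database and wraps
# each column in an object with a name and a mapped type
ActiveRecord::Base.connection.columns("posts").each do |column|
  puts "#{column.name}: #{column.type}"   # e.g. "title: string"
end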

The bottom line is that the SPs end up as methods on the models, and
Rails is aware of what they are doing. Sorenson is onto something when
talking about how DRY is violated at the SQL layer.

To take this a little further, could the description of the table be
part of the model? (It doesn’t have to look like SQL.) Then the
meta-language SPs would also be located right in the model. At app
startup you would just verify that the model matches the db and, if not,
abort or trigger a migration to make things match.

What is the right place for DRY and the database? 1) Start from SQL,
translate it to a Rails model, and mix in the other methods of the model
that don’t touch the db (the current scheme), or 2) start from the
model, generate the SQL, and verify that things match?


Jon S.
[email protected]

I think that schema in model is a very good idea. It can make
development much faster.

I usually start with migrations, run them, create model/controller/view,
new migrations, run, etc.

But it happens that I don’t know the attributes for the model anymore
(e.g. does something have a name or a title?). I used to look at the
migrations, but the problem is: in which migration is my attribute? Now
I start my database manager and look at the tables. This slows me down.

Let’s create a blog application, start with the model:

class Post < ActiveRecord::Base
  attribute :title, :string
  attribute :summary, :string
  attribute :content, :text
end

Then we run a rake task that updates the database, OR EVEN BETTER, we
just start our browser, go to the page, and Rails updates the table
automatically.

Now, we want to have authors:

class Author < ActiveRecord::Base
  has_many :posts

  attribute :name, :string

  attribute :email, :string do |e|
    e.validate_format :with => /regex/
  end
end

This creates a table called authors, and 2 columns: name and email. It
also adds a validation to email.

An author has posts, we need to update that:

class Post < ActiveRecord::Base
  belongs_to :author # automatically adds author_id to the table

  attribute :title, :string do |t|
    t.validate_presence
    t.validate_length :in => 3..100
  end

  attribute :summary, :text
  attribute :content, :text
end

This model belongs_to :author. That creates an author_id. If you do this
with migrations, you are not following the DRY principle: the relationship
is in your schema AND in your model.

New validations for the title. We noticed that the summary should really
be a :text, and not a :string. We changed that, but this didn’t destroy
our data because Rails just converts the string column to a text column.

Note that these changes to the models don’t drop the tables and recreate
them: Rails should be smart enough to notice that the tables are already
there, and add/update/remove columns only.
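
Nothing like this exists in Rails today, but as a rough sketch, such an
attribute class method could be layered on the existing connection API
along these lines (the module and method names here are invented):

module SchemaInModel
  # record a declared attribute; a validation block could be evaluated
  # here as in the examples above
  def attribute(name, type, &block)
    (@declared_attributes ||= {})[name.to_s] = type
  end

  # compare the declared attributes against the live table and add any
  # missing columns (destructive changes deliberately left out)
  def sync_table!
    existing = connection.columns(table_name).map { |c| c.name }
    (@declared_attributes || {}).each do |name, type|
      connection.add_column(table_name, name, type) unless existing.include?(name)
    end
  end
end

A model would then extend SchemaInModel, declare its attributes, and a
rake task (or a boot-time hook) would call sync_table! to add whatever
columns are missing.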

So…is this a good idea?

This is how I’ve been doing it in my home-grown system (which I don’t
want to maintain any more, hence my migration to Rails). I use
method_missing to look up attribute classes so I can say:

class Post < ORM::Base
  att_string :title
  att_string :summary
  att_text :content
end

For any method call of the form “att_<classname>”, it looks up the
“classname” and verifies that it is an attribute type. It then adds the
column to the database if necessary. It is nice not to have to go to a
separate place (database shell or schema) to add tables and columns.
The system could be made to prompt you upon restart to verify the
changes to the database. Or the migration concept (which is nice) could
be expanded to the model classes.
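
As a hypothetical sketch of that att_* dispatch (module and method names
invented here), method_missing at the class level can route the calls:

module AttDeclarations
  ATTRIBUTE_TYPES = %w(string text integer boolean datetime)

  # any class-level call of the form att_<type> :name is treated as an
  # attribute declaration; everything else falls through to Ruby
  def method_missing(name, *args)
    if name.to_s =~ /\Aatt_(\w+)\z/ && ATTRIBUTE_TYPES.include?($1)
      declare_attribute(args.first, $1.to_sym)
    else
      super
    end
  end

  def declare_attribute(column_name, type)
    (@declared_attributes ||= {})[column_name] = type
    # the real system would add the column to the table here if missing
  end
end

class Post
  extend AttDeclarations

  att_string :title
  att_text   :content
end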

Let’s create a blog application, start with the model:

class Post < ActiveRecord::Base
  attribute :title, :string
  attribute :summary, :string
  attribute :content, :text
end

Note that these changes to the models don’t drop the tables and recreate
them: Rails should be smart enough to notice that the tables are already
there, and add/update/remove columns only.

So…is this a good idea?

I think this is a great idea… keep it up.

b

Merging the schema concept with migrations brings up a few points…

  1. If you start from an empty db and make a migration for everything,
    you end up with a forward delta model that creates your database. I’d
    rather work with a reverse delta model where the Rails model holds the
    current description. Adding the schema info to the Rails model files
    provides the current schema, and the migration down methods provide the
    reverse deltas.

  2. I don’t see any way for a Rails app to say I need version xxxx of a
    db migration before I can run.

  3. Having the current schema distributed in the Rails model files
    provides a way to partially test migration code. The test could start
    with an empty db, migrate it to the version required by the Rails
    app, and then validate the tables, columns, constraints, etc. against
    the description in the Rails model files.

  4. It may be possible to autogenerate migrations. For example, you edit
    the Rails model files to add/delete columns, constraints, tables, etc.,
    and bump the required db version number. A task could then start with
    an empty db, migrate it to the previous version number, and then
    compute the forward/reverse deltas needed for the migration function
    to make the schema agree with the one in the Rails model files (a rough
    sketch of that delta computation follows below).
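
A rough sketch of the delta computation point 4 describes, comparing the
columns declared in the model against the live table (declared_columns
is a hypothetical name => type hash built from the model declarations):

def column_deltas(table_name, declared_columns)
  existing = ActiveRecord::Base.connection.columns(table_name).map { |c| c.name }
  declared = declared_columns.keys.map { |k| k.to_s }

  {
    :add    => declared - existing,  # columns the forward delta must add
    :remove => existing - declared   # columns the reverse delta would restore
  }
end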


Jon S.
[email protected]

  1. I’m not sure if I understand #1 :) The model is always the most
    recent version of the schema. The database is updated according to the
    models.

  2. Your models are always the current version of the schema. So if the
    schema is generated from the models, your database will be like your
    models. If you release the next version, you’ll have new models, so the
    database will always be up to date.

  3. Ah, now I understand what you mean by
    “Merging the schema concept with migrations”.

You mean both schema in models and in migrations… that would violate
DRY :(

  4. Yes, you could generate migrations from models, and then the schema
    from migrations, but I think it is easier to generate the schema from
    models.

Jules

On 1/16/06, Jules [email protected] wrote:

  1. I’m not sure if I understand #1 :) The model is always the most
    recent version of the schema. The database is updated according to the
    models.

I think you are looking at my list as if the changes you had proposed
were already implemented. I wrote it from the current viewpoint where
the models don’t yet describe the schema.

I am agreeing with you that I would like to see the schema
incorporated into the models. The list describes what needs to be done
to make migrations play along with that change. You need to add both
proposals to Rails: your schema-in-the-model changes and these changes
to migrations.

  2. Your models are always the current version of the schema. So if the
    schema is generated from the models, your database will be like your
    models. If you release the next version, you’ll have new models, so the
    database will always be up to date.

As for #2, I don’t think there is any way for a Rails app to
declare “I require schema number xxx”. I could manually code it by
looking at the schema info table, but this should be part of the
framework. You don’t want to trigger migrations automatically: what
if someone accidentally starts an old version of the app on a
production database and down-migrates it, losing a bunch of info? The
app should just exit and complain that it doesn’t match the db
version. Some kind of special command needs to trigger the migration.
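
A sketch of that check, assuming the app defines a hypothetical
REQUIRED_SCHEMA_VERSION constant and migrations have stamped the usual
schema_info table:

REQUIRED_SCHEMA_VERSION = 14

current = ActiveRecord::Base.connection.select_value(
  "SELECT version FROM schema_info"
).to_i

unless current == REQUIRED_SCHEMA_VERSION
  raise "Schema is at version #{current}, but this app requires " \
        "#{REQUIRED_SCHEMA_VERSION}. Run the migration task explicitly."
end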

  3. Ah, now I understand what you mean by
    “Merging the schema concept with migrations”.

You mean both schema in models and in migrations… that would violate
DRY :(

Migration reverse deltas by nature violate DRY, but DRY is don’t
repeat ‘yourself’. Generating the deltas automatically doesn’t mean
that you are repeating yourself. The only time you need to fix the
delta up manually is when you are using a db feature that isn’t
supported by Rails. So I think schema in the model is consistent with
autogenerated migration deltas in the DRY world.

  4. Yes, you could generate migrations from models, and then the schema
    from migrations, but I think it is easier to generate the schema from
    models.

I am using the phrase ‘schema in the model’ to describe your proposed
additions to the model for describing the columns/constraints/etc.
That is independent of using the data in the model to generate a SQL
schema.


Jon S.
[email protected]

Jules wrote:

> But it happens that I don't know the attributes for the model anymore.
> …Now I start my database manager and look at the tables. This slows me down.

If you uncomment the line
config.active_record.schema_format = :ruby
in ‘environment.rb’, Rails will store the info in a neatly sorted giant
migration called

   'schema.rb'
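
The generated schema.rb is roughly of this shape (version number and
columns here are just illustrative):

ActiveRecord::Schema.define(:version => 7) do
  create_table "posts" do |t|
    t.column "title",      :string
    t.column "body",       :text
    t.column "created_at", :datetime
  end
end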

Alain

At 1/18/2006 08:49 AM, you wrote:

  'schema.rb'

Alain

or just: rake db_schema_dump

Rob

> or just:   rake db_schema_dump

It’s not the same:

  • your tip is a one-shot, while
  • uncommenting the config line in environment.rb will make sure the
    ‘schema.rb’ is updated each time you run a migration.

Alain

>> If you uncomment the line
>>     config.active_record.schema_format = :ruby
>> in 'environment.rb', Rails will store the info in a neatly sorted
>> giant migration called
>>
>>       'schema.rb'


> or just:   rake db_schema_dump

I think that schema in model is a very good idea. It can make
development much faster.

I agree. It’s often bugged me that I have to define many things
in two places.

I used to think the solution was to have more of the model
relationships and constraints automatically derived from the
database, but since I have to create the DB schema someplace
it might as well be done in the model itself and save me
a step.

That is just beautiful.

New validations for the title. We noticed that the summary should
really be a :text, and not a :string. We changed that, but this didn’t
destroy our data because Rails just converts the string column to a
text column.

Another thing you could do is decide the column types based
on the attribute’s validation rules. For example, you wouldn’t
necessarily need to define something as :text; rather,
you could say:

attribute :description, :string do |t|
  t.validate_length :in => 1..999
end

The underlying database would use the most appropriate field
to store a string that contains up to 999 characters.

Note that these changes to the models don’t drop the tables and
recreate them: Rails should be smart enough to notice that the tables
are already there, and add/update/remove columns only.

As others have mentioned, doing a migration automatically may
result in a loss of data, such as in cases where an older version
of the application is used. I wonder if it is possible to
automatically migrate as long as you can determine that all the
changes are non-destructive?

Losing a column or a relationship would be destructive of course,
but changing a column’s type may not be if the two types are
related.

For example, I think it would be OK for a migration to happen
automatically when I change an attribute from a CHAR(100) to a
TEXT, like this:

attribute :description, :string do |t|
  t.validate_length :in => 1..100 # uses CHAR(100)
end

To:

attribute :description, :string do |t|
  t.validate_length :in => 1..999 # uses TEXT
end

Expanding a column like this shouldn’t result in any loss of
data, should it? You could even go smaller, like from a TEXT to a
CHAR(100) if you did a “SELECT MAX(LENGTH(description)) FROM table”
to make sure none of the values was longer than 100 characters.
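
That safety check might look like this as a sketch (table and column
names are illustrative; change_column is the standard migration method):

max_length = ActiveRecord::Base.connection.select_value(
  "SELECT MAX(LENGTH(description)) FROM posts"
).to_i

if max_length <= 100
  ActiveRecord::Base.connection.change_column :posts, :description, :string, :limit => 100
else
  raise "Cannot shrink posts.description: longest value is #{max_length} characters"
end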

Thanks,

Dan


Dan K. Email: [email protected]
Autopilot Marketing Inc. Phone: 1 (604) 820-0212
Web: http://www.autopilotmarketing.com


On 1/18/06, Dan K. [email protected] wrote:

As others have mentioned, doing a migration automatically may
result in a loss of data, such as in cases where an older version
of the application is used. I wonder if it is possible to
automatically migrate as long as you can determine that all the
changes are non-destructive?

My scheme would be for the app to declare the schema version it
needs. Then, if the db doesn’t match the schema, throw an exception.
Uncaught, the app will exit on the exception. You could also
catch it and then decide whether automigration is allowable. For example,
you might only want to allow forward automigrations and not backward
ones.

In my case I would leave the exception uncaught and use rake to
perform the migration.


Jon S.
[email protected]

I think that schema in model is a very good idea. It can make
development much faster.

The model and the schema overlap in so many ways. I think it really
does make sense to try to bring them together so that I don’t have to
repeat myself so much.

Whether it’s in migrations or in the DB schema, I have to specify all
the columns as well as their lengths and unique constraints to have
the database created. Then I have to specify the validates_* rules
in the model to match. Finally I have to keep both of them in sync
over time.

Why not do it all inside the model and have that set up all the
validates_* rules in a single step?

This is beautiful.

New validations for the title. We noticed that the summary should
really be a :text, and not a :string. We changed that, but this didn’t
destroy our data because Rails just converts the string column to a
text column.

Why bother to specify :text vs. :string at all? When you use the
validate_length method and the maximum exceeds 255 characters, the
column type should change from CHAR to TEXT, e.g.:

attribute :summary, :string do |t|
  t.validate_length :in => 1..100 # column is CHAR(100)
end

Is changed to:

attribute :summary, :string do |t|
  t.validate_length :in => 1..999 # column is TEXT
end
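
A hypothetical helper for that mapping (the 255-character cutoff mirrors
the CHAR/TEXT split above; the method name is invented):

# pick a column type from the declared maximum length
def column_type_for_length(max_length)
  max_length <= 255 ? [:string, { :limit => max_length }] : [:text, {}]
end

column_type_for_length(100) # => [:string, { :limit => 100 }]
column_type_for_length(999) # => [:text, {}]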

Thanks,

Dan


Dan K. Email: [email protected]
Autopilot Marketing Inc. Phone: 1 (604) 820-0212
Web: http://www.autopilotmarketing.com