Tool to compare DB schema against DDL file

First of all any feedback from you guys will be of tremendous help.

I’m looking for a good topic for my thesis (to major in Computer
Engineering) and at the same time I’m encountering a problem in my
current job. Simply stated:

I have a DDL file that describes the entire Data Model of an
application.

We use 6 different environments (including production) with a DB that
must (eventually) match the schema proscribed by the DDL.

We delegate the execution of the DDL to a DBA; I work in a highly
bureaucratic environment so I can’t just go ahead and run DDL myself.
Additionally the app itself is currently running Java, though I want to
use Ruby to generate the tool I need right now.

Such a tool would parse the DDL as an external DSL and then compare it
with the current DB schema, to then generate a series of DDL statements
that would synchronize the two, a sort of DDL diff, if you will.

I’ve done my share of research, looking for a similar tool to avoid
reinventing the wheel. However, I could not reach the author of
db_discovery (http://rubyforge.org/projects/db-discovery/), and
otherwise I found no other author. It seems like db_discovery did not
find a niche even though it seems to be a very good tool.

Does anyone know if such a tool already exists and is actively
supported??? As a newbie Rubyist I’m trying to exploit the power of this
community by querying you guys to make sure such a tool would actually
be novel.

As for its implementation I have a couple of ideas (e.g. parse DDL as an
external DSL using a multiple dispatch library, use ActiveRecord to
reflect on a database’s schema). However, I want to make sure that such
a tool would actually benefit the community.

On 28.07.2010 03:57, Fabian M. wrote:

must (eventually) match the schema proscribed by the DDL.
I’ve done my share of research, looking for a similar tool to avoid
reinventing the wheel. However, I could not reach the author of
db_discovery (http://rubyforge.org/projects/db-discovery/), and
otherwise I found no other author. It seems like db_discovery did not
find a niche even though it seems to be a very good tool.

Does anyone know if such a tool already exists and is actively
supported??? As a newbie Rubyist I’m trying to exploit the power of this
community by querying you guys to make sure such a tool would actually
be novel.

It wouldn’t. This is a solved problem already, for example Sybase has
an excellent product which has sophisticated schema comparison
functionality with a good graphical UI:
http://www.sybase.com/products/modelingdevelopment/powerdesigner

As for its implementation I have a couple of ideas (e.g. parse DDL as an
external DSL using a multiple dispatch library, use ActiveRecord to
reflect on a database’s schema). However, I want to make sure that such
a tool would actually benefit the community.

I would create these tools:

  1. schema extractor into a standard format which you define (could be an
    XML representation or just marshalled object model).

  2. diff for the standard format schema (two way, maybe also three way)

  3. maybe also a converter from a given DDL to the standard format so you
    do not have to go through the DB although this might be fragile since
    the actual DB’s settings may influence how the schema eventually looks.

Kind regards

robert

Robert K. wrote:

On 28.07.2010 03:57, Fabian M. wrote:

must (eventually) match the schema proscribed by the DDL.
I’ve done my share of research, looking for a similar tool to avoid
reinventing the wheel. However, I could not reach the author of
db_discovery (http://rubyforge.org/projects/db-discovery/), and
otherwise I found no other author. It seems like db_discovery did not
find a niche even though it seems to be a very good tool.

Does anyone know if such a tool already exists and is actively
supported??? As a newbie Rubyist I’m trying to exploit the power of this
community by querying you guys to make sure such a tool would actually
be novel.

It wouldn’t. This is a solved problem already, for example Sybase has
an excellent product which has sophisticated schema comparison
functionality with a good graphical UI:
http://www.sybase.com/products/modelingdevelopment/powerdesigner

Is Sybase open source? I’m disregarding proprietary, non open source
solutions.

As for its implementation I have a couple of ideas (e.g. parse DDL as an
external DSL using a multiple dispatch library, use ActiveRecord to
reflect on a database’s schema). However, I want to make sure that such
a tool would actually benefit the community.

I would create these tools:

  1. schema extractor into a standard format which you define (could be an
    XML representation or just marshalled object model).

  2. diff for the standard format schema (two way, maybe also three way)

  3. maybe also a converter from a given DDL to the standard format so you
    do not have to go through the DB although this might be fragile since
    the actual DB’s settings may influence how the schema eventually looks.

Kind regards

robert

How about this option?

Use RoR Active Record to create a model from a DDL file, and also a
comparable model from a DB, then generate a module that can compare
equivalent Entity/Table classes from both?

On 28.07.2010 16:50, Fabian M. wrote:

supported??? As a newbie Rubyist I’m trying to exploit the power of this
community by querying you guys to make sure such a tool would actually
be novel.

It wouldn’t. This is a solved problem already, for example Sybase has
an excellent product which has sophisticated schema comparison
functionality with a good graphical UI:
http://www.sybase.com/products/modelingdevelopment/powerdesigner

Is Sybase open source?

No.

I’m disregarding proprietary, non open source
solutions.

Well, you didn’t say so initially if I’m not mistaken.

  1. diff for the standard format schema (two way, maybe also three way)

  2. maybe also a converter from a given DDL to the standard format so you
    do not have to go through the DB although this might be fragile since
    the actual DB’s settings may influence how the schema eventually looks.

How about this option?

Use RoR Active Record to create a model from a DDL file, and also a
comparable model from a DB, then generate a module that can compare
equivalent Entity/Table classes from both?

The key question is: can an AR model represent a DB schema entirely?
What does it tell you about constraints and triggers? Does it cover all
sorts of special indexes as Oracle’s FBI’s and special types found only
in one RDBMS?

Kind regards

robert

Robert K. wrote:

The key question is: can an AR model represent a DB schema entirely?
What does it tell you about constraints and triggers? Does it cover all
sorts of special indexes as Oracle’s FBI’s and special types found only
in one RDBMS?

Kind regards

robert

The answer might be that it cannot, it’s a matter of where you put your
logic. RoR relies on logic in the AR model even for testing constraints
like uniqueness, as far as I know. They themselves admit RoR is
opinionated in many respects. I guess such a tool won’t be useful for
developers that have opted for an architecture strongly reliant on the
Database (i.e. using triggers, stored procedures, sequences and so
forth).

And I do realize I did not state I was looking for open source
alternatives. I’ve looked in Rubyforge, and the ruby gems repository,
and have not found much.

On Wed, Jul 28, 2010 at 10:50 AM, Fabian M. [email protected]
wrote:

community by querying you guys to make sure such a tool would actually
be novel.

How about this option?

Use RoR Active Record to create a model from a DDL file, and also a
comparable model from a DB, then generate a module that can compare
equivalent Entity/Table classes from both?

RoR ActiveRecord can dump a ruby representation of the database schema
(db:schema:dump) and another rake task (db:structure:dump) that dumps
the database schema as SQL CREATE TABLE statements.

I have a (really, extremely very ugly) script to “diff” ruby-schema
files:

ruby schemadiff.rb schema.production schema.development
schema.production has 22 tables
schema.development has 24 tables
Tables removed:
resources
Tables added:
faq
page_translations
surveys
Table answers:
Columns added:
created_at datetime
updated_at datetime
Table inquiries:
Columns added:
spam boolean
Table question_groups:
Columns added:
custom_class string
custom_renderer string
Table questions:
Columns removed:
is_mandatory integer
Columns added:
is_mandatory boolean
Table resources:
Columns removed:
created_at datetime
updated_at datetime
parent_id integer
size integer
content_type string
filename string

cat schemadiff.rb
def read_schema(filename)
{}.tap{|h|
File.read(filename).scan(/^\W*(create_table\W*“(.?)".?$.?^\Wend)$/m).each{|s|
h[s[1]] = s[0].scan(/^\Wt.(\w+)\W”(\w+)".*?$/m)}}
end

cur = read_schema(ARGV[0])
new = read_schema(ARGV[1])

puts ARGV[0] + “\thas #{cur.keys.size} tables”
puts ARGV[1] + “\thas #{new.keys.size} tables”

puts “Tables removed:\n\t” + (cur.keys.sort -
new.keys.sort).join(“\n\t”)
puts “Tables added:\n\t” + (new.keys.sort -
cur.keys.sort).join(“\n\t”)

cur.keys.sort.each do |t|
rem = (cur[t].sort - (new[t] || []).sort).map{|a| “#{a[1]}\t#{a[0]}”}
add = ((new[t] || []).sort - cur[t].sort).map{|a| “#{a[1]}\t#{a[0]}”}
next if rem == add
puts “Table #{t}:”
puts “\tColumns removed:\n\t\t” + rem.join(“\n\t\t”) unless rem == []
puts “\tColumns added:\n\t\t” + add.join(“\n\t\t”) unless add == []
end

unknown wrote:

RoR ActiveRecord can dump a ruby representation of the database schema
(db:schema:dump) and another rake task (db:structure:dump) that dumps
the database schema as SQL CREATE TABLE statements.

I have a (really, extremely very ugly) script to “diff” ruby-schema
files:

cat schemadiff.rb
def read_schema(filename)
{}.tap{|h|
File.read(filename).scan(/^\W*(create_table\W*"(.?)".?$.?^\Wend)$/m).each{|s|
h[s[1]] = s[0].scan(/^\Wt.(\w+)\W"(\w+)".*?$/m)}}
end

cur = read_schema(ARGV[0])
new = read_schema(ARGV[1])

puts ARGV[0] + “\thas #{cur.keys.size} tables”
puts ARGV[1] + “\thas #{new.keys.size} tables”

puts “Tables removed:\n\t” + (cur.keys.sort -
new.keys.sort).join("\n\t")
puts “Tables added:\n\t” + (new.keys.sort -
cur.keys.sort).join("\n\t")

cur.keys.sort.each do |t|
rem = (cur[t].sort - (new[t] || []).sort).map{|a| “#{a[1]}\t#{a[0]}”}
add = ((new[t] || []).sort - cur[t].sort).map{|a| “#{a[1]}\t#{a[0]}”}
next if rem == add
puts “Table #{t}:”
puts “\tColumns removed:\n\t\t” + rem.join("\n\t\t") unless rem == []
puts “\tColumns added:\n\t\t” + add.join("\n\t\t") unless add == []
end

Well I’ll grant you one thing, it sure ain’t pretty but it gets the job
done, at least with simple requirements. It even seems surprising that
it is fairly short. Then again, have you ever tried parsing DDL files
to generate a similar diff? I would think even mixing (a ruby schema
against a DDL file) would be possible.

Did you code this for fun or did your job demand it?

On Wed, Jul 28, 2010 at 9:48 PM, Fabian M. [email protected] wrote:

Well I’ll grant you one thing, it sure ain’t pretty but it gets the job
done, at least with simple requirements. It even seems surprising that
it is fairly short. Then again, have you ever tried parsing DDL files
to generate a similar diff? I would think even mixing (a ruby schema
against a DDL file) would be possible.

I’ve not tried it; it should work, but I’m more comfortable reading
Ruby than SQL :slight_smile:

Did you code this for fun or did your job demand it?

Mostly fun :slight_smile: one recent case where it’s handy is with an internal
Rails app I have that hits an Oracle database for an external PHP app;
for development, I originally did a schema:dump from Oracle and a
schema:load into H2. The script provides a quick way to get an idea
of what they’ve changed in the database.

At present, I’m just reading the existing data, so losing
database-side integrity in the rails schema.rb doesn’t cause much
trouble.

On 07/29/2010 03:43 AM, Fabian M. wrote:

opinionated in many respects. I guess such a tool won’t be useful for
developers that have opted for an architecture strongly reliant on the
Database (i.e. using triggers, stored procedures, sequences and so
forth).

But what are you looking for? The subject says you want to compare a
schema and a DDL file now it seems you need to compare a schema and an
AR model. These are two quite different sets of requirements.

Cheers

robert

Robert K. wrote:

I would create these tools:

  1. schema extractor into a standard format which you define (could be an
    XML representation or just marshalled object model).

  2. diff for the standard format schema (two way, maybe also three way)

  3. maybe also a converter from a given DDL to the standard format so you
    do not have to go through the DB although this might be fragile since
    the actual DB’s settings may influence how the schema eventually looks.

Kind regards

robert

Robert, wouldn’t an AR model fit the bill for the marshalled object
model in item #1 in your list? We’re thinking similarly here. As long
as you can transform information into a common intermediate format, you
will be able to control it and digest it in useful ways. That info can
come from a DB schema, a DDL file or any other form

I have a DDL file that describes the entire Data Model of an
application.

Is this just for oracle then?
I wrote one once for mysql :slight_smile:
-r

Robert K. wrote:

But what are you looking for? The subject says you want to compare a
schema and a DDL file now it seems you need to compare a schema and an
AR model. These are two quite different sets of requirements.

Cheers

robert

I ultimately need to compare a DB (extracting data to a DB custom tool
or from the system catalogs) and a DDL file. However it might be easier
if one or both of those forms are translated to an intermediate, more
easily comparable form such as … an AR model! The idea is basically
to find an intermediate form that allows easy comparison of the two: a
DB’s current schema, and a DDL file. Translating information to this
intermediate form would also allow maximum flexibility (e.g. how about
comparing two DDL files, two databases, a ruby schema and a DDL file).

Roger P. wrote:

I have a DDL file that describes the entire Data Model of an
application.

Is this just for oracle then?
I wrote one once for mysql :slight_smile:
-r

Actually this would be for DB2. I’m not particularly fond of it but
I’ll let you guess why in my project we have no choice in what DBMS
vendor to go with. I want it to at least work for DB2; whether expanding
to other vendors would benefit anyone would depend on whether there is
no open source tool for the job already.

Roger, what was it that you wrote once for mysql? Could you comment a
bit more on it?

Ah mysql, how I miss thee …

Roger, what was it that you wrote once for mysql? Could you comment a
bit more on it?

http://rubyforge.org/projects/table-syncer/
The only thing I’ve heard about db2 have been from antonio’s blog:

GL.
-r