Forum: Rails-core (closed, excessive spam) [PATCH] Support for DB Clusters/Replication in ActiveRecord

Posted by Stephen (Guest)
on 2006-07-26 05:03
(Received via mailing list)
Hello,

Due to the Trac situation, I am forwarding this patch to this list in
the hope of eliciting some feedback.  I will be sure to open a proper
ticket when possible.

I have done extensive work on ActiveRecord to allow databases to be
defined in terms of "connection pools" rather than a single database.
Most of the work is in connection_specification.rb and is done at that
level so that it is compatible with any backend you choose.  The patch
also maintains full compatibility with the current paradigm, so no
changes are necessary to existing applications or documentation.

Defining a database pool is easy and follows a common-sense
convention.  I will assume you are running with RAILS_ENV=development
for this example.  You first define your connections as you always have
in database.yml.  You are then able to define
  development_read_pool: db1, db2, db3
  development_write_pool: db3, db4, db5  (In general, the name is
RAILS_ENV_write_pool, so you can test your clusters in development and
production with no config changes)

where each is a set of connection names that you have already defined.
ActiveRecord::Base.connection will then return connections from the
appropriate pool, using round-robin when more than one connection is
available to it.  This is handy if, for example, you have a high-traffic
website and want to load balance reads over several slave servers while
consistently writing to the one master server.  The syntax for
.connection is as follows:

ExampleModel.connection          # Default "compatibility behavior": always returns a write connection
ExampleModel.connection(:read)   # Return a connection from the read pool
ExampleModel.connection(:write)  # Return a connection from the write pool
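To make the pool configuration described earlier concrete, here is a rough sketch of how a comma-delimited pool entry could be resolved into the connections it names. This is not code from the patch: the helper name pool_specs, the host names, and the adapter value are all hypothetical, and the YAML is inlined only so the example runs on its own.

```ruby
require "yaml"

# Hypothetical database.yml contents; hosts and adapter are placeholders.
CONFIG_YAML = <<~YAML
  db1: { adapter: mysql, host: replica1.example.com }
  db2: { adapter: mysql, host: replica2.example.com }
  db3: { adapter: mysql, host: master.example.com }
  development_read_pool: db1, db2, db3
  development_write_pool: db3
YAML

# Resolve "<env>_<mode>_pool" into the connection settings it names.
# pool_specs is an illustrative helper, not part of ActiveRecord.
def pool_specs(config, env, mode)
  names = config.fetch("#{env}_#{mode}_pool", "").split(",").map(&:strip)
  names.map do |name|
    # Every name in a pool must already be defined as a connection.
    config.fetch(name) { raise ArgumentError, "undefined connection: #{name}" }
  end
end

config = YAML.load(CONFIG_YAML)
read_specs  = pool_specs(config, "development", :read)
write_specs = pool_specs(config, "development", :write)
```

Keying the lookup on the environment name is what lets the same code pick up a different cluster layout in development and production.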

I have also changed a few of the functions in base.rb to use the
appropriate connection pool (for example, find_by_sql calls
connection(:read)).  This again makes the patch seamless to end
applications while allowing them to use the new functionality.
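As a minimal sketch of the behavior described above (not the patch itself), the following shows a connection method that defaults to the write pool, rotates through a pool round-robin, and a find_by_sql that routes through the read pool. PooledModel, FakeConnection, and the pool contents are hypothetical stand-ins; no SQL is actually executed.

```ruby
# Stand-in for a database connection; a real one would be an adapter object.
FakeConnection = Struct.new(:name)

class PooledModel
  # Hypothetical pools; the patch builds these from database.yml.
  @pools = {
    read:  %w[db1 db2 db3].map { |n| FakeConnection.new(n) },
    write: [FakeConnection.new("db3")]
  }
  # Index of the last connection handed out per pool (-1 means none yet).
  @last_used = { read: -1, write: -1 }

  class << self
    # Defaults to :write, mirroring the backward-compatible behavior.
    def connection(mode = :write)
      pool = @pools.fetch(mode)
      @last_used[mode] = (@last_used[mode] + 1) % pool.size
      pool[@last_used[mode]]
    end

    # Reads go through the read pool; this sketch only reports which
    # connection would have been used.
    def find_by_sql(sql)
      "#{connection(:read).name}: #{sql}"
    end
  end
end
```

Existing callers that use PooledModel.connection with no argument keep getting a write connection, which is the compatibility property the patch relies on.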

A patch against the latest code in Subversion is attached.  Some rough
notes on the implementation are also included below.  They are not 100%
complete but give a good idea of what was changed.

The patch currently passes the ActiveRecord tests run via rake.

Implementation:

1.  Added two class variables to act as pools for connections related to
this connection.

   @@read_connection_pool = {}
   @@write_connection_pool = {}

   Each is a hash whose values are later defined as arrays, such that
   @@read_connection_pool[name] represents all of the read connections
   available to the current class.  Here name is
   @active_connection_name, which allows us to stay fully backward
   compatible with the old methodology.


2.  Define two variables to track the index of the last used connection.
These are used when doing round-robin.

    @@last_read_connection = 0
    @@last_write_connection = 0

3.  Define a function for appending connections to pools
(append_spec_to_connection_pools spec).
    If the config has an attribute read_only == true, the connection is
    entered only into the read_connection_pool, and vice versa for the
    write_connection_pool.  @@defined_connections is also set at the
    same time.

    @@write_connection_pool[spec.object_id] always equals
    @@defined_connections[spec.object_id]


5.  Define a function establish_connection_pools which looks for two
variables to be set in the configuration:

    RAILS_ENV_read_connection_pool
    RAILS_ENV_write_connection_pool

    Normally in Rails, these would be set up in the YAML file.  Each
    variable is a comma-delimited list of connections to use, one for
    read and the other for write.


6.  Define a function clear_connection_pool which clears the connection
pools.

7.  Modify establish_connection
    a.  When passed nil, call establish_connection_pools(RAILS_ENV)
    b.  When passed a ConnectionSpecification, clear all connection
        pools as well as the active_connection_name

8.  Move code from establish_connection to the ConnectionSpecification
constructor so an object can be made
    out of any spec that's passed to it.  (This avoids violating the DRY
    principle.)


9.  Remove the setting of @@defined_connections[name] in
establish_connection.  This is now done by
    calling


10.  Define two functions, round robin read and round robin write, which
     return AbstractAdapters from the various pools.

11.  Modify retrieve_connection so that it takes the id of an object that
should be in @@defined_connections.
     Remove its dependency on connection=, as this will break things.

12.  Modify connection= to call establish_connection on
ConnectionSpecification.

13.  Modify self.remove_connection to call clear_connection_pool

14.  Modify active_connection_name to check if a pool exists instead of
defined_connections
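
The pool bookkeeping from steps 1, 3, and 6 can be sketched roughly as follows. This is an illustration under stated assumptions, not the attached patch: PoolRegistry and Spec are hypothetical names, the pools are instance state rather than class variables, and a spec that is not flagged read_only is assumed to be usable for both reads and writes (the notes above do not spell that case out).

```ruby
# Hypothetical stand-in for ConnectionSpecification: a name plus its config.
Spec = Struct.new(:name, :config)

class PoolRegistry
  attr_reader :read_pool, :write_pool

  def initialize
    # Hashes keyed by connection name; each value is an array of specs.
    @read_pool  = Hash.new { |hash, key| hash[key] = [] }
    @write_pool = Hash.new { |hash, key| hash[key] = [] }
  end

  # Add a spec to the pools stored under key.
  # Assumption: read_only specs go to the read pool only; any other spec
  # is added to both pools.
  def append_spec(key, spec)
    @read_pool[key] << spec
    @write_pool[key] << spec unless spec.config[:read_only] == true
  end

  # Drop every pooled spec stored under key.
  def clear(key)
    @read_pool.delete(key)
    @write_pool.delete(key)
  end
end

registry = PoolRegistry.new
registry.append_spec("Base", Spec.new("db1", { read_only: true }))
registry.append_spec("Base", Spec.new("db3", {}))
```

With this layout a read-only replica can never be handed out for a write, because it is never placed in the write pool in the first place.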

The attached patch is against what's in the Subversion repo.

Best Regards,
Stephen Blackstone