All,
While attempting to use ActiveRecord against a large database (approximately
1,700 tables and 4 million rows), I discovered an inefficiency in
oci_adapter.rb. The problem seems to be the SQL generated to lazy-load the
column metadata. The "def columns(table_name, name = nil)" method produces
the following SQL:
  select column_name, data_type, data_default, nullable,
         decode(data_type, 'NUMBER',   data_precision,
                           'VARCHAR2', data_length,
                           null) as length,
         decode(data_type, 'NUMBER', data_scale, null) as scale
  from user_catalog cat, user_synonyms syn, all_tab_columns col
  where cat.table_name = '#{NAME_OF_TABLE}'
    and syn.synonym_name (+)= cat.table_name
    and col.table_name = nvl(syn.table_name, cat.table_name)
    and col.owner = nvl(syn.table_owner, user)
At first glance the query looks fine: you would expect Oracle to apply the
WHERE clause to the FROM tables first and then evaluate the DECODEs on the
filtered results. Unfortunately, that's not what happens; the DECODEs make
the query very expensive. In my case (Oracle 10g on a Sun V880 server), the
query takes 35 seconds with the DECODEs and less than a second without them.
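If you want to reproduce the comparison, something along these lines should
do it (a rough sketch: "conn" and the two SQL strings are placeholders you'd
fill in with your own connection and the two variants of the query above):

  require 'benchmark'

  conn = ActiveRecord::Base.connection
  sql_with_decode    = "..."  # the original query above, DECODEs included
  sql_without_decode = "..."  # same query, selecting data_precision,
                              # data_length, and data_scale directly

  with_time    = Benchmark.realtime { conn.select_all(sql_with_decode) }
  without_time = Benchmark.realtime { conn.select_all(sql_without_decode) }
  puts "with decode: #{with_time}s, without decode: #{without_time}s"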
I patched the columns method to do the "decode" in Ruby instead:
  def columns(table_name, name = nil) #:nodoc:
    table_name = table_name.to_s.upcase
    # Support owner-qualified names such as "SCOTT.EMPLOYEES".
    owner = table_name.include?('.') ? "'#{table_name.split('.').first}'" : "user"
    table = "'#{table_name.split('.').last}'"
    scope = (owner == "user" ? "user" : "all")

    # Plain projection: no DECODE; the equivalent work happens in Ruby below.
    table_cols = %Q{
      select column_name, data_type, data_default, nullable,
             data_precision, data_length, data_scale
      from #{scope}_catalog cat, #{scope}_synonyms syn, all_tab_columns col
      where cat.table_name = #{table}
        and syn.synonym_name (+)= cat.table_name
        and col.table_name = nvl(syn.table_name, cat.table_name)
        and col.owner = nvl(syn.table_owner, #{scope == "all" ? "cat.owner" : "user"})}
    if scope == "all"
      table_cols << %Q{
        and cat.owner = #{owner}
        and syn.owner (+)= cat.owner}
    end

    select_all(table_cols, name).map do |row|
      # Use the declared precision when the column has one (e.g. NUMBER(8,2));
      # otherwise fall back to data_length. This replaces the DECODE on length.
      length = row['data_precision'] || row['data_length']
      # Strip the quotes Oracle wraps around string defaults.
      row['data_default'].sub!(/^'(.*)'\s*$/, '\1') if row['data_default']
      OCIColumn.new(
        oci_downcase(row['column_name']),
        row['data_default'],
        row['data_type'],
        length,
        row['data_scale'],  # the query no longer aliases this to "scale"
        row['nullable'] == 'Y'
      )
    end
  end
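For what it's worth, the owner handling means both of these forms work (the
table names here are just hypothetical examples):

  conn = ActiveRecord::Base.connection
  conn.columns('EMPLOYEES')        # unqualified: queries user_catalog/user_synonyms
  conn.columns('SCOTT.EMPLOYEES')  # owner-qualified: queries all_catalog/all_synonyms,
                                   # filtered by owner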
Also, I didn't understand why only VARCHAR2 lengths were being picked up
from the data_length column. What about VARCHAR, CHAR, CLOB, etc.? I changed
the length value to apply to all column types that don't specify a
precision.
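To make the new length behavior concrete, here's what the
precision-or-length fallback produces for a few made-up rows (hypothetical
values, not from a real schema):

  rows = [
    { 'data_type' => 'NUMBER',   'data_precision' => 8,   'data_length' => 22, 'data_scale' => 2 },
    { 'data_type' => 'VARCHAR2', 'data_precision' => nil, 'data_length' => 40, 'data_scale' => nil },
    { 'data_type' => 'CHAR',     'data_precision' => nil, 'data_length' => 10, 'data_scale' => nil },
  ]
  rows.each do |row|
    length = row['data_precision'] || row['data_length']
    puts "#{row['data_type']}: length=#{length}, scale=#{row['data_scale'].inspect}"
  end
  # NUMBER: length=8, scale=2
  # VARCHAR2: length=40, scale=nil
  # CHAR: length=10, scale=nil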
Am I missing something? Let me know what you think about this patch and
whether or not it should be proposed.
Thanks,
- Zak