Ruby Forum Ruby on Rails > Association Field Find Caching Technique

Posted by Steve Martocci (smartocci)
on 21.04.2008 19:14
Hi everyone,

I often search my association collections for specific field values.
I could call foo.bars.find_all_by_fieldname(fieldvalue) each time, but
why hit the database again, especially when the full collection is
already in memory?  I wrote this module to extend associations so that
they search the already loaded collection for field/value matches.


module MyAssociationExtentions

  def field_find(field, value, opts = {})
    @field_cache = nil if opts['reload']
    ((@field_cache ||= {field => {}})[field] ||= {})[value] ||=
      self.is_a?(Enumerable) ? (self.select { |task| task.send(field) == value }) : (self if self.send(field) == value)
  end

end


Here it is again, written out over multiple lines so it is easier to
read in the forum:

module MyAssociationExtentions
  def field_find(field, value, opts = {})
    # Drop the whole cache when the caller asks for a reload.
    @field_cache = nil if opts['reload']
    @field_cache ||= {}
    # Start with an empty per-field hash so the first lookup is computed
    # (seeding it with value => [] would cache an empty result).
    @field_cache[field] ||= {}
    @field_cache[field][value] ||=
      if self.is_a?(Enumerable)
        self.select { |task| task.send(field) == value }
      else
        (self if self.send(field) == value)
      end
  end
end

Questions:
1) I use a multi-dimensional hash to store each potential field/value
lookup.  Is this too memory intensive?
2) Does this even theoretically improve performance vs. the database, or
is it a waste of time?
3) Is there a better way to write that line (all those annoying checks
to see if the hash is already there)?
4) Could I push this into memcache to lower the memory usage by
distributing it across mongrels?
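
On question 3, one possible way to avoid the nested existence checks is
a Hash with a default block.  The sketch below is plain Ruby with no
Rails: Task and FieldFindSketch are hypothetical names for illustration,
and an Array stands in for the already loaded association.

```ruby
# Hypothetical stand-in for a model class; not part of the original module.
Task = Struct.new(:name, :status)

module FieldFindSketch
  # Returns the matches for field/value, computing them only on first use.
  def field_find(field, value, opts = {})
    @field_cache = nil if opts['reload']
    # The default block creates the per-field hash on first access,
    # so no explicit "is the inner hash there yet?" check is needed.
    @field_cache ||= Hash.new { |cache, key| cache[key] = {} }
    @field_cache[field][value] ||= select { |item| item.send(field) == value }
  end
end

tasks = [Task.new("a", :open), Task.new("b", :done), Task.new("c", :open)]
tasks.extend(FieldFindSketch)

open_tasks  = tasks.field_find(:status, :open)   # scans the array
cached_tasks = tasks.field_find(:status, :open)  # served from the hash
fresh_tasks = tasks.field_find(:status, :open, 'reload' => true)
```

Because select always returns an Array (which is truthy even when
empty), the ||= caches empty results as well.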

Thanks
Posted by Daya Sharma (Guest)
on 22.04.2008 03:02
(Received via mailing list)
Steve,

Have you tried benchmarking your solution? That should answer the
question of whether it brings any performance gain. Generally
speaking, anything held in physical memory can be accessed much
faster than anything that requires an IO operation.

Also, in your solution, when do you invalidate the @field_cache and
re-read the field values from the database?
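
To make the invalidation question concrete, here is a minimal sketch of
one option: clearing the cache whenever the collection itself changes.
It is plain Ruby, and CachedCollection, Task and invalidate! are
hypothetical names, not part of Rails or of the module above.

```ruby
# Hypothetical stand-in for a model class.
Task = Struct.new(:name, :status)

# Hypothetical wrapper that owns both the records and the lookup cache.
class CachedCollection
  def initialize(items)
    @items = items.dup
    @field_cache = {}
  end

  # Memoized lookup, same idea as field_find.
  def field_find(field, value)
    (@field_cache[field] ||= {})[value] ||=
      @items.select { |item| item.send(field) == value }
  end

  # Adding a record changes what a lookup should return, so drop the cache.
  def <<(item)
    @items << item
    invalidate!
    self
  end

  def invalidate!
    @field_cache = {}
  end
end

collection = CachedCollection.new([Task.new("a", :open)])
before = collection.field_find(:status, :open).size
collection << Task.new("b", :open)
after = collection.field_find(:status, :open).size
```

Note that this only catches additions made through the wrapper; changing
an attribute on a record that is already in the collection would still
leave a stale entry until invalidate! is called.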

Regards,
-daya

Posted by Steve Martocci (smartocci)
on 23.04.2008 16:47
Hi Daya,

Don't know how I missed out on require 'benchmark', but I did some
testing with it, and the cached lookup is much faster for finding by
field.

The first time it runs, it performs about the same as a find_by because
the collection hasn't been loaded yet; once the collection is in memory
it is lightning fast.  I have added a reload flag that skips the
@field_cache in case dynamic data is being used.

Once the collection is loaded, finders should not hit the DB anymore;
they are too expensive.  Let me know if you see any holes in this. -Steve

Here are some benchmarks

setup
 foo = Foo.find(:first)

Hitting the DB on each lookup
Benchmark.bm { |x| x.report { 5000.times { foo.bars.find_all_by_some_field(1) } } }
      user     system      total        real
 14.260000   3.030000  17.290000 ( 18.280076)

Without Field Caching

Benchmark.bm { |x| x.report { 5000.times { foo.bars.field_finder("some_field", 1, true) } } }
      user     system      total        real
  0.210000   0.070000   0.280000 (  0.269146)

With Field Caching

Benchmark.bm { |x| x.report { 5000.times { foo.bars.field_finder("some_field", 1, false) } } }
      user     system      total        real
  0.110000   0.040000   0.150000 (  0.155943)
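
For anyone who wants to try a similar comparison without a database,
here is a rough sketch in plain Ruby.  It only compares a repeated
in-memory select against the memoized lookup, so it says nothing about
the DB timings above; Task, FieldFindSketch and the collection size are
assumptions made up for the example.

```ruby
require 'benchmark'

# Hypothetical stand-in for a model class with one searchable field.
Task = Struct.new(:name, :some_field)

module FieldFindSketch
  # Memoized lookup over an already loaded collection.
  def field_find(field, value)
    @field_cache ||= Hash.new { |cache, key| cache[key] = {} }
    @field_cache[field][value] ||= select { |item| item.send(field) == value }
  end
end

# 200 in-memory records; some_field cycles through 0..9.
tasks = Array.new(200) { |i| Task.new("task#{i}", i % 10) }
tasks.extend(FieldFindSketch)

# Scan the whole array on every lookup.
uncached = Benchmark.realtime do
  5000.times { tasks.select { |t| t.some_field == 1 } }
end

# Scan once, then serve every later lookup from the hash.
cached = Benchmark.realtime do
  5000.times { tasks.field_find(:some_field, 1) }
end

puts format("select each time: %.4fs", uncached)
puts format("memoized lookup:  %.4fs", cached)
```

The absolute numbers will vary by machine and Ruby version; the point is
only the relative gap between the two loops.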