I want radiant to have a more robust caching mechanism than the current
‘expire every x minutes’ method of doing things - I’d like it to be able
to handle just removing things from the cache when things actually
change. I’ve roughed out a plan on how I think this should be done in a
safe manner.
It sounds like it should work, but this is mainly just what I thought of
on the way to work this morning - I’d like people to poke holes in it
and figure out any edge-cases I’m missing out on - if this process EVER
misses out on the clearing of a cached page, it’s pretty much useless.
I see a bit of overhead being added to the rendering process (since
after_initialize hooks are a performance no-no), but since this should
reduce the need for rendering by a huge amount, I’m hoping it’ll balance
out.
I will probably put off an implementation of this until after I finish
off my asset extension and get other things running, but I might pull
this forward for asset caching as well… not sure yet.
Example Site layout:
Pages:
  + Homepage ('main', 'sidebar')
    + Articles ('main', 'sidebar')
      - Article1 ('full', 'extract')
      - Article2 ('full', 'extract')
Snippets:
  - Header
  - Footer
Layouts:
  - Default
Homepage:
<r:find url='articles'>
  <r:children:each>
    <r:title/>
    <r:if_content part="extract">
      <r:content part="extract"/>
    </r:if_content>
  </r:children:each>
</r:find>
'default' layout:
<r:snippet name="header"/>
<r:content/>
<r:snippet name="footer"/>
In this example, the homepage needs to be re-rendered when:
- homepage is modified
- new article child is created
- article1/2 is updated/removed
- article1/2 has extract added/removed
- default layout is modified
- header snippet is modified
- footer snippet is modified
- (please tell me if I’m missing cases here)
To know all those things, the following should be more than enough to
keep track of:
- any page which is read from db during the render
- any snippet which is read from db during the render
- any layout which is read from db during the render
New Table:
create_table 'cache_dependencies' do |t|
  t.column :page_id, :integer
  t.column :depends_on, :integer
  t.column :cache_type, :string
end
Since this is just a scratch table, perhaps this should use something
like Madeleine or maybe just a tree of files that are locked on write
instead of being in the main database file.
Filesystem cache:
pages/
page.data
page.cacheinfo
An after_save is added to each snippet/layout:
deps = CacheDependencies.find_all_by_cache_type_and_depends_on(
  'snippet', snippet.id)
deps.each do |cd|
  # removing the marker forces a re-render on the next request
  cacheinfo = "pages/#{cd.page_id}.cacheinfo"
  File.delete(cacheinfo) if File.exist?(cacheinfo)
end
An after_save is added to each page to do the same, and also an
after_create and after_destroy are added to page that will clear
anything depending on the parent when a child is created or removed.
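A rough sketch of that invalidation step in plain Ruby, without the database. The Dependency struct stands in for a row of cache_dependencies, and invalidate_dependents and the flat "<page_id>.cacheinfo" naming are names I made up for illustration:

```ruby
require 'fileutils'
require 'tmpdir'

# Stand-in for a row of the cache_dependencies table (no database here).
Dependency = Struct.new(:page_id, :depends_on, :cache_type)

# Remove the .cacheinfo marker of every page that depends on the changed
# record. Returns the ids of the pages that were invalidated.
def invalidate_dependents(rows, cache_dir, cache_type, changed_id)
  matching = rows.select do |row|
    row.cache_type == cache_type && row.depends_on == changed_id
  end
  page_ids = matching.map(&:page_id).uniq
  page_ids.each do |page_id|
    # rm_f does not raise if the marker is already gone
    FileUtils.rm_f(File.join(cache_dir, "#{page_id}.cacheinfo"))
  end
  page_ids
end

Dir.mktmpdir do |cache_dir|
  File.write(File.join(cache_dir, '1.cacheinfo'), '')
  rows = [Dependency.new(1, 10, 'snippet')] # Homepage uses the Header snippet
  invalidate_dependents(rows, cache_dir, 'snippet', 10)
  puts File.exist?(File.join(cache_dir, '1.cacheinfo')) # prints false
end
```

For the child created/removed case, the same call would be made with the parent's id and the 'page' type, so anything that read the parent during its render gets cleared.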
A new class, CacheInfo is registered as an ‘after_initialize’ on Page,
Layout and Snippet.
When rendering begins, initialize the CacheInfo object:
CacheInfo.reset
When a page/layout/snippet is loaded, its id is added to a hash in
CacheInfo.
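Something like the following is what I have in mind for CacheInfo, as a minimal sketch only. The record and depends_on method names are assumptions, and the after_initialize wiring is shown as a comment because it needs Rails:

```ruby
require 'set'

# Hypothetical sketch of the CacheInfo tracker: one set of ids per type.
class CacheInfo
  TYPES = %w[page snippet layout].freeze

  class << self
    # Start a fresh dependency set; call when rendering begins.
    def reset
      @depends = Hash.new { |hash, type| hash[type] = Set.new }
    end

    # Record that a record of the given type was loaded during the render.
    def record(type, id)
      raise ArgumentError, "unknown type: #{type}" unless TYPES.include?(type)
      reset if @depends.nil?
      @depends[type] << id
    end

    # Sorted ids recorded for one type since the last reset.
    def depends_on(type)
      reset if @depends.nil?
      @depends[type].to_a.sort
    end
  end
end

# In a model, the hook might look like this (assumption, not tested here):
#   def after_initialize
#     CacheInfo.record('snippet', id) unless new_record?
#   end

CacheInfo.reset
CacheInfo.record('page', 1)     # Homepage
CacheInfo.record('page', 2)     # Articles
CacheInfo.record('page', 2)     # loading the same page twice records it once
CacheInfo.record('snippet', 10) # Header
```

Using a Set per type means repeated loads of the same record cost nothing extra when the rows are written out. Class-level state like this is only safe with one render per process at a time; a threaded server would need per-request storage instead.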
When the rendering is complete, the cacheinfo is written to the
cache_dependencies table, looking something like:
CacheDependencies.transaction do
  # replace this page's old dependency rows with the ones just recorded
  CacheDependencies.delete_all(['page_id = ?', page.id])
  page_depends.each do |pd|
    CacheDependencies.create(:page_id => page.id, :depends_on => pd,
                             :cache_type => 'page')
  end
  snippet_depends.each do |pd|
    CacheDependencies.create(:page_id => page.id, :depends_on => pd,
                             :cache_type => 'snippet')
  end
  layout_depends.each do |pd|
    CacheDependencies.create(:page_id => page.id, :depends_on => pd,
                             :cache_type => 'layout')
  end
end
On each request, check whether the page.cacheinfo file exists:
- if not, render the page from scratch
- if so:
  - send a 304 if the If-Modified-Since header is valid
  - otherwise, send the cached page.data file.
    - There will be a setting to use X-Sendfile if it’s
      available to hand off the heavy lifting to apache/lighttpd
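The decision for a request could be sketched like this. It assumes the .cacheinfo marker is deleted on invalidation (as in the after_save above), so a present marker means the cached copy is still good. The cache_action name, the returned symbols and the flat file naming are all made up for illustration:

```ruby
require 'time'

# Decide how to answer a request for a cached page.
# Returns :render, :not_modified or :send_cached.
def cache_action(cache_dir, page_id, if_modified_since = nil)
  info = File.join(cache_dir, "#{page_id}.cacheinfo")
  data = File.join(cache_dir, "#{page_id}.data")
  # No marker (or no data) means the cache was cleared: render from scratch.
  return :render unless File.exist?(info) && File.exist?(data)

  if if_modified_since
    begin
      since = Time.httpdate(if_modified_since)
      # HTTP dates have one-second resolution, so compare whole seconds
      return :not_modified if File.mtime(data).to_i <= since.to_i
    rescue ArgumentError
      # Unparseable header: fall through and send the full page
    end
  end
  :send_cached
end
```

The caller would map :not_modified to a 304, and :send_cached to either streaming page.data itself or setting an X-Sendfile header, depending on the setting.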
A .cacheinfo file could also be created to contain a time-based expiry
for pages that use out-of-db data or pages that have time-sensitive data
(current date, etc), or some other cache-clearing condition.
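One possible shape for the time-based variant, purely as a sketch: the marker file holds an optional expiry as epoch seconds. That file format and the two helper names are assumptions, nothing is decided yet:

```ruby
# Write a marker; an optional expiry is stored as epoch seconds
# (assumed format). An empty marker never expires on its own.
def write_cacheinfo(path, expires_at = nil)
  File.write(path, expires_at ? expires_at.to_i.to_s : '')
end

# A marker is fresh if it exists and either has no expiry or the
# expiry is still in the future.
def cacheinfo_fresh?(path, now = Time.now)
  return false unless File.exist?(path)
  content = File.read(path).strip
  content.empty? || now.to_i < content.to_i
end
```

A page showing the current date could then write its marker with an expiry of midnight, while ordinary pages keep an empty marker and rely on the dependency-based clearing alone.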
I’ve made the assumption above that parts are never updated without a
page also being updated, but it should just be a case of adding an extra
item to the cache table to account for that.
Note that this is also extremely extensible - new tables in the database
can add in their own after_initialize hooks and after_save hooks to add
dependencies and clear layouts without having to interact with the rest
of the caching system.
Dan.