The performance of find_by_url degrades because children don’t
automatically get a reference to their parent, yet they need that
reference to compute their url. With a structure of:
- home
  - articles
    - (600 articles here)
Finding the nth article takes 2n DB calls - up to 1200 calls in this
case - because each article makes a separate call to retrieve its
parent, then its parent’s parent. That is on top of all 600 of those
articles being read out by the call to ‘children’ anyway.
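To put numbers on that, here is a back-of-the-envelope sketch (plain Ruby, no database; the two-lookups-per-article figure is an assumption taken from the description above, not measured):

```ruby
# Hypothetical cost model for the structure above: every article that
# find_by_url examines resolves its url via two parent lookups
# (article -> articles -> home), each a separate DB call.
def worst_case_queries(article_count, lookups_per_article = 2)
  article_count * lookups_per_article
end

worst_case_queries(600)  # => 1200 for the 600-article case
```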
This is the current code (in mental) for find_by_url:
def find_by_url(url, live = true, clean = true)
  url = clean_url(url) if clean
  if (self.url == url) && (not live or published?)
    self
  else
    children.each do |child|
      if (url =~ Regexp.compile('^' + Regexp.quote(child.url))) and
          (not child.virtual?)
        found = child.find_by_url(url, live, clean)
        return found if found
      end
    end
    children.find(:first, :conditions => "class_name = 'FileNotFoundPage'")
  end
end
This could be worked around by putting in:
children.each do |child|
  child.parent = self
  ... etc ...
That is a bit hacky, though, and would prevent children from
overriding parent=().
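For illustration, here is what that workaround looks like end to end, with plain Ruby objects standing in for ActiveRecord (this Page class, its slug attribute, and its url method are stand-ins for the sketch, not the real model):

```ruby
# Minimal stand-in for the Page model: no database, children held
# in memory, url derived from the parent chain.
class Page
  attr_accessor :parent
  attr_reader :slug, :children

  def initialize(slug, children = [])
    @slug, @children = slug, children
  end

  def url
    parent ? "#{parent.url.chomp('/')}/#{slug}" : "/"
  end

  def find_by_url(url)
    return self if self.url == url
    children.each do |child|
      child.parent = self  # the hacky bit: hand each child its parent
                           # so child.url needs no extra parent lookups
      found = child.find_by_url(url)
      return found if found
    end
    nil
  end
end

root = Page.new("home", [Page.new("articles", [Page.new("first-post")])])
root.find_by_url("/articles/first-post")  # finds the grandchild page
```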
Is there too much useless flexibility here? Do we really need child
pages that match urls not defined by their parent and slug (things
like archive_page are still just parent + slug)? Why can’t we use
something like:
def find_by_slug_path(slugs)
  if child = children.find_by_slug(slugs[0])
    if slugs.size == 1
      child
    else
      child.find_by_slug_path(slugs[1..-1])
    end
  end
end
and the root Page.find_by_url:
def self.find_by_url(url)
  root = find_by_parent_id(nil)
  slugs = url.split('/').select { |x| x.size > 0 }
  root.find_by_slug_path(slugs)
end
That would make only a single DB call at each level - each call
returning a single page.
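As a sanity check, the whole proposal can be exercised with plain Ruby objects in place of ActiveRecord. Here find_child_by_slug is a stand-in for children.find_by_slug (one "query" per level), and the recursion is written base-case-first, which is equivalent to the version above:

```ruby
# In-memory stand-in for the Page model; no database involved.
class Page
  attr_reader :slug, :children

  def initialize(slug, children = [])
    @slug, @children = slug, children
  end

  # Stand-in for children.find_by_slug: one "query" per level.
  def find_child_by_slug(slug)
    children.find { |c| c.slug == slug }
  end

  def find_by_slug_path(slugs)
    return self if slugs.empty?
    child = find_child_by_slug(slugs.first)
    child && child.find_by_slug_path(slugs[1..-1])
  end

  # Root entry point, mirroring the class-level Page.find_by_url above.
  def find_by_url(url)
    find_by_slug_path(url.split('/').reject { |s| s.empty? })
  end
end

root = Page.new("home", [Page.new("articles", [Page.new("first-post")])])
root.find_by_url("/articles/first-post")  # one lookup per path segment
```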