Adjacent List or Nested Sets?


#1

All,
I’m currently developing an application that is very dependent on
hierarchical data. In this application I will have 500,000+
trees(roots), with an average range of 1-20+ data members (children +
leaf nodes) for each tree. In this application, there will be more
reads than writes, but will still have a fair number of writes, mostly
just adding additional objects to the end of the hierarchy.
Essentially, it would be like a forum where you have a very large
number of root threads, and a fairly small number of threads
underneath it (1-100 replies).

Currently, in my beta I’m using act_as_tree (adjacent list) with out
any problems, but it’s just me and some test data. Eventually, there
will around 1,000 concurrent users reading, traversing, and changing
the hierarchies/trees. After doing some research on the topic I
discovered the better nested set plug-in, and started to learn about
nested sets.
During my research I noticed that many authors highlight that read
performance in greatly enhanced with nested sets, while write
performance was greatly reduced especially as the tree grows. They
also point out that a write requires a table lock
(http://dev.mysql.com/tech-resources/articles/hierarchical-data.html)
which seems like a pretty high cost when used in a situation where the
hierarchy will change fairly regularly as opposed to something like a
menu system.

For my application, which of these methods is the best fit? Is there
a better method out there for my application? I have been bouncing
back and forth on the strengths of each method for my application. I
appreciate any advice you have to offer!

A couple of notes:

  • Currently using Postgresql 8.1 and plan to move to 8.2
  • Currently developing with Rails 1.2RC

Thanks!
-Nick