I need to serve thesaurus content via AJAX requests. I can think of
several ways to do it, but performance will definitely be an issue - if
there are thousands upon thousands of requests, I want to make sure it’s
as fast and efficient as possible.
So, what do you folks think is the optimal way to go about this? The
obvious route is to use a controller that queries a database for the
word and returns a simple list of synonyms, but I wonder if it
would be faster to use some sort of caching? I’m pondering “exploding”
the thesaurus data out into thousands of folders and subfolders and
small text files, and serving it up via Apache:
/aar/aardvark
If Apache returns something, it would be the list of synonyms at
/aar/aardvark. If there is no word there, or it has no synonyms, it
could just return a 404, and the AJAX request would deal with the
failure appropriately. These folders could be nested deeply enough that
no folder holds too many thousands of entries (because that could be a
system bottleneck).
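To make the folder layout concrete, here is a rough sketch of how a word could be mapped to a path like /aar/aardvark. The helper name synonym_path and the prefix_length parameter are made up for illustration; the three-letter prefix is only inferred from the example path, and the check against slashes and leading dots is an extra precaution not mentioned above.

```ruby
# Hypothetical helper: maps a word to a nested path such as /aar/aardvark.
# The name and the default prefix length are assumptions for illustration.
def synonym_path(word, prefix_length = 3)
  normalized = word.strip.downcase
  # Refuse input that could escape the thesaurus directory tree.
  if normalized.empty? || normalized.include?("/") || normalized.start_with?(".")
    raise ArgumentError, "invalid word: #{word.inspect}"
  end
  prefix = normalized[0, prefix_length] # shorter words use the whole word
  "/#{prefix}/#{normalized}"
end

puts synonym_path("aardvark") # /aar/aardvark
puts synonym_path("Ox")       # /ox/ox
```

A longer prefix, or a second prefix level, would spread the files over more folders if a single folder still ended up with too many entries.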
Memcached would be great for this. You could even simply store the
synonym list for every possible word, which is of course very
inefficient from a storage point of view, but then again, all caching
is by definition.
That would be pretty speedy. One issue - the thesaurus data is about 12
MB per language, so if many languages are available, that could be
hundreds of MB of RAM tied up. Not terrible, but not ideal.
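As a back-of-the-envelope check, the arithmetic looks like this. The 12 MB figure is the one quoted above; the language counts are arbitrary examples, and the estimate covers only the raw data, not whatever per-key overhead the cache adds.

```ruby
MB_PER_LANGUAGE = 12 # raw thesaurus size per language, as quoted above

# Rough lower bound on cache memory for a given number of languages.
def cache_size_mb(language_count)
  language_count * MB_PER_LANGUAGE
end

# The language counts below are arbitrary examples.
[5, 10, 20].each do |count|
  puts "#{count} languages: about #{cache_size_mb(count)} MB"
end
```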
Do you see any issues with the Apache model I mentioned above? I don’t
have much experience with Apache, so I’m unsure if there would be
performance issues due to large numbers of folders/files in the paths.
1- Cache is way faster than the file system
2- Once it is cached, it doesn’t matter if it comes from the file
system or the database
3- Managing your thesaurus in file system could become a big mess
So, I would definitely go for DB + memcached.
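A minimal sketch of the DB + memcached idea, assuming a simple check-the-cache-first lookup. FakeCache and THESAURUS_DB are hypothetical stand-ins so the example runs on its own; in a real app they would be a memcached client and a database query, and the key format "synonyms:word" is also just an assumption.

```ruby
# Stand-in for a memcached client (hypothetical; keeps everything in a Hash).
class FakeCache
  def initialize
    @store = {}
  end

  def get(key)
    @store[key]
  end

  def set(key, value)
    @store[key] = value
  end
end

# Stand-in for the database table of synonyms (hypothetical data).
THESAURUS_DB = { "fast" => ["quick", "rapid", "speedy"] }.freeze

# Check the cache first; on a miss, query the "database" and store the result.
def synonyms_for(word, cache, db = THESAURUS_DB)
  normalized = word.strip.downcase
  key = "synonyms:#{normalized}"
  cached = cache.get(key)
  return cached unless cached.nil?

  result = db.fetch(normalized, []) # unknown words yield an empty list
  cache.set(key, result)            # empty lists are cached too
  result
end

cache = FakeCache.new
p synonyms_for("fast", cache)  # first call goes to the "database"
p synonyms_for("Fast", cache)  # second call is served from the cache
```

Caching the empty list for unknown words means repeated lookups of words that are not in the thesaurus also stay off the database.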
Cheers, Sazima