I am planning a project which is very heavily built on the premise of
users being able to upload lots of data (maybe a bit like Flickr but not
for photos). They may choose to upload quite a bit in one go - perhaps
up to 100MB at the extreme, and they will upload a few MB every week or
even day.
I have great worries about doing this in Rails however, as I understand
that a Rails instance blocks during an upload. Having multiple instances
running behind Mongrels isn’t really a feasible solution, because even
with as few as 1000 users I’d be worried that all it would take is a
few of them uploading over slow connections for all my instances to be
blocked.
I’m sure there’s a better way to handle this type of app, but I’m very
eager not to start building my project if Rails is inherently not suited
to my project. If anyone has any advice on Rails’ blocking issue, or
things to look at, or things I can do to work around this issue smartly,
I would be deeply grateful.
The uploading itself DOESN’T block your Mongrel: Mongrel receives the
request body on its own before handing it to Rails. It’s the processing
afterwards that runs in single-threaded Rails and thus blocks the
Mongrel instance. Now there are a few ways to handle this flow:
If it’s just saving the file afterwards, plugins like attachment_fu
will just move the Tempfile into a permanent file, which shouldn’t
take very long.
If you need to do processing afterwards, you hand the file over to
a BackgrounDRb process and let that do the processing. Use a
PeriodicalUpdater on your page to periodically poll a Rails method
that in turn asks the BackgrounDRb worker for its progress. The
advantage of this approach is that you can basically let the user
continue doing other stuff while the upload is being processed
(similar to what happens when you upload a movie to YouTube).
A third approach would be to use Merb for handling the file uploads.
Merb is a different framework, so you’ll run it on a separate Mongrel
and port, but it’s multithreaded and thus well suited to file upload
handling (amongst a lot of other things; it’s a really nice
framework, but you don’t get as much out of the box as you do in
Rails). It can use ActiveRecord (and IIRC even Rails’ models).
Afterwards you redirect the user back to the Rails app.
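To make the second approach (hand the file off, then poll for progress)
a bit more concrete, here is a minimal sketch in plain Ruby. It is not
the BackgrounDRb API: a plain Thread stands in for the worker, and the
class name UploadProcessor and its methods are made up for illustration.
In a real app the progress method is what the polled Rails action would
call.

```ruby
# Hypothetical stand-in for a background worker (NOT the BackgrounDRb API):
# processes a file in a background thread and exposes progress for polling.
class UploadProcessor
  def initialize
    @progress = {}      # job_id => percent complete (0..100)
    @threads  = {}      # job_id => worker thread
    @lock     = Mutex.new
  end

  # Starts processing the file at +path+ and returns immediately.
  def start(job_id, path, chunk_size = 64 * 1024)
    set_progress(job_id, 0)
    @threads[job_id] = Thread.new do
      total = File.size(path)
      done  = 0
      File.open(path, "rb") do |io|
        while (chunk = io.read(chunk_size))
          # Real work on the chunk (indexing, hashing, ...) would go here.
          done += chunk.bytesize
          set_progress(job_id, done * 100 / total)
        end
      end
      set_progress(job_id, 100)
    end
    job_id
  end

  # What a polled action would return; nil for an unknown job.
  def progress(job_id)
    @lock.synchronize { @progress[job_id] }
  end

  # Blocks until the job has finished (handy in scripts and tests).
  def wait(job_id)
    @threads[job_id].join
  end

  private

  def set_progress(job_id, percent)
    @lock.synchronize { @progress[job_id] = percent }
  end
end
```

The request that receives the upload would only call start and return,
so the Rails instance is free again while the work continues elsewhere.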
All I’ll want to do with the uploaded file is take a hash of it (to use
as an ID for various reasons) and store it in the file system; a note
of its location will be entered in a new database record. If the
uploading phase doesn’t block and leaves that instance open to serving
new connections, it sounds like I might be able to get away with doing
nothing at all?
However, in the future I may like to index any text in the uploaded file
(for search purposes), and so I may go with a BackgrounDRb solution from
the beginning (this sounds like it will be easier to provide a progress
indicator to the user too, which will be necessary for big files).
I haven’t looked at attachment_fu, so I’ll go and take a look at that
now, and see how that may fit in with my plans.
Thanks for your kind reply, it’s given me plenty to think about.
That’s fantastic Peter, thanks again for your reply, it’s more than
helpful.
I hadn’t considered a Flash uploader on the client, but I can see the
clear advantages despite the purist in me wanting to keep to
browser-supplied technologies. I’ll take a look at SWFUpload.
I haven’t looked too far into the Rails 2.0 changes (although I quickly
realised the scaffold differences) so your comment about the new
security measures that are causing you problems is intriguing. At the
risk of asking you to spend more of your time in this thread would it be
possible for you to expand on that with a link to some of the issues if
you have a moment?
Check all the messages after yours in the list. The topic came up
just recently. For now I’m not upgrading my existing Rails 1.2.6 apps
until a 100% working solution has been found.
Attachment_fu will save the files to either the filesystem or the
database (and do some thumbnailing if you need it). You can use the
callbacks of attachment_fu to calculate the hash of the file, index
the text, …
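For the hashing step, here is a rough sketch in plain Ruby, independent
of attachment_fu. The method name store_upload and the directory layout
are assumptions for illustration, and SHA-256 is just one reasonable
choice of hash. Digest::SHA256.file reads the file in chunks, so a 100MB
upload is not loaded into memory at once.

```ruby
require "digest/sha2"
require "fileutils"

# Hypothetical helper: hashes the uploaded temp file, moves it into a
# directory derived from the digest, and returns the ID and final path
# (the path is what you would note in the new database record).
def store_upload(temp_path, storage_root)
  digest = Digest::SHA256.file(temp_path).hexdigest

  # Spread files over subdirectories so no single directory grows huge.
  dir = File.join(storage_root, digest[0, 2], digest[2, 2])
  FileUtils.mkdir_p(dir)
  dest = File.join(dir, digest)

  if File.exist?(dest)
    FileUtils.rm(temp_path)        # identical content is already stored
  else
    FileUtils.mv(temp_path, dest)
  end

  { :id => digest, :path => dest }
end
```

Something like this could be called from an attachment_fu callback or
from a background worker; either way the hash doubles as the ID.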
For upload progress there are two possible solutions:
Use a Flash uploader such as SWFUpload (swfupload.org). This is a
fantastic solution I’ve used in several of our apps. The nice thing
about SWFUpload to me is that you can filter file types and maximum
size client-side, the upload stream is monitored client-side, and the
upload dialog allows multiple file selection. Sadly, Rails 2’s
security measures and cookie-based sessions have broken Flash
uploaders, and the solutions that have come up so far apparently don’t
do the job on all browsers.