Forum: Ruby on Rails - scalable file uploads with Rails

Mr_Tibs (Guest)
on 2009-02-07 03:11
(Received via mailing list)

I'm involved in a project where I have to re-architect file uploads in
a Rails application to make them scalable. Users will be uploading large
XML files (approx. 1MB), with a high probability of overlapping
(simultaneous) uploads, which we try to minimize. The current system runs
a Mongrel cluster (3 Mongrels) behind Apache mod_proxy_balancer. The file
upload is done using attachment_fu.

What choices do I have?
1. Throw more Mongrel processes into the Mongrel cluster. We already
have other applications running Mongrel clusters on the same machine,
so this option is limited.

2. Use BackgrounDRb. I looked a bit into BackgrounDRb, but I'm not
sure it can help. Even if a middleman passes the upload task to a
worker process, would that work? First of all, can you even pass the
upload task? How would you do it? Would that completely free up the
Mongrel process? Would I have to scale the BackgrounDRb process, or is
there scalability built in? I couldn't find an example on the web that
does just that.

3. Use Merb. I'm still trying to get my head around it. I found 2
examples that show how to do file uploads with Merb, but they are
kinda old, and Merb went through a lot of changes in the last year.
Even if I could get one upload example working, how do I deal with
scalability? Would I start a bunch of these Merb processes and use a
proxy balancer to distribute the file uploads? From what I'm reading,
these would take much less memory than having Mongrel processes
running Rails, so I guess that would help me. I don't think I've seen
any examples on the web that do it.

4. Write my own CGI upload functionality in C/C++. This will get nasty
because files are transmitted as multipart data, where each part has its
own headers, etc. If I could get this to work, then I would leave the
upload functionality to Apache (which I guess would do a good job of
scaling the uploads, and it would be fast too) and run some Ruby
cron jobs which parse the files on the web server.
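
As a rough illustration of the cron side of option 4 (not something taken
from the current system), a sweep script could look like the sketch below.
The directory layout and the name `sweep_uploads` are hypothetical, and the
caller supplies the actual XML parsing as a block.

```ruby
require 'fileutils'

# Hypothetical spool sweep: the web server drops completed uploads into
# incoming_dir and a cron job runs this periodically. Each *.xml file is
# yielded to the caller's block, then moved to done_dir, or to failed_dir
# if the block raises, so it is not picked up again on the next run.
def sweep_uploads(incoming_dir, done_dir, failed_dir)
  [done_dir, failed_dir].each { |dir| FileUtils.mkdir_p(dir) }
  results = { done: [], failed: [] }

  Dir.glob(File.join(incoming_dir, '*.xml')).sort.each do |path|
    name = File.basename(path)
    begin
      yield path
      FileUtils.mv(path, File.join(done_dir, name))
      results[:done] << name
    rescue StandardError => e
      warn "#{name}: #{e.message}"
      FileUtils.mv(path, File.join(failed_dir, name))
      results[:failed] << name
    end
  end

  results
end
```

The cron script would call `sweep_uploads` with the real parser inside the
block. The sketch assumes a file only gets its final `.xml` name once it is
completely written (for example, written under a temporary name and then
renamed); otherwise a sweep could pick up a partial upload.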

I appreciate feedback to any of these choices.

AD (Guest)
on 2009-02-08 06:19
(Received via mailing list)
If you are up to it, you can also use JRuby. JRuby uses native
threads, so you should get good non-blocking performance without having
to configure any "runtimes". I use it and get great performance.

snacktime (Guest)
on 2009-02-08 08:02
(Received via mailing list)
> We started using the nginx upload module about a month ago and it works
> great.  Whatever you do you don't want rails in the file upload loop on a
> busy site.  You can easily starve out other requests and put your servers
> into a death spiral.
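
To illustrate what "keeping Rails out of the upload loop" can look like on
the application side: with a front-end upload handler such as the nginx
upload module, the server writes the request body to disk and, as far as I
understand it, forwards only form fields describing the stored file. The
sketch below is a guess at the Ruby side under that assumption. The field
names `upload.path` and `upload.name`, the method `claim_upload`, and the
directory arguments are all hypothetical and would have to match the
front-end configuration.

```ruby
require 'fileutils'

# Assumed field names; they must match whatever the front-end server is
# configured to send. Nothing here is fixed by the upload module itself.
PATH_FIELD = 'upload.path'
NAME_FIELD = 'upload.name'

# Moves a file that the front-end server already wrote to disk into
# permanent storage. The application never reads the upload body itself.
def claim_upload(params, upload_store, storage_dir)
  tmp_path = File.expand_path(params.fetch(PATH_FIELD))
  store = File.expand_path(upload_store)
  # Only accept paths inside the front-end's upload store, in case the
  # fields were forged by a client.
  unless tmp_path.start_with?(store + File::SEPARATOR)
    raise ArgumentError, "unexpected upload path: #{tmp_path}"
  end

  # Keep only the base name so a crafted file name cannot escape storage_dir.
  safe_name = File.basename(params.fetch(NAME_FIELD))
  raise ArgumentError, 'invalid file name' if ['', '.', '..'].include?(safe_name)

  FileUtils.mkdir_p(storage_dir)
  dest = File.join(storage_dir, safe_name)
  FileUtils.mv(tmp_path, dest)
  dest
end
```

In a controller action this would be called with the request parameters, so
the Rails process only performs a cheap file move instead of receiving the
1MB body.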

