On 5/3/06 10:26 PM, “Bob H.” wrote:
On May 3, 2006, at 5:13 PM, Zed S. wrote:
>> This second processing is to find the multipart boundaries and uses
>> a lot of horrible regex and backtracking. So, with large uploads you
>> can see pretty big CPU spikes and fill up your rails processes
>> fairly quick.
> Does the same issue exist for downloading?
No, since Mongrel or a fronting web server handles those, so you get
much better concurrency. Now, if your Rails app is generating the
content on each request then you’re screwed.
Remember, it’s not just queue length but how long each request stays in
the queue. If you have 10 backend Mongrels, then you have a queue length
of 10 (basically). That doesn’t mean that you can only handle 10/second.
Queues are really weird and kind of don’t make sense until you simulate
them. You have to use some statistical distribution of time for each
request to get an idea of how such a queue performs. And nothing beats
straight up measurement with a tool like httperf. It’s the reality
bringer.
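
A toy simulation can make the queueing point concrete. The sketch below is not from the original post: it assumes a single shared FIFO queue in front of the backends, exponentially distributed arrival and service times, and made-up rates (`arrival_rate`, `mean_service`, and the `simulate` helper are all hypothetical names). It only illustrates that throughput depends on how long each request holds a backend, not on the backend count alone.

```ruby
# Toy simulation of N backends fed from one shared FIFO queue.
# Every parameter here is an illustrative assumption, not a measurement.

# Draw an exponentially distributed duration with the given mean.
def exp_sample(rng, mean)
  -mean * Math.log(1.0 - rng.rand)
end

# Returns the mean and worst time a request waits before a backend is free.
def simulate(backends:, arrival_rate:, mean_service:, requests:, seed: 42)
  rng = Random.new(seed)
  free_at = Array.new(backends, 0.0) # time at which each backend is next idle
  clock = 0.0
  waits = []

  requests.times do
    clock += exp_sample(rng, 1.0 / arrival_rate)        # next arrival time
    idx = free_at.each_index.min_by { |i| free_at[i] }  # earliest-free backend
    start = [clock, free_at[idx]].max                   # wait if it is busy
    waits << (start - clock)
    free_at[idx] = start + exp_sample(rng, mean_service)
  end

  { mean_wait: waits.sum / waits.size, max_wait: waits.max }
end

# Same 10 backends, same 50 requests/second, different time per request.
fast = simulate(backends: 10, arrival_rate: 50.0, mean_service: 0.1, requests: 20_000)
slow = simulate(backends: 10, arrival_rate: 50.0, mean_service: 0.5, requests: 20_000)

puts format("100 ms requests: mean wait %.3fs, max %.3fs", fast[:mean_wait], fast[:max_wait])
puts format("500 ms requests: mean wait %.3fs, max %.3fs", slow[:mean_wait], slow[:max_wait])
```

With 100 ms requests the ten backends absorb 50 requests/second with almost no waiting; with 500 ms requests the same ten backends fall behind and the wait keeps growing. A simulation like this is only a rough guide, so measuring the real setup with httperf still has the last word.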
> This problem is tending to the uglier side of things, so maybe a bit
> of an ugly hack won’t look so bad?
What you mention below isn’t a hack, it’s actually the primary advantage
of using Mongrel over FastCGI. If a Rails action is slow, you can spend
a little more effort and write a Mongrel handler that will do it faster.
It’s a little trickier (kind of bare metal) but not impossible.
> What happens if you use a Mongrel handler for the file upload,
> storing the file somewhere on disk, then interfere with the normal
> proceedings? Maybe hacking at the request object sent to Rails by
> adding a query parameter or HTTP header that says where the file was
> put? Maybe a redirect with the file name instead of the file?
Bingo. I actually have documentation in the queue for such a thing.
What you could do is have the uploads be done with Mongrel, have even an
AJAX progress thing done with Mongrel, and when it’s all finished bounce
it to Rails to complete the process.
Best of all, if you did it and managed to completely avoid an
shooting for a nice REST upload/progress/done process, you could even
speed that up by writing a little Apache or lighttpd module. Of course
you’d have to be REALLY desperate to do that, but the possibility is
there.
The key with this is to avoid Ruby’s sessions. What you can do is grab
the cookie and parse out the Rails session ID. Use that as the name of a
directory where the user’s uploaded file is stored. Then, when Rails is
ready to process the file you just have to match the current session ID
to the directory. Very lightweight.
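
The cookie-to-directory idea could look roughly like the sketch below. It is an illustration under assumptions, not code from the original post: the cookie name `_session_id`, the hex-only ID format, the `UPLOAD_ROOT` path, and the helper names `session_id_from` and `upload_dir_for` are all hypothetical and should be checked against the actual app. The ID is validated before use because it comes from the client and ends up in a filesystem path.

```ruby
# Sketch: derive a per-user upload directory from the session cookie
# without loading the session itself.

UPLOAD_ROOT    = "/tmp/uploads" # hypothetical storage location
SESSION_COOKIE = "_session_id"  # assumed cookie name; check your app

# Extract the session ID from a raw Cookie header.
# Returns nil if the cookie is missing or the value looks malformed.
def session_id_from(cookie_header)
  return nil if cookie_header.nil?

  cookie_header.split(/;\s*/).each do |pair|
    name, value = pair.split("=", 2)
    next unless name.to_s.strip == SESSION_COOKIE
    # Accept only hex digits so the ID can never contain "/" or "..".
    return value if /\A[0-9a-f]{16,64}\z/i.match?(value.to_s)
  end
  nil
end

# Map the cookie to the directory holding that user's upload, or nil.
def upload_dir_for(cookie_header)
  id = session_id_from(cookie_header)
  id && File.join(UPLOAD_ROOT, id)
end

puts upload_dir_for("theme=dark; _session_id=0123456789abcdef0123456789abcdef").inspect
puts upload_dir_for("_session_id=../../etc/passwd").inspect
```

The upload side and the Rails side would both call the same lookup, so the only shared state is the directory name.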
> Would that give you the multi-threaded upload?
Yep, but remember, just making something multi-threaded doesn’t
solve all your problems. All computers eventually have a finite level
of concurrency, and there’s always a point where just one more request can
literally break the server’s back. Where this breaking point is can be
concurrent or 10 million.
What you should do is test your current setup and see if that meets your
needs. There’s no point in getting paranoid and spending the next 3
rewriting your whole app in Mongrel only. If what you have now meets
your performance requirements (you do have measurable performance
requirements, right?) then don’t bother.
Once you see that particular Rails actions don’t meet your needs, make a
plan to try something else, but don’t just grab for a Mongrel handler.
Figure out how far off your Rails action is from your needs (you do have
measurable performance requirements right?) and then develop a set of
possible solutions that might meet those needs. Do a few small
to see if your proposed solutions will meet the requirements (you do
have measurable performance requirements, right?). Then pick the
solution that does the job.
Also, after you’ve written the solution, retest everything and
verify. Don’t believe that your own shit don’t smell. I’ve seen people
spend months making something they thought was really fast only to find
it gave them no statistically significant improvement. I myself have
saved tons of potentially lost development time by testing a potential
solution before investing, and by testing as I go I avoid going down bad
paths.
Basing your decisions on evidence is always much better than just
reaching for what everyone else is doing.
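
One rough way to check whether a change made a real difference is to compare before and after timings with Welch's t statistic. The sketch below is an illustration only: the timing samples are invented placeholder numbers, the helper names are hypothetical, and the `|t| > 2` cutoff is a crude rule of thumb rather than a proper significance test with degrees of freedom and a p-value.

```ruby
# Sketch: compare two sets of response-time measurements (milliseconds).

def mean(xs)
  xs.sum.to_f / xs.size
end

# Unbiased sample variance (divides by n - 1).
def sample_variance(xs)
  m = mean(xs)
  xs.sum { |x| (x - m)**2 } / (xs.size - 1)
end

# Welch's t statistic: difference in means over its standard error,
# without assuming the two samples have equal variance.
def welch_t(a, b)
  standard_error = Math.sqrt(sample_variance(a) / a.size + sample_variance(b) / b.size)
  (mean(a) - mean(b)) / standard_error
end

# Invented example timings; replace with real measurements.
before = [212, 198, 240, 225, 207, 231, 219, 202]
after  = [205, 190, 238, 221, 199, 229, 210, 200]

t = welch_t(before, after)
puts format("mean before %.1f ms, after %.1f ms, t = %.2f", mean(before), mean(after), t)
puts(t.abs > 2.0 ? "probably a real difference" : "could easily be noise")
```

A small drop in the mean can still sit well inside the run-to-run noise, which is how months of tuning can end in no measurable gain.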
Zed A. Shaw