I occasionally need to stream a large XML data file that represents
key data in a DB. I’m porting over an application from PHP Symfony,
and with my initial implementation, it takes around 7 times as long
with rails. Also with Symfony, data begins to download almost as soon
as I invoke the URL, whereas with rails, all data is processed on the
server side before the client gets the first byte. I have a hand-
crafted query to hit the DB, and use fetch_hash to use the raw data
from the mysql gem, and that part renders extremely quickly. I've also
tried writing only a tiny subset of the XML while still reading the
entire result set; that is much faster, but of course then I don't get
the full XML output.
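For reference, the shape of that loop looks roughly like this. This is
a sketch only: the real code calls the mysql gem's result.fetch_hash,
but here the result set is stubbed out as an array of plain hashes, and
the table/column names are made up.

```ruby
# Sketch of the hand-crafted-query path. In the real app each `row`
# comes from result.fetch_hash on a mysql gem result object; here the
# rows are stubbed so the loop is self-contained. Names hypothetical.
rows = [
  { "id" => "1", "title" => "First grant",  "amount" => "5000" },
  { "id" => "2", "title" => "Second grant", "amount" => "7500" },
]

xml = ""
xml << "<grants>\n"
rows.each do |row|
  xml << "  <grant id=\"#{row['id']}\">"
  xml << "<title>#{row['title']}</title>"
  xml << "<amount>#{row['amount']}</amount>"
  xml << "</grant>\n"
end
xml << "</grants>\n"
```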
I spent most of the past weekend trying to determine how to optimize
this (hoping to do at least as well as PHP symfony) but can’t do it.
I tried:
- render :text => (lambda do |response, output| … )
- ruby 1.8.7 vs. ruby 1.9.2
- rails 2.3.5 vs. rails 3
- XmlBuilder vs. Nokogiri::XML::Builder
- HAML vs ERB
- passenger vs. script/server
Nothing honestly moved the performance needle in a serious way.
I’ve finally come to the conclusion that rails does not stream out as
I’d expect. Here’s a look at the perf stats rendered as the request
runs:
Rendered hgrants/_request_detail (2.2ms)
Rendered hgrants/_request_detail (3.9ms)
Rendered hgrants/_request_detail (2.4ms)
Rendered hgrants/_request_detail (2.3ms)
Rendered hgrants/_request_detail (242.7ms)
Rendered hgrants/_request_detail (2.2ms)
Rendered hgrants/_request_detail (1.9ms)
Rendered hgrants/_request_detail (1.8ms)
We went from an average of 2ms up to 242ms, then back down. I saw this
sporadically throughout the 1000 template renderings. That suggests to
me that memory is getting garbage collected. Also, I’m invoking the
request from curl, and it reports no data downloaded until after my
logfile tells me rails has finished processing all records in the
view. The model IDs that result in the over-sized ms count vary from
one request to another, so I’m convinced there is nothing in the app
that is doing this. I even tested this by removing the call to the
HAML template and replacing it with a block of generated text and
observed similar behavior.
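If GC really is the culprit, one quick sanity check is to compare
GC.count before and after each chunk of work and see whether
collections line up with the slow renders. A sketch (here the render
is simulated by allocating a pile of strings; in the real app this
would wrap the partial render):

```ruby
# Count garbage collections around a chunk of work. If the count jumps
# on exactly the renders that take ~240ms, GC is the likely cause.
before = GC.count
10_000.times { |i| "row-#{i}" * 50 }  # stand-in for rendering a partial
collections = GC.count - before
```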
This is how I’m invoking HAML from the XML Builder template:
xml << render(:partial => 'hgrants/request_detail.html.haml',
              :locals => { :model => model })
I also tried using this trick to try to get it to stream, but I
observed exactly the same behavior; no data showed up in curl until
all records had been processed.
render :text => (lambda do |response, output|
  extend ApplicationHelper
  xml = Builder::XmlMarkup.new(
    :target => StreamingOutputWrapper.new(output),
    :indent => 2)
  eval(default_template.source, binding, default_template.path)
end)
(Also, in rails 3, render :text with a Proc doesn’t call the Proc at
all; rails renders it via to_str instead.)
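The mechanism that does reliably stream under Rack is a response body
that responds to #each and yields chunks as they’re produced, so the
server can flush chunk by chunk. A minimal sketch outside Rails (the
class name, tag names, and row data are all made up; in rails 3 the
rough equivalent is assigning such an object to self.response_body,
though middleware such as Rack::ContentLength or ETag generation can
still force the whole response to be buffered):

```ruby
# A Rack-style streaming body: an object whose #each yields XML chunks
# one at a time rather than building the whole document up front.
# XmlStream and the row data are illustrative, not real app code.
class XmlStream
  def initialize(rows)
    @rows = rows
  end

  def each
    yield "<grants>\n"
    @rows.each do |row|
      yield "  <grant id=\"#{row[:id]}\">#{row[:title]}</grant>\n"
    end
    yield "</grants>\n"
  end
end

# A bare Rack app using it; the server iterates body.each and can
# flush after every yielded chunk.
app = lambda do |env|
  body = XmlStream.new([{ :id => 1, :title => "First" },
                        { :id => 2, :title => "Second" }])
  [200, { "Content-Type" => "application/xml" }, body]
end

status, headers, body = app.call({})
chunks = []
body.each { |c| chunks << c }
```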
This particular issue I can certainly work around but it’s
disappointing if it’s true that there’s no way in rails to stream
output to the browser for large pages. And particularly disappointing
if PHP/Symfony can outgun rails for streaming. I’ve been using rails
since 2006 and most requests have fairly small responses so maybe the
answer is to defer to a different technology for streaming larger
files. But it seems like there should be a good solution for
streaming data and flushing the output stream.
Any help is greatly appreciated!
Eric