Problem serving files which start with newlines


#1

I’m working on a Rails app that stores files in a database and allows
users to download them again. But when a file starts with newlines, they
get lost in the process and the browser marks the download as failed
because the received file size is smaller than the expected file size (the
difference is exactly the number of missing newlines).

To reproduce this, place something like this in a controller:

    def foo
      send_data "\nline1\nline2|"
    end

and the downloaded file will contain only “line1\nline2|”.

I think I tracked the problem down to webrick_server.rb around line 115.
A StringIO object is created, filled with the data and then this is
split into header and body by calling ‘extract_header_and_body’. Before
this call the newlines are still there and after it they are missing.
‘extract_header_and_body’ uses ‘*data.split(/^[\xd\xa]+/on, 2)’ to split
up header and body, and that regular expression seems to swallow not only
the one newline separating header and body but also the newlines at the
start of the body, which contains the file data. I put the relevant code
at the end of this post, hopefully making clear what I mean.
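
The effect can be reproduced outside of Rails with a few lines of plain Ruby. This is only a minimal sketch: the response string is made up for illustration, and the character class is written as [\r\n], which should match the same two characters as [\xd\xa]. The o and n flags are left out because, as far as I can tell, they don't change what the pattern matches on plain ASCII data.

```ruby
# Hypothetical CGI-style output: one header line, the blank separator line,
# then a body that itself starts with a newline.
raw = "Content-Type: text/plain\r\n\r\n\nline1\nline2|"

# Same kind of split as in extract_header_and_body, with [\xd\xa] spelled
# as [\r\n]. The "+" lets the match run across the separator AND the
# leading newline of the body.
raw_header, body = raw.split(/^[\r\n]+/, 2)

p raw_header  # the header line, still with its trailing "\r\n"
p body        # the body, without its leading "\n"
```

Because the character class is repeated with “+”, the match does not stop at the blank line but continues through every following line break, which is consistent with the missing bytes.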

  1. Having started with Rails and Ruby only two days ago, I’m not that
    good with the syntax and all, so is my analysis correct or did I get
    something wrong?
  2. Is ‘send_data’ the proper way of doing what I want? (I’m quite sure
    it is, but better safe (and ask) than sorry.)
  3. How can I avoid/work around this?
  4. Since this is in webrick_server.rb, am I right to assume that it’s a
    problem in the WEBrick web server and that it will go away when using
    e.g. Apache?

    def handle_dispatch(req, res, origin = nil) #:nodoc:
      data = StringIO.new
      Dispatcher.dispatch(
        CGI.new("query", create_env_table(req, origin),
                StringIO.new(req.body || "")),
        ActionController::CgiRequest::DEFAULT_SESSION_OPTIONS,
        data
      )

      print data.string   # <- linefeeds still there
      header, body = extract_header_and_body(data)
      print body          # <- linefeeds gone
      # [...]
    end
[…]
    def extract_header_and_body(data)
      data.rewind
      data = data.read

      raw_header, body = *data.split(/^[\xd\xa]+/on, 2)   # <- probably the problem
      header = WEBrick::HTTPUtils::parse_header(raw_header)

      return header, body
    end
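
One possible direction for a workaround, as an untested sketch rather than a fix verified against Rails: split only at the first blank line instead of at a run of line breaks. The method name split_header_and_body and the sample string are hypothetical and not part of webrick_server.rb.

```ruby
# Hypothetical replacement for the split in extract_header_and_body:
# cut only at the first blank line (CRLF CRLF or LF LF) instead of
# consuming every consecutive line break.
def split_header_and_body(data)
  raw_header, body = data.split(/\r?\n\r?\n/, 2)
  [raw_header, body || ""]  # fall back to an empty body if there is none
end

# Made-up response: header, blank separator line, body starting with "\n".
raw = "Content-Type: text/plain\r\n\r\n\nline1\nline2|"
raw_header, body = split_header_and_body(raw)

p raw_header  # header text, without the blank separator line
p body        # body with its leading "\n" intact
```

Note that with this pattern the header part comes back without its final line terminator. Whether WEBrick::HTTPUtils::parse_header accepts that is not checked here.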