On Wed, May 2, 2012 at 1:31 PM, Lukas T. [email protected] wrote:
> It's about stale data in the client. If the HTTP client requests an exact
> byte range, then that particular byte range was chosen for a reason, like
> the moov atom in MP4 files, the header of a PDF file, or similar things
> (depending on the content type). Here is the problem: when the client reads
> the first bytes of the PDF today, and the user scrolls down tomorrow (Adobe
> Reader makes heavy use of range requests, iirc), the PDF on the server needs
> to be exactly the same (bit for bit). If it's not, the byte-range request
> must not succeed, otherwise the application will get corrupt data (how could
> the byte offsets still match yesterday's file if the content changed on the
> server?). This is the reason why the HTTP server needs to validate the
> client-side cache with things like filemtime. If we can't validate the
> client cache, we can't serve 206 Partial Content. In the case of dynamic
> content we have no way to do this (theoretically it would be doable with
> ETag strong validation, but nginx doesn't support it and your application
> surely doesn't either).
Yeah, this is a reason why it shouldn’t be done in most cases…
technically it’s still feasible, but I guess I’ll accept “shouldn’t”
as “can’t”. We manage to get around this kind of situation by using
checksums to validate the entire HTTP body once it’s been delivered in
its entirety. The checksums are originally for something else
entirely, but that’s a different story.
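For the record, here's a minimal Rack-flavoured sketch of the validation Lukas
describes, mostly to make the "technically feasible" part concrete. The
current_etag_for helper is hypothetical; it stands in for whatever strong
validator the application could compute for the exact representation it is
about to serve (which, for dynamic content, is of course the hard part):

class StrictRangeGate
  def initialize(app)
    @app = app
  end

  def call(env)
    if env['HTTP_RANGE'] && env['HTTP_IF_RANGE'] != current_etag_for(env)
      # The validators differ (or the client sent none), so we can't prove the
      # client's byte offsets still line up with the current representation.
      # Drop the Range header and let the app answer 200 with the full body
      # instead of a possibly corrupt 206.
      env.delete('HTTP_RANGE')
    end
    @app.call(env)
  end

  private

  # Hypothetical: compute a strong ETag for the representation this request
  # would produce. Returning nil means "never honour Range".
  def current_etag_for(env)
    nil
  end
end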
> Iirc (please correct me if I'm wrong), when configured as a caching reverse
> proxy, nginx serves a 206 only when the object is already in the local nginx
> cache. If it's not there, the full file will be served.
I don’t think nginx as we have it configured would qualify as a
caching reverse proxy - it’s a pretty standard nginx + passenger
install. I did try it with a static file and Range requests worked as
expected.
> Also read the HTTP/1.1 spec in [1].
I’ve read it, and am using it for reference, but I don’t see anything
that specifically addresses the question of dynamic content.
> Can you tell us more about your use case? Is your dynamic content really that
> big? Maybe you are approaching this from the wrong side; x-accel-redirect can
> probably help here, as Ensiferous already posted.
Yeah, a bit of context would probably help here. The HTTP client in
this case is (always) a very particular embedded device with limited
resources and a constrained operating environment. A response at the
extreme end of the scale could be 200 K, and this apparently causes
the client to croak (for reasons that are a little unclear to me). It
was suggested that breaking the response into smaller pieces might solve
the issue. This is just one avenue we're
exploring.
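If we do end up trying the x-accel-redirect route, my rough understanding of
it is something like the following; the /protected/ location and the file path
are made up, and nginx would need a matching "internal" location pointing at
wherever the app writes the generated payload:

class AccelRedirectApp
  def call(env)
    # Hypothetical: the app renders its large response to a file first, then
    # hands nginx the internal path. nginx streams the file itself (and will
    # answer Range requests against it), so the app never buffers the body.
    [200,
     { 'Content-Type'     => 'application/octet-stream',
       'X-Accel-Redirect' => '/protected/responses/payload.bin' },
     []]
  end
end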
For the moment, I did get this to work by adapting the Rack middleware
Racknga::Middleware::Range I found here:
http://groonga.rubyforge.org/. It has yet to be seen whether this
will solve the client crashing issues, so for now I’m going to leave
it.
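In case it helps anyone else, the core idea of that middleware, greatly
simplified (this is my own sketch, not the Racknga code, and it ignores
If-Range, suffix ranges and multi-range requests):

class SimpleRangeSlicer
  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)
    range = env['HTTP_RANGE']
    return [status, headers, body] unless status == 200 && range =~ /\Abytes=(\d+)-(\d*)\z/

    # Buffer the whole dynamic response so we know its real length.
    full = ''
    body.each { |chunk| full << chunk }
    body.close if body.respond_to?(:close)

    first = Regexp.last_match(1).to_i
    last  = Regexp.last_match(2).empty? ? full.bytesize - 1 : Regexp.last_match(2).to_i
    if first >= full.bytesize
      return [416, { 'Content-Range' => "bytes */#{full.bytesize}" }, []]
    end

    slice = full.byteslice(first, last - first + 1)
    headers = headers.merge(
      'Content-Range'  => "bytes #{first}-#{first + slice.bytesize - 1}/#{full.bytesize}",
      'Content-Length' => slice.bytesize.to_s,
      'Accept-Ranges'  => 'bytes'
    )
    [206, headers, [slice]]
  end
end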
Thanks for your response!