net/http.rb, patch, possible bug, test problem

Posted by Hugh Sasse (Guest)
on 04.10.2007 21:52
(Received via mailing list)
I have tried to patch 1.9 lib/net/http.rb to encourage the use of
compression by default.  Most ruby applications using the HTTP library
just use the methods as given, and assume they do the right thing
(which is fair enough).  However, we could be kinder to the serving
sites by using less bandwidth if we request compression by default.
I have written the patch below, which was originally done for 1.8
[in case this all sounds familiar :-)] which adds a header:
  Accept-Encoding: deflate;q=1.0,identity;q=0.5
unless there is already an Accept-Encoding header defined.
It does this at the time the request is despatched.
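
The "add it only if the caller has not already set one" rule can be sketched on a plain Hash of headers. This is only an illustration: the helper name with_default_accept_encoding and the constant are made up for this example and are not part of net/http. The lookup is done case-insensitively because HTTP field names are case-insensitive.

```ruby
# Hypothetical helper, not part of net/http: returns a copy of +headers+
# with a default Accept-Encoding added unless the caller already set one.
DEFAULT_ACCEPT_ENCODING = 'deflate;q=1.0,identity;q=0.5'

def with_default_accept_encoding(headers)
  already_set = headers.keys.any? { |name| name.to_s.downcase == 'accept-encoding' }
  return headers.dup if already_set

  headers.merge('Accept-Encoding' => DEFAULT_ACCEPT_ENCODING)
end

# The default is added here...
p with_default_accept_encoding('User-Agent' => 'demo')
# ...but a caller-supplied value is left alone, whatever its capitalisation.
p with_default_accept_encoding('accept-encoding' => 'gzip')
```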

However, I am having problems proving that this is working, since even
the www.w3.org server doesn't seem to set Content-Encoding: deflate
as a result.  I'd appreciate help in getting this to work, iff it is
agreed that this is a good thing.  I'm pretty confident that the header
is being set correctly, see the (now commented out) each_header statement
I added.

I've also found that there is a variable which seems to be used only
once:

/scratch/hgs/local/lib/ruby/1.9/net/http.rb:1137: warning: instance variable @sspi_enabled not initialized

in the sspi_auth?() method, but not in sspi_auth(), which may lead to a bug.
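
A minimal sketch of the kind of change that would avoid such a warning, assuming the flag is meant to default to false. The class below is a made-up stand-in, not the real Net::HTTP code; only the variable and predicate names are taken from the warning above, and enable_sspi is invented for the example.

```ruby
# Stand-in class for illustration only; the real net/http code differs.
class SspiDemo
  def initialize
    # Setting the variable up front means sspi_auth? never reads an
    # uninitialised instance variable.
    @sspi_enabled = false
  end

  # Hypothetical setter, only here so the flag can change in the demo.
  def enable_sspi
    @sspi_enabled = true
  end

  def sspi_auth?
    @sspi_enabled
  end
end

demo = SspiDemo.new
p demo.sspi_auth?
demo.enable_sspi
p demo.sspi_auth?
```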

All this is against the snapshot grabbed earlier today.
brains hgs 139 %> ls -ld ruby-1.9-today.tar.gz ; md5sum ruby-1.9-today.tar.gz
-rw-r--r--   1 hgs      staff    5482115 Oct  3 20:03 ruby-1.9-today.tar.gz
49229dbe30734dee86db90d6c6994520  ruby-1.9-today.tar.gz
brains hgs 140 %>

        HTH
        Hugh

--- http.rb.orig  2007-09-24 08:55:41.000000000 +0100
+++ http.rb  2007-10-04 20:17:21.128839000 +0100
@@ -27,6 +27,7 @@

 require 'net/protocol'
 require 'uri'
+require 'zlib'

 module Net   #:nodoc:

@@ -287,6 +288,7 @@
     Revision = %q$Revision: 13501 $.split[1]
     HTTPVersion = '1.1'
     @newimpl = true
+    @compression = nil  # If compression is being used, and what type.
     # :startdoc:

     # Turns on net/http 1.2 (ruby 1.8) features.
@@ -1046,6 +1048,22 @@
     # This method never raises Net::* exceptions.
     #
     def request(req, body = nil, &block)  # :yield: +response+
+      # Need to handle compression transparently, for efficiency
+      # reasons.  See:
+      # http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.3
+      # which describes the Accept-Encoding header, and articles such as:
+      # http://www.infoworld.com/article/04/07/30/31OPconnection_1.html
+      # for why this is important.
+      # Note, deflate is easier for us to do without a similar interface
+      # to g[un]zip.
+      if Get === req
+        unless req.key?("Accept-Encoding")
+          req.add_field("Accept-Encoding", "deflate;q=1.0,identity;q=0.5")
+          req.each_header{|a,b| printf "%20s: %-40.40s\n", a,b}
+          @compression = "zlib"
+        end
+      end
+
       unless started?
         start {
           req['connection'] ||= 'close'
@@ -1072,9 +1090,13 @@
       begin
         res = HTTPResponse.read_new(@socket)
       end while res.kind_of?(HTTPContinue)
-      res.reading_body(@socket, req.response_body_permitted?) {
+      res.reading_body(@socket, req.response_body_permitted?) do
+        if @compression and (res["Content-Encoding"] =~ /deflate/i)
+           res.inflate!
+           @compression=nil
+        end
         yield res if block_given?
-      }
+      end
       end_transport req, res
       res
     end
@@ -2250,6 +2272,16 @@
       @body
     end

+    # Inflates the body.  Use if the body was compressed with
+    # deflate [RFC1951].
+    #--
+    # Marked as dangerous, but is there a way to test the body is
+    # in a deflated state?  So we don't do it [twice] by mistake.
+    #++
+    def inflate!()
+      @body=Zlib::Inflate.inflate(@body);
+    end
+
     # Returns the entity body.
     #
     # Calling this method a second or subsequent time will return the
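
As a standalone sketch of the inflation step, outside of net/http: HTTP's "deflate" coding is defined as the zlib format (RFC 1950) wrapped around a deflate stream (RFC 1951), but some servers have been known to send the bare deflate stream instead. The helper below, whose name is invented for this example, tries the zlib format first and falls back to raw deflate. It is an illustration under those assumptions, not what the patch above does.

```ruby
require 'zlib'

# Hypothetical helper: inflate a body sent with "Content-Encoding: deflate".
def inflate_http_deflate(data)
  # Normal case: zlib header and checksum around the deflate stream.
  Zlib::Inflate.inflate(data)
rescue Zlib::DataError
  # Fallback: a negative window size tells zlib to expect a raw deflate
  # stream with no zlib header.
  raw = Zlib::Inflate.new(-Zlib::MAX_WBITS)
  begin
    raw.inflate(data)
  ensure
    raw.close
  end
end

text = 'hello, compression ' * 20
zlib_body = Zlib::Deflate.deflate(text)

# Build a raw deflate stream for comparison.
deflater = Zlib::Deflate.new(Zlib::DEFAULT_COMPRESSION, -Zlib::MAX_WBITS)
raw_body = deflater.deflate(text, Zlib::FINISH)
deflater.close

puts inflate_http_deflate(zlib_body) == text
puts inflate_http_deflate(raw_body) == text
```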
Posted by Martin Duerst (Guest)
on 05.10.2007 23:36
(Received via mailing list)
At 04:52 07/10/05, Hugh Sasse wrote:

>However, I am having problems proving that this is working, since even
>the www.w3.org server doesn't seem to set Content-Encoding: deflate
>as a result.

I worked on the W3C team for over 7 years. I don't remember our
Web site using compression.

>I'd appreciate help in getting this to work, iff it is
>agreed that this is a good thing.  I'm pretty confident that the header
>is being set correctly, see the (now commented out) each_header statement

Please do not assume that every server will automatically use compression
if you ask for it. Indeed, most servers won't. In order to support
compression, a server either has to compress on the fly (expensive)
or has to cache a compressed version in parallel (tricky). So I suggest
you test against a server where you know it's set up to support compression.

Sending out the header may still be a good thing to take advantage of
compression in those cases where it is available, but it has to be
weighed against the probability of compression actually being used.

Regards,    Martin.


#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst@it.aoyama.ac.jp
Posted by Eric Hodel (Guest)
on 06.10.2007 00:36
(Received via mailing list)
On Oct 4, 2007, at 21:33 , Martin Duerst wrote:
> At 04:52 07/10/05, Hugh Sasse wrote:
>> I'd appreciate help in getting this to work, iff it is
>> agreed that this is a good thing.  I'm pretty confident that the header
>> is being set correctly, see the (now commented out) each_header statement
>
> Please do not assume that every server will automatically use compression
> if you ask for it.

Hugh's patch doesn't assume.

> Indeed, most servers won't. In order to support compression, a  
> server either has to compress on the fly (expensive) or has to  
> cache a compressed version in parallel (tricky).

Net::HTTP is a client, not a server, so what browsers implement is
much more relevant in deciding behavior.

Netscape Navigator 3+, Mozilla, Firefox, Internet Explorer 4+, Opera 5+,
and Safari all ask for compressed content via Accept-Encoding.  I
think that makes a valid case for Net::HTTP also asking for
compressed responses.

> So I suggest you test against a server where you know it's set up  
> to support compression.

I would be surprised if anyone would send a patch to this list
without testing it first, especially Hugh.

> Sending out the header may still be a good thing to take advantage of
> compression in those cases where it is available, but it has to be
> weighed against the probability of compression actually being used.

The only way to get compressed content is to ask for it.
Posted by Martin Duerst (Guest)
on 06.10.2007 05:08
(Received via mailing list)
At 14:30 07/10/05, Eric Hodel wrote:
>> if you ask for it.
>
>Hugh's patch doesn't assume.

Of course not. But his mail was written as if it did. At least that's
what I read from his sentence:

"However, I am having problems proving that this is working, since even
the www.w3.org server doesn't seem to set Content-Encoding: deflate
as a result."

I'm sorry if that was a misunderstanding.

>> Indeed, most servers won't. In order to support compression, a  
>> server either has to compress on the fly (expensive) or has to  
>> cache a compressed version in parallel (tricky).
>
>Net::HTTP is a client, not a server, so what browsers implement is  
>much more relevant in deciding behavior.
>
>Netscape Navigator 3+, Mozilla, Firefox, Internet Explorer 4+, Opera 5+,
>and Safari all ask for compressed content via Accept-Encoding.  I
>think that makes a valid case for Net::HTTP also asking for  
>compressed responses.

Yes, that's a good point.

>> So I suggest you test against a server where you know it's set up  
>> to support compression.
>
>I would be surprised if anyone would send a patch to this list  
>without testing it first, especially Hugh.

Well, he was asking for help.

>> Sending out the header may still be a good thing to take advantage of
>> compression in those cases where it is available, but it has to be
>> weighed against the probability of compression actually being used.
>
>The only way to get compressed content is to ask for it.

Technically, not true. The HTTP spec says:
"If no Accept-Encoding field is present in a request, the server MAY
   assume that the client will accept any content coding."
Some really big files might always be sent with some compression.
But in most cases, adding an encoding header won't hurt. BTW,
any particular reason why we don't also include gzip and compress?

Regards,   Martin.


Posted by Hugh Sasse (Guest)
on 06.10.2007 05:54
(Received via mailing list)
On Fri, 5 Oct 2007, Martin Duerst wrote:

> At 04:52 07/10/05, Hugh Sasse wrote:
> 
> >However, I am having problems proving that this is working, since even
> >the www.w3.org server doesn't seem to set Content-Encoding: deflate
> >as a result.
> 
> I worked on the W3C team for over 7 years. I don't remember our
> Web site using compression.

Oh, OK.  I thought they would demonstrate full functionality in their
servers, as an example of how things should work.  Thanks.
> 
> >I'd appreciate help in getting this to work, iff it is
> >agreed that this is a good thing.  I'm pretty confident that the header
> >is being set correctly, see the (now commented out) each_header statement
> 
> Please do not assume that every server will automatically use compression
> if you ask for it. Indeed, most servers won't. In order to support

Agreed.  Many won't, but I'm having a job finding any that do,
despite people saying that the lack of compression and of ETag support
is making RSS (for example) more painful than it need be for servers.

> compression, a server either has to compress on the fly (expensive)
> or has to cache a compressed version in parallel (tricky). So I suggest

Yes, I've not figured out how to get deflate working correctly with
Apache, or I'd test against the server on the Unix machine I look
after.  I also looked at Webrick, but at the moment can't see how to
add this to webrick, so I'd like to start testing against known-to-be-
correct-and-functional servers.

> you test against a server where you know it's set up to support compression.

I've tried Google, BBC, Microsoft, and one or two places that serve blogs
discussing this sort of thing.  Is there a way to query a server to ask
it whether it supports compression, so I can tell if my implementation is
wrong as opposed to them not supporting it?

> 
> Sending out the header may still be a good thing to take advantage of
> compression in those cases where it is available, but it has to be
> weighed against the probability of compression actually being used.

The more requests for it that get logged, the greater the (slight)
pressure to implement it in servers.
> 
> Regards,    Martin.
> 
        Thank you,
        Hugh
Posted by Hugh Sasse (Guest)
on 06.10.2007 06:18
(Received via mailing list)
On Fri, 5 Oct 2007, Martin Duerst wrote:

> >> compression
> >> if you ask for it.
> >
> >Hugh's patch doesn't assume.
> 
> Of course not. But his mail was written as if it did. At least that's
> what I read from his sentence:
> 
> "However, I am having problems proving that this is working, since even
> the www.w3.org server doesn't seem to set Content-Encoding: deflate
> as a result."

I'm trying to distinguish between a bug in my implementation that I've
failed to see, and the server not implementing it.  So until I find
a server known to do this, I can't be certain what is happening.
> 
> I'm sorry if that was a misunderstanding.
> 
That's OK.  I have sometimes posted untested code when it is at the
"wet paint" stage, so people wiser than me can say "You REALLY don't
want to do that" before I spend hours testing it.
        [....]
> >> So I suggest you test against a server where you know it's set up  
> >> to support compression.
> >
> >I would be surprised if anyone would send a patch to this list  
> >without testing it first, especially Hugh.
> 
> Well, he was asking for help,

I've tested it, and found one typo which I fixed [s/Enocding/Encoding/;]
but I'm a bit stuck, and would appreciate more eyes over the code.
> 
> >> Sending out the header may still be a good thing to take advantage of
> >> compression in those cases where it is available, but it has to be
> >> weighed against the probability of compression actually being used.
> >
> >The only way to get compressed content is to ask for it.
> 
> Technically, not true. The HTTP spec says:
> "If no Accept-Encoding field is present in a request, the server MAY
>    assume that the client will accept any content coding."

Yes, that's right. Not really in the spirit of "generous in what
you accept, conservative in what you send", but there it is.

> Some really big files might always be sent with some compression.
> But in most cases, adding an encoding header won't hurt. BTW,
> any particular reason why we don't also include gzip and compress?

Limitations in my understanding of the library: gzip seems to
require setup at the beginning and wrap up of some kind at the end,
and at the moment I'm not absolutely sure how to do that correctly.
Also, I think it needs a stream, but I've not looked at it recently.
Compress is another matter: I don't think Ruby supports that out of
the box, and Linux implementations are unlikely to because of the
Unisys patent issue, only recently closed [I think].  I wanted this
to work out of the box with Ruby, so can't rely on external
programs.  So I thought it best to start with Deflate.  The
@compression variable I added was designed deliberately to allow
future expansion to {compress, gzip, bzip2, ...}, that's why it's
a string (though I suppose it could equally be a Symbol).
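
The "room for future expansion" idea can be sketched as a lookup table from content-coding name to decoder. Everything named here (DECODERS, decode_body) is invented for illustration and is not part of the patch or of net/http. The gzip entry wraps the body in a StringIO because Zlib::GzipReader reads from an IO-like object, which is the "needs a stream" point mentioned above. compress is deliberately left out since Ruby's zlib binding does not handle it.

```ruby
require 'stringio'
require 'zlib'

# Illustrative table: maps a Content-Encoding token to a callable that
# decodes a body String.
DECODERS = {
  'identity' => ->(body) { body },
  'deflate'  => ->(body) { Zlib::Inflate.inflate(body) },
  'gzip'     => ->(body) { Zlib::GzipReader.new(StringIO.new(body)).read }
}.freeze

# Decode +body+ according to the Content-Encoding header value (nil means
# the header was absent, which is treated as identity).
def decode_body(body, content_encoding)
  token = (content_encoding || 'identity').strip.downcase
  decoder = DECODERS[token]
  raise ArgumentError, "unsupported content coding: #{token}" unless decoder

  decoder.call(body)
end

puts decode_body(Zlib::Deflate.deflate('deflated text'), 'deflate')
puts decode_body('plain text', nil)
```

Adding another coding later would then only mean adding one more entry to the table.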

> 
> Regards,   Martin.
> 
        Thank you
        Hugh
Posted by Martin Duerst (Guest)
on 06.10.2007 06:36
(Received via mailing list)
At 19:50 07/10/05, Hugh Sasse wrote:
>
>Oh, OK.  I thought they would demonstrate full functionality in their
>servers, as an example of how things should work.  Thanks. 

We tried to do a lot of this, but we didn't get everything done.
Many of the pages we had were rather small (they didn't contain
lots of useless <FONT> tags and other bloat to begin with,...),
which made compression less of an issue. Also, our server was
very much just a file system, you committed something via CVS,
more often just e.g. from Amaya via PUT. Doing compression on
the fly or cached would have been a high overhead.
[all this is still pretty much the same, as far as I know]


>Yes, I've not figured out how to get deflate working correctly with 
>Apache, or I'd test against the server on the Unix machine I look
>after.

I haven't done this in a while, and I only did it for language,
not for encoding, but here are some pointers that might be helpful.
(you can set things up statically for testing, and this fits the
language situation, because dynamic/automatic translation of
course isn't ready for prime time).

http://www.w3.org/International/questions/qa-apache-lang-neg
(negotiation for encoding works exactly the same)
http://httpd.apache.org/docs/1.3/mod/mod_mime.html#addencoding


>> you test against a server where you know it's set up to support compression.
>
>I've tried Google, BBC, Microsoft, and one or two places that serve blogs
>discussing this sort of thing.  Is there a way to query a server to ask
>it whether it supports compression, so I can tell if my implementation is
>wrong as opposed to them not supporting it?

There are some negotiation-related headers, such as Vary,
that you might want to look for. But I haven't had time to
re-read that negotiation stuff just now.

Regards,    Martin.


Posted by Hugh Sasse (Guest)
on 06.10.2007 07:16
(Received via mailing list)
On Fri, 5 Oct 2007, Martin Duerst wrote:

> >> Web site using compression.
> >
> >Oh, OK.  I thought they would demonstrate full functionality in their
> >servers, as an example of how things should work.  Thanks. 
> 
> We tried to do a lot of this, but we didn't get everything done.
> Many of the pages we had were rather small (they didn't contain
> lots of useless <FONT> tags and other bloat to begin with,...),

  :-)  Most of my html is pretty basic.

> which made compression less of an issue. Also, our server was
> very much just a file system, you committed something via CVS,
> more often just e.g. from Amaya via PUT. Doing compression on
> the fly or cached would have been a high overhead.
> [all this is still pretty much the same, as far as I know]
> 
Thank you.  I've used Amaya as a browser, not really edited with it.
> 
> http://www.w3.org/International/questions/qa-apache-lang-neg
> (negotiation for encoding works exactly the same)
> http://httpd.apache.org/docs/1.3/mod/mod_mime.html#addencoding
> 

I don't normally look after the Apache server, my colleague does
that.  Can anyone tell me how the AddType and AddEncoding directives
interact for Apache2?  The httpd.conf says to use AddType if you
don't have AddEncoding, but not what will happen if you have both.
So before I munge our server... :-)
> 
> >> you test against a server where you know it's set up to support compression.
> >
> >I've tried Google, BBC, Microsoft, and one or two places that serve blogs
> >discussing this sort of thing.  Is there a way to query a server to ask
> >it whether it supports compression, so I can tell if my implementation is
> >wrong as opposed to them not supporting it?
> 
> There are some negotiation-related headers, such as Vary,

Yes, I saw a Vary header.  I'll have to look that up.

> that you might want to look for. But I haven't had time to
> re-read that negotiation stuff just now.
> 
> Regards,    Martin.

        Thank you,
        Hugh
Posted by Hugh Sasse (Guest)
on 06.10.2007 08:55
(Received via mailing list)
On Fri, 5 Oct 2007, Martin Duerst wrote:

> I haven't done this in a while, and I only did it for language,
> not for encoding, but here are some pointers that might be helpful.
> (you can set things up statically for testing, and this fits the
> language situation, because dynamic/automatic translation of
> course isn't ready for prime time).
> 
> http://www.w3.org/International/questions/qa-apache-lang-neg
> (negotiation for encoding works exactly the same)
> http://httpd.apache.org/docs/1.3/mod/mod_mime.html#addencoding
> 

I have a .htaccess file now:


AddEncoding x-deflate .zz

But this doesn't seem to be doing the trick.  It is enabled on our server
and the file is 644 owned by me.

The URLs are:

http://www.eng.cse.dmu.ac.uk/~hgs/

which will access index.html by default, but I also have

http://www.eng.cse.dmu.ac.uk/~hgs/index.html.zz

which is the deflated version

and

http://www.eng.cse.dmu.ac.uk/~hgs/.htaccess

though I get 403 Forbidden if I try to access that.  But only Apache needs
to see that, not people outside?

        Hugh
Posted by Christopher Boumenot (Guest)
on 06.10.2007 09:08
(Received via mailing list)
[snip]

 > Limitations in my understanding of the library: gzip seems to
 > require setup at the beginning and wrap up of some kind at the end,
 > and at the moment I'm not absolutely sure how to do that correctly.
 > Also, I think it needs a stream, but I've not looked at it recently.
 > Compress is another matter: I don't think Ruby supports that out of
 > the box, and Linux implementations are unlikely to because of the
 > Unisys patent issue, only recently closed [I think].  I wanted this
 > to work out of the box with Ruby, so can't rely on external
 > programs.  So I thought it best to start with Deflate.  The
 > @compression variable I added was designed deliberately to allow
 > future expansion to {compress, gzip, bzip2, ...}, that's why it's
 > a string (though I suppose it could equally be a Symbol).

I had a similar desire - do not download more data than necessary.  I
ended up using gzip by just re-opening the HTTP(S) class.

-----

require 'net/https'
require 'stringio'
require 'zlib'

module Net
     class HTTP
         alias original_get get
         def get(path, initheader = nil, dest = nil, &block)
             headers = { 'Accept-Encoding' => 'gzip' }
             headers.merge!(initheader) unless initheader.nil?

             return original_get(path, headers, dest, &block)
         end
     end

     class HTTPResponse
         def gzip?
             return false unless @header.has_key?('content-encoding')
             return @header['content-encoding'].to_s == 'gzip'
         end

         alias original_body body
         def body
             data = original_body
             return (gzip?) ? Zlib::GzipReader.new(StringIO.new(data)).read : data
         end
     end
end
Posted by Hugh Sasse (Guest)
on 06.10.2007 09:29
(Received via mailing list)
On Fri, 5 Oct 2007, Christopher Boumenot wrote:

> [snip]
> 
> > Limitations in my understanding of the library: gzip seems to
> > require setup at the beginning and wrap up of some kind at the end,
        [...]
> 
> I had a similar desire - do not download more data than necessary.  I ended up
> using gzip by just re-opening the HTTP(S) class.
> 
Thank you, I'll look at this code in due course, but that looks relatively
easy to do.
> -----
> 
> require 'net/https'
> require 'stringio'
> require 'zlib'
> 
> module Net
>     class HTTP
>         alias original_get get

Yes, this looks like a better place for this.

>             return false unless @header.has_key?('content-encoding')
>             return @header['content-encoding'].to_s == 'gzip'
>         end

Good idea, I think I'll modify that to def compression? and return a string.
> 
>         alias original_body body
>         def body
>             data = original_body
>             return (gzip?) ? Zlib::GzipReader.new(StringIO.new(data)).read : data

OK, that looks doable.  Any idea if GzipReader will read compressed (i.e.,
.Z) files as gzip itself will?
>         end
>     end
> end
> 

Could you tell me of >= 1 servers that do make use of this then, so I can
test the code when I've modified this again?

        Hugh
Posted by Tanaka Akira (Guest)
on 06.10.2007 10:39
(Received via mailing list)
In article <fe5g6i$2cd$1@sea.gmane.org>,
  Christopher Boumenot <boumenot@gmail.com> writes:

> Zlib::GzipReader.new(StringIO.new(data)).read : data
>          end
>      end

It seems that foo.tar.gz is automatically decompressed if
Apache is configured as follows.

  AddEncoding gzip .gz

It is confusing, especially if the downloaded result is stored
in a file whose name is "foo.tar.gz".
Posted by Hugh Sasse (Guest)
on 06.10.2007 11:16
(Received via mailing list)
On Sat, 6 Oct 2007, Tanaka Akira wrote:

> In article <fe5g6i$2cd$1@sea.gmane.org>,
>   Christopher Boumenot <boumenot@gmail.com> writes:
> 
> >      class HTTPResponse
> >          def gzip?
        [The content-encoding is gzip]
> >          end
> >
> >          alias original_body body
> >          def body
        [ungzip the content if the content-encoding header says so]
> >          end
> >      end
> 
> It seems that foo.tar.gz is automatically decompressed if
> Apache is configured as follows.
> 
>   AddEncoding gzip .gz
> 
> It is confusing, especially if the downloaded result is stored
> in a file whose name is "foo.tar.gz".

Agreed.  I think this must be a consequence of how Apache seems to
implement this facility.  Basically, a file with no extension is
used as a stem to search for the .gz version (etc) of the file, in
order to find an equivalent to send back, with "content-encoding"
set.  I think this is a bug in Apache, because if a gzipped file had
been requested, then that's a property of the file rather than how
it is encoded for transmission.   But I'm still on the learning
curve for this.  Are there any Apache experts out there who can tell
me whether I'm right, and would anyone be willing to file a bug with
the Apache implementors if I am?  Much better that it comes from
someone whom they know understands...

> -- 
> Tanaka Akira
> 
        Hugh
Posted by Christopher Boumenot (Guest)
on 06.10.2007 14:53
(Received via mailing list)
[snip]

> OK, that looks doable.  Any idea if GZipReader will read compressed (i.e., 
> .Z) files as gzip itself will? 

I do not think so.  Compress is a different format.

> Could you tell me of >= 1 servers that do make use of this then, so I can 
> test the code when I've modified this again?

I wrote the code for my smugmug library. So smugmug.com is one, and I
just checked msn.com, and cnn.com and they both support gzip.  Actually
there is a very easy way to find out especially if you have curl.

   $ curl -H 'Accept-Encoding: gzip' http://www.cnn.com | gunzip

If gunzip says "stdin: not in gzip format" you know the server does not
support gzip.  (I cannot get that trick to work for compress/uncompress.)
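
A rough Ruby equivalent of the curl check, as a sketch only. The method names gzip_data? and probe_gzip are invented here. It assumes that passing Accept-Encoding explicitly when building the request stops Net::HTTP from decoding the body itself, so the raw bytes can be inspected; treat that as an assumption to verify against your Ruby version. gzip_data? looks for the two gzip magic bytes, which is roughly what gunzip complains about when they are missing.

```ruby
require 'net/http'
require 'uri'
require 'zlib'

# gzip streams start with the two magic bytes 0x1f 0x8b.
def gzip_data?(bytes)
  bytes.getbyte(0) == 0x1f && bytes.getbyte(1) == 0x8b
end

# Fetches +url+ asking only for gzip and reports what came back.
# Not called here, since it needs network access.
def probe_gzip(url)
  uri = URI(url)
  # Assumption: supplying Accept-Encoding ourselves leaves the body undecoded.
  request = Net::HTTP::Get.new(uri, 'Accept-Encoding' => 'gzip')
  response = Net::HTTP.start(uri.host, uri.port,
                             use_ssl: uri.scheme == 'https') do |http|
    http.request(request)
  end
  { content_encoding: response['Content-Encoding'],
    vary: response['Vary'],
    gzip_body: gzip_data?(response.body.to_s) }
end

# Offline check of the magic-byte test.
puts gzip_data?(Zlib.gzip('some page content'))
puts gzip_data?('<html>not compressed</html>')
```

Calling probe_gzip with a URL such as the one in the curl line above would return the Content-Encoding and Vary headers alongside the magic-byte result, which helps separate "server does not compress" from "client mishandles the response".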


Christopher
Posted by Christopher Boumenot (Guest)
on 06.10.2007 15:06
(Received via mailing list)
[snip]

> someone whom they know understands...
Hopefully, I can clarify this.  Apache will compress files on the fly if
told to do so.  For example, you could add this to your httpd.conf file,
and it will compress all files with MIME type text/html, text/plain, and
text/xml assuming mod_deflate has been loaded.

AddOutputFilterByType DEFLATE text/html text/plain text/xml

If I were to fetch any files of that type, Apache would automatically
compress them using gzip if the HTTP header Accept-Encoding is set to gzip.

In my testing mod_deflate does not imply that deflate compression is
actually supported.  In fact, when I set Accept-Encoding to deflate
Apache just returns uncompressed data.


Christopher
Posted by Martin Duerst (Guest)
on 07.10.2007 02:33
(Received via mailing list)
At 22:50 07/10/05, Hugh Sasse wrote:
>> http://httpd.apache.org/docs/1.3/mod/mod_mime.html#addencoding
>The URLs are:
>
>http://www.eng.cse.dmu.ac.uk/~hgs/
>
>which will access index.html by default,

The default index file is a step before the content negotiation,
as far as I know, so in the end, the deflated file
should be served at this address, too. But better try
to work it out for http://www.eng.cse.dmu.ac.uk/~hgs/index.html.

>but I also have
>
>http://www.eng.cse.dmu.ac.uk/~hgs/index.html.zz
>
>which is the deflated version

I tried this and http://www.eng.cse.dmu.ac.uk/~hgs/index.html
with http://www.rexswain.com/cgi-bin/httpview.cgi.

From that and your mail, it looks like you haven't yet
activated content negotiation. Can you check that
MultiViews is enabled?
http://www.w3.org/International/questions/qa-apache-lang-neg
says it's the default, but it's often switched off because
it makes things slower. Also make sure that mod_negotiation
is switched on (and restart your server if you change that).

>http://www.eng.cse.dmu.ac.uk/~hgs/.htaccess
>
>though I get 403 Forbidden if I try to access that.  But only apache needs
>to see that, not people outside?

Correct, and it's usually forbidden to not reveal too much
about your server setup, for security reasons.


Regards,    Martin.


Posted by Martin Duerst (Guest)
on 07.10.2007 02:34
(Received via mailing list)
At 01:13 07/10/06, Hugh Sasse wrote:
>On Sat, 6 Oct 2007, Tanaka Akira wrote:

>> It seems that foo.tar.gz is automatically decompressed if
>> Apache is configured as follows.
>> 
>>   AddEncoding gzip .gz

Decompressed on the server side or on the client side?
On the server side, this could happen if the server got a
request that didn't include the gzip encoding, but didn't
have an unencoded version of the file. On the client side,
that would happen because transfer encoding is what it says,
an encoding used for transfer.

>> It is confusing, especially if the downloaded result is stored
>> in a file whose name is "foo.tar.gz".
>
>Agreed.  I think this must be a consequence of how Apache seems to
>implement this facility.  Basically, a file with no extension is
>used as a stem to search for the .gz version (etc) of the file, in
>order to find an equivalent to send back, with "content-encoding"
>set.

This is only one way to do it. Apache doesn't force you to do it that
way, but in many cases, it's quite convenient.

>I think this is a bug in Apache, because if a gzipped file had
>been requested, then that's a property of the file rather than how
>it is encoded for transmission.

You can't use the same file as part of a negotiation (with AddEncoding)
and independently (with AddType, as application/gzip (not registered)
or application/x-gzip or some such).

In the former case, you say that the encoding is part of the transfer,
and the file is just a precomputed version. Clients probably will unpack
it. Content-Type should be taken from the uncompressed version, and I
think Apache will automatically do that e.g. for foo.html.gz or so.

In the latter case, the file is just 'something compressed' which you
want to transfer as such. For .tar.gz, probably that makes more sense
than using a transfer encoding.

If you have both cases on your Web site, then you have to do some
fine-grained (per directory or per file) settings.

>But I'm still on the learning
>curve for this.  Are there any Apache experts out there who can tell
>me whether I'm right, and would anyone be willing to file a bug with
>the Apache implementors if I am?

I'm not sure I have understood the exact details of your situation,
but up to now, it doesn't look to me like a bug in Apache. If it
turns out to be a bug in Apache, I'll be glad to file one.

Regards,   Martin.


Posted by Hugh Sasse (Guest)
on 09.10.2007 13:56
(Received via mailing list)
On Sat, 6 Oct 2007, Martin Duerst wrote:

> >set.
> 
> This is only one way to do it. Apache doesn't force you to do it that
> way, but in many cases, it's quite convenient.

That's the way I found out how to do it -- i.e. it seems a popular
technique...
> and the file is just a precomputed version. Clients probably will unpack
> it. Content-Type should be taken from the uncompressed version, and I
> think Apache will automatically do that e.g. for foo.html.gz or so.
> 
> In the latter case, the file is just 'something compressed' which you
> want to transfer as such. For .tar.gz, probably that makes more sense
> than using a transfer encoding.
> 
> If you have both cases on your Web site, then you have to do some
> fine-grained (per directory or per file) settings.

So you can't compress a file in transit that is already compressed?
OK, you won't get much saving, but you will get some:

brains hgs 144 %> ls -ld ruby-1.9-today.tar.gz
-rw-r--r--   1 hgs      staff    5483942 Oct  7 20:01 ruby-1.9-today.tar.gz
brains hgs 145 %> cat ruby-1.9-today.tar.gz | gzip > ruby-1.9-today.tar.gz.gz
brains hgs 146 %> ls -ld ruby-1.9-tod*
-rw-r--r--   1 hgs      staff    5483942 Oct  7 20:01 ruby-1.9-today.tar.gz
-rw-r--r--   1 hgs      staff    5473563 Oct  9 12:36 ruby-1.9-today.tar.gz.gz
brains hgs 147 %> expr 5483942 - 5473563
10379
brains hgs 148 %>

That would be more significant for a larger file over a slow link, so it
makes sense to be able to distinguish between the four cases, as you could
if they were entirely orthogonal.
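
To put the saving from the listing above in proportion, here is the same arithmetic as a percentage. The two sizes are copied from the ls output; the variable names are just for this example.

```ruby
original     = 5_483_942  # bytes, ruby-1.9-today.tar.gz
recompressed = 5_473_563  # bytes, ruby-1.9-today.tar.gz.gz

saved   = original - recompressed
percent = (saved.to_f / original * 100).round(2)

puts "saved #{saved} bytes (#{percent}%)"
# prints: saved 10379 bytes (0.19%)
```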
> Regards,   Martin.
> 
        Hugh