Mongrel and proxying with Apache 2 mod_proxy

Hi all,

I’ve just spent some time going through trying to get mongrel & apache
2.0 working with mod_proxy. I thought I’d share what I’ve found in
case someone is going through the same exercise.

Of late, I’ve been using mongrel (lovely, btw) to run my rails apps.
Since I have a couple of different rails apps running on the machine,
I thought it’d be nice to set up a simple proxy to each of the apps so
folk don’t have to remember port numbers etc.

Since Apache 2.0 is already on the box and being used for other bits
as well, it made sense to set up mod_proxy to proxy the rails
requests.

This mostly worked without a hitch. I came to grief with the redirected
urls.

This surprised me. A lot. I’d used mod_proxy with other apps and
redirection was never an issue. Why now?

I googled and found that other people were running into similar
issues. Some resolved the issue by using mod_rewrite, which is not my
preferred solution. lighttpd is not currently under consideration.

Here’s what I’ve found. I’ll start with the 3 things that, as I
understand it, should happen during a proxied request. A proxy
service will need to perform at least the first item.

The second and third could be implemented by the application.
However, this makes it more difficult to have the application proxied
without the involvement of the developers of the application.

Ideally, the application should be blissfully unaware that it is being
proxied. :)

The Setup

I have a rails application on a super secret rails server. It is
addressable from inside at http://super.secret.com:3000. This URL is
not accessible to the outside world. After all, it is super secret.
DNS lookups will fail horribly on super.secret.com. (Yes, I know that
in the real world it’s registered … just pretend for the sake of
this example that it’s not. :) )

Now I want to make this available to the outside world with a proxy.
The outside world will be able to use this particular application via
the URL: http://coolapps.com/nice/

Expectation

People can run this new super secret application via the
http://coolapps.com/nice/ URL. No one should even be aware that the
actual super.secret.com:3000 URLs exist.

I would expect that:

  • All URLs work.
  • Redirects work. These are typically used by authentication pages.
  • Stylesheets, images and javascript referenced in the webpage with
    root-relative paths like /stylesheets/main.css get loaded properly.

What has to happen during a proxied request to meet the expectations?

Recall that any redirects (like those done during authentication) are
initiated by the server, passed back to the browser with a 302 code and
the browser then redirects to that new page. This new URL would also be
proxied.

Ditto with images, javascript and stylesheets. These are all requested
by the browser and need to be proxied as well.

  1. Take a request from the user (browser) and forward it off to the
    actual machine(s) running the application.

    ** The proxy service needs to at least provide this. **

  2. Make sure the http header URLs are re-written to reflect the, in
    our case, http://coolapps.com/nice/ URL space rather than the
    http://super.secret.com:3000 URL space. This is reflected in what we
    see in the browser URL bar.

This includes redirected URLs as well. A redirection code gets sent
back to the browser along with a http header URL and it is the
browser that then makes another request for the new URL we’re being
redirected to.

 ** This can be done by either the proxy service at coolapps.com
    or by the application itself. **

  3. Rewrite urls in the actual pages themselves. Why? URLs in our
    rails applications typically appear as “/profile/view” or
    “/images/edit/1”. Notice the leading slash in the “href” above.

If the URL in the browser URL bar, the base URL, were simply
http://coolapps.com/nice/, when we click on “/profile/view” we’d go to
http://coolapps.com/profile/view. Which is exactly where we don’t
want to be.

Our application fully resides under the coolapps.com/nice/ URL space,
so the URL should be coolapps.com/nice/profile/view. See the
difference? This is why the URLs in the page need to be re-written to
“/nice/profile/view” and “/nice/images/edit/1”.

 ** This can be done by either the proxy service at coolapps.com
    or by the application itself. **
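
To make the leading-slash problem in item 3 concrete, here is a small sketch using Ruby's standard URI library to resolve links the way a browser would. The base URL and paths are the ones from the example above; the variable names are only for illustration.

```ruby
require "uri"

# The URL the user sees in the browser bar.
base = "http://coolapps.com/nice/"

# A root-relative href (leading slash) replaces the whole path of the base,
# so the link escapes the /nice/ URL space.
escaped = URI.join(base, "/profile/view").to_s

# The same href with the proxy prefix added stays inside the /nice/ URL space.
prefixed = URI.join(base, "/nice/profile/view").to_s

puts escaped   # http://coolapps.com/profile/view
puts prefixed  # http://coolapps.com/nice/profile/view
```

This is why something, either the proxy or the application, has to add the /nice prefix to root-relative links in the page.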

Why is it so difficult to use Apache 2.x mod_proxy?

First off, the Apache mod_proxy module works by having the response
come back through the proxy, which then returns it to the user with the
URLs rewritten in both the http headers and the web page itself.

This allows the application to be blissfully ignorant of the fact it’s
being proxied.
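
As a rough illustration of the two kinds of rewriting a "smart" proxy does on the way back, here is a minimal Ruby sketch. It is not how mod_proxy or mod_proxy_html are implemented; the constants and the two method names are hypothetical, and the regular expression is a simplification that only looks at href and src attributes.

```ruby
# Hypothetical values taken from the example setup in this post.
BACKEND     = "http://super.secret.com:3000/"
PUBLIC_BASE = "http://coolapps.com/nice/"
PREFIX      = "/nice"

# Header rewriting: map a redirect target from the backend URL space
# to the public URL space. Anything else is passed through untouched.
def rewrite_location(location)
  return location unless location.start_with?(BACKEND)
  PUBLIC_BASE + location[BACKEND.length..-1]
end

# Body rewriting: add the prefix to root-relative href/src values,
# skipping protocol-relative links and links that already carry the prefix.
def rewrite_body(html)
  html.gsub(/(href|src)=(["'])\/(?!\/|nice[\/"'])/) { "#{$1}=#{$2}#{PREFIX}/" }
end

puts rewrite_location("http://super.secret.com:3000/login")
puts rewrite_body('<a href="/profile/view">profile</a>')
```

The point of the sketch is that both rewrites only work if the application keeps answering in its own (backend) URL space; the proxy then has a predictable pattern to match.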

Some proxy services may work by telling the application to return
directly to the user who initiated the request rather than back
through the proxy. In this case, the application would need to be
responsible for the rewriting of the http header urls and urls
contained in the content of the page as well.

I’m not sure that a single application could easily deal with both
types of proxying simultaneously.

It seems that rails is somewhat aware that it is being proxied and
this may be causing grief when trying to proxy using Apache’s
mod_proxy.

In particular, rails seems to be trying to address #2 above, but only
when a redirect is requested, which is odd. The URLs come back
correctly on un-redirected pages, but are wrong during redirects.

I think the relevant code in there was done to resolve a proxy issue
with lighttpd. Unfortunately, that same code seems to not work with
Apache 2.x mod_proxy.
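
Here is a simplified sketch of what I believe is going on with redirects. It is a hypothetical model, not the actual actionpack source: the method name, the trust_forwarded_host flag and the env values are made up for illustration, and it assumes the proxy forwards the backend host in HTTP_HOST and the public host in HTTP_X_FORWARDED_HOST.

```ruby
# Hypothetical model of how an application might choose the host
# it puts into absolute redirect URLs.
def host_for_redirect(env, trust_forwarded_host)
  forwarded = env["HTTP_X_FORWARDED_HOST"]
  if trust_forwarded_host && forwarded
    # Several proxies may each append a host; take the last one listed.
    forwarded.split(/,\s?/).last
  else
    env["HTTP_HOST"]
  end
end

# Assumed request environment as seen by the application behind the proxy.
env = {
  "HTTP_HOST"             => "super.secret.com:3000",
  "HTTP_X_FORWARDED_HOST" => "coolapps.com"
}

proxy_aware   = "http://#{host_for_redirect(env, true)}/login"
proxy_unaware = "http://#{host_for_redirect(env, false)}/login"

puts proxy_aware    # http://coolapps.com/login
puts proxy_unaware  # http://super.secret.com:3000/login
```

In the first case the redirect already names the public host but is missing the /nice/ prefix, and it no longer looks like a backend URL that the proxy knows how to map. In the second case the redirect stays in the backend URL space, which is the form a rewriting proxy can translate.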

Hacky Solution

I hacked the host_with_port method in
actionpack-1.12.3/lib/action_controller/cgi_process.rb so that it
doesn’t look at HTTP_X_FORWARDED_HOST at all. The apache configuration
for coolapps.com includes the following proxy directives:

ProxyPass /nice/ http://super.secret.com:3000/

<Location /nice/>
ProxyPassReverse /
SetOutputFilter proxy-html
ProxyHTMLURLMap / /nice/
ProxyHTMLURLMap /nice /nice
</Location>

Now I’m happily proxying with apache 2.0. Looking forward to trying
my luck with Apache 2.2 with mod_proxy_balancer. I suspect there’s a
similar issue there with redirects.

The hack I’ve done to cgi_process.rb most likely breaks
compatibility with lighttpd. :(

This is an ugly solution. I’m looking for a cleaner one. :)

Rick

This has been covered quite a bit actually.
I wrote a plugin to get around this issue. It’s primarily designed to
work on dumb webservers like IIS that can’t handle the proxying
correctly. It will handle your issue as well.

http://www.napcsweb.com/blog/2006/06/30/reverse-proxy-fix-for-rails-version-102-now-available/

Or you can make sure that Apache sends the x-forwarded-for header to
Mongrel.

If you can, your best approach is going to be to use Apache 2.2 with
mod_proxy_balancer. That seems to handle the forwarding without issues.

Good luck.

Hi,

Maybe I missed something in your description. Here is how I have my
virtual host work here:

<VirtualHost *>
DocumentRoot /sites/symetrie.com/public
ServerName symetrie.com
ServerAlias www.symetrie.com

 AddDefaultCharset utf-8
 ServerAdmin webmaster@localhost
 ErrorDocument 500 /500.html
 ErrorDocument 404 /404.html

 ProxyRequests Off
 ProxyPreserveHost On

 ProxyPass /images !
 ProxyPass /stylesheets !
 ProxyPass /javascripts !
 ProxyPass / http://127.0.0.1:3000/
 ProxyPassReverse / http://127.0.0.1:3000/

 <Proxy *>
     Allow from .symetrie.com
 </Proxy>

 <Directory /sites/symetrie.com/public>
     Options +FollowSymLinks
     Order allow,deny
     allow from all
 </Directory>

 LogLevel warn

 CustomLog /sites/symetrie.com/log/apache.log combined
</VirtualHost>

Tell me if such a vhost is working for you.

Jean-Christophe M.


On Sat, 2006-08-07 at 19:52 -0500, Brian H. wrote:

> This has been covered quite a bit actually.
> I wrote a plugin to get around this issue. It’s primarily designed to
> work on dumb webservers like IIS that can’t handle the proxying
> correctly. It will handle your issue as well.

Hi Brian,

Thanks for the plugin. Unfortunately, it will not suit my needs. I
notice that the plugin needs BASE_URL configured, which appears to
mean that the application has to know who’s proxying it.

This does not work for me.

There may be more than one proxy hitting the application. There may be
proxies configured for the application that I have no knowledge about.

The nice thing about Apache 2.x proxying is that the proxy is smart. It
can do all the necessary rewriting of urls both in the http headers and
body. Thus, there can be apache proxies out there hitting my
application that I never have to be concerned with. (It’s a nice
separation of concerns.)

It looks like Rails is trying to deal with both smart and not-so-smart
proxies at the same time. It’d be nice if there were a smart proxy mode
in which case rails just behaves as a rails application and doesn’t try
to rewrite urls (or require routing changes) to deal with not-so-smart
proxies.

BTW, attached is the small patch I made to cgi_process.rb so that
Apache 2 proxying just worked. Note that I explicitly took out the
HTTP_X_FORWARDED_HOST handling (i.e. the application doesn’t need to
know … just let the proxy server handle the re-writing).

This did not require the use of any plugins either.

WARNING: This does break not-so-smart proxies!

Rick