How do I redirect/handle requests from search engines via nginx using location regex

Hi,

I have developed an ajax-based web application with hash bang urls.

I am trying to redirect requests from search engines to another server
that generates HTML snapshots and sends the response. I am trying to
achieve this in nginx with the location directive shown below:

  location ~ ^(/?_escaped_fragment_=).*$ {
      proxy_set_header        Host $host;
      proxy_set_header        X-Real-IP $remote_addr;
      proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;
      proxy_set_header        X-Forwarded-Proto $scheme;

      client_max_body_size    10m;
      client_body_buffer_size 128k;
      proxy_connect_timeout   60s;
      proxy_send_timeout      90s;
      proxy_read_timeout      90s;
      proxy_buffering         off;
      proxy_temp_file_write_size 64k;
      proxy_pass              http://x1.x2.x3.x4:8080;
      proxy_redirect          off;
  }

But I am not able to get this working. Can someone correct the regex I
am using, or suggest an alternative way to achieve this?

Thanks in advance,

Ravi

On Wed, Apr 25, 2012 at 02:10:53PM +0530, Raviteja Dodda wrote:

Hi there,

> I have developed an ajax-based web application with hash bang urls.
>
> I am trying to redirect requests from search engines to another server
> which generates HTML snapshots and sends the response.

I confess I’m not sure how to tie “requests from search engines” and
“hash bang urls” into the details of your question. But that probably
doesn’t matter.

nginx doesn’t care how or why the url arose; it just knows what request
was made to it, and it processes that request in the one location that
is configured for the request.

So I’ll (try to) describe what nginx sees and what nginx does, and then
maybe it will be clear to you how that maps to your application and your
configuration.

In general, the url is of the form

http://one:two/three?four#five

where some of the parts may be missing.

nginx uses the part before the first single / – one:two – to decide
which server{} to use. (That’s not quite true, but it’s close enough
for now.)

nginx then is only sent the part from the first single / to just before
the first # – so all nginx can see is /three?four.

When it comes to matching against the location{}s that are defined,
nginx only considers the part up to just before the first ? (in this
case, /three). The query string plays no part in choosing a location.
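
One way to see this split for yourself is a throwaway location that
echoes the variables nginx derives from the request. This is only a
debugging sketch for a test server: the catch-all location and the
response text are illustrative choices, not something taken from your
configuration.

```nginx
# Debugging sketch only: report what nginx sees for any request.
# $uri is the normalized path that location matching works on;
# $args is the query string, which location matching never sees.
location / {
    default_type text/plain;
    return 200 "uri=$uri args=$args\n";
}
```

Requesting /?_escaped_fragment_=foo against such a server should report
the path as / and the whole _escaped_fragment_=foo part as args, which
is why a location regex never gets the chance to match it.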

So: can you identify the http requests that are made when someone uses
your web application? Or the http requests that come from search
engines?

Then see what is the part that nginx will consider relevant for the
location matching – from the first single / to just before the first #
or ? – and then write a location directive to match those.

> I am trying to
> achieve this in nginx with the location directive as mentioned below:
>
>   location ~ ^(/?_escaped_fragment_=).*$ {

There is one subtlety, in that it is possible for nginx to consider
? to be part of the url that can be matched in the location directive;
but only if it was properly url-encoded originally (which means that
the client did not ask for /?four, but for something like /%3ffour).

But that’s not a common setup; so I’m going to guess that you are not
using that, in which case the request that it looks like you are trying
to match is exactly “/”.
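
If the aim is to act on the query string rather than the path, one
commonly used approach is to test $args inside the location that does
match. The following is a sketch under assumptions, not a tested
drop-in: the 418 status code and the @snapshots name are arbitrary
placeholders, and the remaining proxy_* directives from your original
block would go inside the named location.

```nginx
location / {
    # Location matching only sees "/", so inspect the query string here.
    # 418 is an arbitrary, otherwise unused status code (an assumption);
    # it exists only to jump to the named location below.
    error_page 418 = @snapshots;
    if ($args ~ "(^|&)_escaped_fragment_=") {
        return 418;
    }

    # ... normal handling for the ajax application goes here ...
}

# Hypothetical named location; the name @snapshots is illustrative.
location @snapshots {
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    # No URI part after the address, so the original request URI and
    # query string are passed to the snapshot server unchanged.
    proxy_pass       http://x1.x2.x3.x4:8080;
}
```

Only "return" is used inside the if block here, since that is one of
the few directives that behave predictably there.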

> But I am not able to get this working. Can someone correct the regex I am
> using (or) provide me an alternative solution to achieve this.

Hopefully the above gives you enough information to find the answer.

If not, please include one url that you are attempting to access
(ideally using something like curl, so that it is easily repeatable
without being concerned with browser caching), along with the response
you expect and the response you get. Then someone may be able to
provide more help.

f

Francis D.