Forum: NGINX 400 bad request and UTF8

optimum.dulopin (Guest)
on 2013-09-25 14:14
(Received via mailing list)
Hi,

I'm using nginx and Rails for my site, whose URLs contain Georgian
letters, e.g. განცხადებები, so something like
http://gancxadebebi.ge/ka/%E1%83%92%E1%83%90%E1%83...
It mostly works perfectly, but sometimes I receive a request with a
truncated URL, e.g.
1 -
http://gancxadebebi.ge/ka/%E1%83%92%E1%83%90%E1%83...
(as you can see, there should be something after %9)
or
2 -
http://gancxadebebi.ge/ka/%E1%83%92%E1%83%90%E1%83...
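For illustration, a truncated escape of this kind can be spotted mechanically. The following is a minimal Python sketch, not part of the site's code; the function name `has_dangling_escape` and the sample paths in the usage note are hypothetical.

```python
import re

# A '%' that is not followed by two hex digits is an incomplete escape.
_DANGLING = re.compile(r"%(?![0-9A-Fa-f]{2})")


def has_dangling_escape(url: str) -> bool:
    """Return True if url contains a '%' without two hex digits after it."""
    return _DANGLING.search(url) is not None
```

With this, a made-up path such as `/ka/%E1%83%9` is flagged, while `/ka/%E1%83%92` is not. Note that this only checks the escape syntax; it does not check that the decoded bytes form valid UTF-8.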

I managed to handle the case where there are no GET parameters (the
first URL above) by redirecting to /.
When this happens, this line is added to the nginx error.log:
2013/09/24 00:46:53 [alert] 63547#0: *19359227 pcre_exec() failed: -10
on
"/ka/განცხადებებ�" using "", client: aa.bb.cc.dd, server:
gancxadebebi.ge,
request: "GET
/ka/%E1%83%92%E1%83%90%E1%83%9C%E1%83%AA%E1%83%AE%E1%83%90%E1%83%93%E1%83%94%E1%83%91%E1%83%94%E1%83%91%E1%8
HTTP/1.1", host: "gancxadebebi.ge"

But I cannot handle the second URL, where a GET parameter is truncated;
it generates a 400 Bad Request page.
Such a request adds this line to the nginx access.log:
aa.bb.cc.dd - - [24/Sep/2013:00:48:47 +0200] "GET
/ka/%E1%83%92%E1%83%90%E1%83%9C%E1%83%AA%E1%83%AE%E1%83%90%E1%83%93%E1%83%94%E1%83%91%E1%83%94%E1%83%91%E1%83%98?mc=mini+aipadi&search=%E1%83%AB%E1%83%98%E1%83%94%E1%83%91%E1%83%
HTTP/1.1" 400 5 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36
(KHTML, like Gecko) Chrome/29.0.1547.76 Safari/537.36"

Does this mean that nginx accepted the request and then Rails couldn't
resolve it?

I don't know whether the problem comes from Rails or from nginx. For the
first URL, I solved it in the nginx conf.
Here is part of my conf:

    access_log /var/log/nginx/gancx.access.log;
    error_log /var/log/nginx/gancx.error.log;

    client_body_in_file_only clean;
    client_body_buffer_size 32K;
    charset UTF-8;
    source_charset UTF-8;
    client_max_body_size 300M;



    error_page  400 404         = @notfound;
    error_page  500 502 504 = @server_error;
    error_page  503         = @maintenance;

    location @notfound {
        rewrite ^(.*)$ $scheme://$host permanent;
    }

    location @server_error {
        rewrite ^(.*)$ $scheme://$host permanent;
    }

    location @maintenance {
        rewrite ^(.*)$ $scheme://$host permanent;
    }
    sendfile on;
    send_timeout 300s;

    location / {
        proxy_pass http://gancx;
        proxy_redirect off;

        proxy_set_header   Host             $host;
        proxy_set_header   X-Real-IP        $remote_addr;
        proxy_set_header   X-Forwarded-For  $proxy_add_x_forwarded_for;
        charset UTF-8;
        client_max_body_size 7m;
        proxy_buffer_size          4k;
        proxy_buffers              4 32k;
        proxy_busy_buffers_size    64k;
        proxy_temp_file_write_size 64k;
    }


Thanks for your help.

Posted at Nginx Forum:
http://forum.nginx.org/read.php?2,243130,243130#msg-243130
Maxim Dounin (Guest)
on 2013-09-25 16:05
(Received via mailing list)
Hello!

On Wed, Sep 25, 2013 at 08:13:11AM -0400, optimum.dulopin wrote:

> or
> 2 -
> http://gancxadebebi.ge/ka/%E1%83%92%E1%83%90%E1%83...
>
> I succeeded to deal when there is no get parameters (first url above) and
> make in that case a redirection to /

Hmm, I tend to think it's a bug that (1) doesn't generate 400 Bad
Request.  It should.

> when this happen, this line is added to nginx error.log
> 2013/09/24 00:46:53 [alert] 63547#0: *19359227 pcre_exec() failed: -10 on
> "/ka/განცხადებებ�" using "", client: aa.bb.cc.dd, server: gancxadebebi.ge,
> request: "GET
> /ka/%E1%83%92%E1%83%90%E1%83%9C%E1%83%AA%E1%83%AE%E1%83%90%E1%83%93%E1%83%94%E1%83%91%E1%83%94%E1%83%91%E1%8
> HTTP/1.1", host: "gancxadebebi.ge"

The -10 from pcre_exec() is PCRE_ERROR_BADUTF8, it shouldn't
happen unless you've explicitly used "(*UTF8)" in your PCRE
patterns.  It's very strange you see it with the config provided.
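To illustrate why the decoded path in that log line is not valid UTF-8, here is a rough Python sketch. It is only an approximation: nginx decodes the URI in its own C code and may treat the trailing `%8` differently from Python's `urllib`. The function name `is_valid_utf8_path` is made up for this example; the path is copied from the log line above.

```python
from urllib.parse import unquote_to_bytes


def is_valid_utf8_path(raw_path: str) -> bool:
    """Percent-decode raw_path and report whether the bytes are valid UTF-8."""
    # unquote_to_bytes leaves malformed escapes (such as a trailing "%8")
    # untouched, so a preceding lone lead byte like 0xE1 stays incomplete.
    decoded = unquote_to_bytes(raw_path)
    try:
        decoded.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Truncated path from the error.log entry quoted above.
TRUNCATED_PATH = (
    "/ka/%E1%83%92%E1%83%90%E1%83%9C%E1%83%AA%E1%83%AE%E1%83%90"
    "%E1%83%93%E1%83%94%E1%83%91%E1%83%94%E1%83%91%E1%8"
)

print(is_valid_utf8_path(TRUNCATED_PATH))
```

The path ends in `%E1%8`: the `0xE1` byte starts a three-byte UTF-8 sequence that is never completed, which is the kind of input a matcher running in UTF-8 mode would reject.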

> but for second url, which have get parameter truncated, I can not handle
> that which generate a 400 bad request page.
> such request added this line in nginx access.log
> aa.bb.cc.dd - - [24/Sep/2013:00:48:47 +0200] "GET
> /ka/%E1%83%92%E1%83%90%E1%83%9C%E1%83%AA%E1%83%AE%E1%83%90%E1%83%93%E1%83%94%E1%83%91%E1%83%94%E1%83%91%E1%83%98?mc=mini+aipadi&search=%E1%83%AB%E1%83%98%E1%83%94%E1%83%91%E1%83%
> HTTP/1.1" 400 5 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36
> (KHTML, like Gecko) Chrome/29.0.1547.76 Safari/537.36"
>
> does this mean that nginx accepted the request and then rails coudnt resolve
> it ?

By itself nginx doesn't try to urldecode request arguments (in
contrast to URI path, which is urldecoded for location matching),
and because of this it doesn't try to detect encoding violations
in request arguments.  That is, most likely you are right and the
error comes from your backend.
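As a rough illustration of what a backend that does decode the arguments might run into, here is a Python sketch using the standard library's strict query parsing. This is not Rails code and makes no claim about how Rails handles it internally; `parse_query_strict` is a hypothetical helper, and the query string is the one from the access.log line above.

```python
from typing import List, Optional, Tuple
from urllib.parse import parse_qsl


def parse_query_strict(query: str) -> Optional[List[Tuple[str, str]]]:
    """Decode a query string as UTF-8; return None if any value is malformed."""
    try:
        # errors="strict" makes invalid UTF-8 raise instead of being replaced.
        return parse_qsl(query, keep_blank_values=True,
                         encoding="utf-8", errors="strict")
    except UnicodeDecodeError:
        return None


# Query string from the access.log entry; the last escape is cut off.
TRUNCATED_QUERY = (
    "mc=mini+aipadi&search=%E1%83%AB%E1%83%98%E1%83%94%E1%83%91%E1%83%"
)

print(parse_query_strict(TRUNCATED_QUERY))
```

An application that gets `None` here (or the equivalent exception in its own framework) can decide for itself whether to answer with 400 or to redirect, instead of leaving that to the proxy.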

You may try intercepting errors using proxy_intercept_errors, but
actually I wouldn't recommend doing it.  Configuring an error_page
for 400 Bad Request isn't a good idea; it might hurt.

--
Maxim Dounin
http://nginx.org/en/donation.html

p.s. Please don't duplicate the same question to the same mailing
list via multiple forum-like interfaces.  It's still the same
mailing list.  Thank you for your cooperation.