Forum: NGINX Nginx convert UTF8 request to ISO-8859-1

unknown (Guest)
on 2013-09-24 22:20
(Received via mailing list)
Hi,

IE8 (maybe also IE9/IE10) doesn't automatically percent-encode URLs (Firefox
does) and can send raw UTF-8 requests.
If you put "http://<nginx-server>/?test=ééé" in the address bar, the é
will not be percent-encoded; it is sent as raw UTF-8 (c3a9 in hex, I've
checked with Wireshark).

The problem is that the FastCGI backend (Mono web app, Unix socket)
receives the é in ISO-8859-1 (e9 in hex, I've checked with socat).
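
For reference, the two byte patterns mentioned above can be checked with a small Python sketch. It only illustrates the encodings of "é" (and their percent-encoded forms); it says nothing about what nginx does with them.

```python
from urllib.parse import quote

char = "é"

# Raw bytes of the same character in the two encodings under discussion.
utf8_bytes = char.encode("utf-8")
latin1_bytes = char.encode("iso-8859-1")

print(utf8_bytes.hex())    # c3a9
print(latin1_bytes.hex())  # e9

# What a browser that percent-encodes the URL would send instead.
print(quote(char, encoding="utf-8"))       # %C3%A9
print(quote(char, encoding="iso-8859-1"))  # %E9
```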

Is it normal for nginx (1.4.1) to convert the request encoding from UTF-8
to ISO-8859-1?

Is there a workaround (Linux/nginx conf)? (I haven't found any yet.)

What do the RFCs say? (HTTP request encoding, FastCGI param
encoding)


I'm not using rewrite in nginx; I'm just passing the request to
a FastCGI Unix socket.
(I will provide a minimal test conf and do more tests tomorrow.)

To reproduce:
Request: curl -O "http://<nginx-server>/?test=ééé"
Request capture: sudo tcpdump -X 'port 80'
Backend capture: sudo socat -t100 -x -v UNIX-LISTEN:/path/to/sock,mode=777,reuseaddr,fork UNIX-CONNECT:/path/to/sock.original
http://superuser.com/questions/484671/can-i-monito...
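
With curl, the bytes sent depend on the terminal's encoding, so a sketch like the following may help pin down exactly what goes on the wire. It builds the request line by hand in a chosen encoding. The host name `nginx-server.example` and the helper names `build_request` and `send_request` are placeholders for illustration, not part of any real setup.

```python
import socket


def build_request(host: str, query_text: str, encoding: str) -> bytes:
    """Build a GET request whose query string holds raw (unescaped) bytes."""
    query = b"test=" + query_text.encode(encoding)
    request_line = b"GET /?" + query + b" HTTP/1.1\r\n"
    headers = b"Host: " + host.encode("ascii") + b"\r\nConnection: close\r\n\r\n"
    return request_line + headers


def send_request(host: str, port: int, request: bytes, timeout: float = 5.0) -> bytes:
    """Send the prepared bytes unchanged and return the raw response."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


if __name__ == "__main__":
    # Print the request line in hex for both encodings; nothing is sent here.
    for enc in ("utf-8", "iso-8859-1"):
        req = build_request("nginx-server.example", "ééé", enc)
        print(enc, req.split(b"\r\n")[0].hex(" "))
```

Calling `send_request("<nginx-server>", 80, req)` while tcpdump and socat are running would then show whether the bytes differ between the two captures.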

Thanks in advance
Etienne
Francis Daly (Guest)
on 2013-09-24 23:48
(Received via mailing list)
On Tue, Sep 24, 2013 at 10:19:51PM +0200, etienne.champetier@free.fr
wrote:

Hi there,

> If you put "http://<nginx-server>/?test=ééé" in the address bar, the é will not
> be html encoded, and will be sent encoded in utf8 (c3a9 in hex, i've checked
> with wireshark)
>
> The problem is that the fastcgi backend (mono webapp, unix socket)
> get the é in ISO-8859-1 (e9 in hex, i've checked with socat)

When I use:

==
  server {
    listen 8080;
    location = / {
      fastcgi_param QUERY_STRING $query_string;
      fastcgi_pass 127.0.0.1:9;
    }
  }
==

and

  tcpdump -nn -i any -X -s 0 port 8080 or port 9

and

  curl http://localhost:8080/?key=

followed by some bytes, I don't see any difference in the bytes in
the to-8080 "GET /?key=" and the to-9 "QUERY_STRINGkey=" parts of the
tcpdump output.
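
The name and value appear run together as "QUERY_STRINGkey=" in the dump because FastCGI params are sent as length-prefixed name-value pairs with no separator between them. A minimal Python sketch of decoding such a payload follows, assuming the name-value pair layout from the FastCGI specification. The `sample` bytes are hand-built for illustration, not taken from a real capture, and the function names are hypothetical.

```python
def decode_length(buf: bytes, pos: int) -> tuple[int, int]:
    """Read one FastCGI length field: 1 byte if < 128, else 4 bytes with the high bit set."""
    first = buf[pos]
    if first < 0x80:
        return first, pos + 1
    length = int.from_bytes(buf[pos:pos + 4], "big") & 0x7FFFFFFF
    return length, pos + 4


def parse_params(buf: bytes) -> dict[bytes, bytes]:
    """Split the content of a FCGI_PARAMS record into raw name/value byte strings."""
    pairs = {}
    pos = 0
    while pos < len(buf):
        name_len, pos = decode_length(buf, pos)
        value_len, pos = decode_length(buf, pos)
        name = buf[pos:pos + name_len]
        pos += name_len
        value = buf[pos:pos + value_len]
        pos += value_len
        pairs[name] = value
    return pairs


# Hand-built example: name length 12, value length 10, then the two strings.
value = b"key=" + "ééé".encode("utf-8")
sample = bytes([len(b"QUERY_STRING"), len(value)]) + b"QUERY_STRING" + value

params = parse_params(sample)
print(params[b"QUERY_STRING"].hex())  # 6b65793dc3a9c3a9c3a9
```

Keeping the values as bytes rather than decoding them makes it easy to compare them directly with the bytes seen in the request capture.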

What am I doing that is different to you?

  f
--
Francis Daly        francis@daoine.org
unknown (Guest)
on 2013-09-25 13:32
(Received via mailing list)
Hi,

----- Original message -----
> > If you put "http://<nginx-server>/?test=ééé" in the address bar,
>   server {
>   tcpdump -nn -i any -X -s 0 port 8080 or port 9
>
> and
>
>   curl http://localhost:8080/?key=
>
> followed by some bytes, I don't see any difference in the bytes in
> the to-8080 "GET /?key=" and the to-9 "QUERY_STRINGkey=" parts of the
> tcpdump output.
>
> What am I doing that is different to you?

Sorry, today I'm not able to reproduce my 'bug'.
I'm also not able to send a UTF-8 URL with IE.
We (my colleague and I) must have misread the Wireshark dump...
(http://en.wikipedia.org/wiki/User_error#PEBKAC)

With curl and IE, I've tested that nginx works perfectly (UTF-8 in, UTF-8 out /
Latin-1 in, Latin-1 out).

Thanks and sorry