URI methods for application/x-www-form-urlencoded

I have a few comments on URI.encode_www_form and the other methods that
Naruse-san recently added for handling application/x-www-form-urlencoded.

  • URI.encode_www_component

    • The method name does not indicate that it is meant for forms, so I
      do not think it is a good name. It should contain the word "form";
      how about URI.encode_www_form_component, for example?

    • "\x00" becomes "%0".
      % bin/ruby -ruri -e 'p URI.encode_www_component("\x00")'
      "%0"

    • Processing the argument after force_encoding it to
      Encoding::ASCII_8BIT does not seem right to me. When a byte that
      corresponds to an ASCII character appears inside a multibyte
      character, as can happen in Shift_JIS, that byte is left as ASCII.

      % bin/ruby -ruri -e 'p URI.encode_www_component("\x83\x41".force_encoding("Shift_JIS"))'
      "%83A"

      According to the HTML Standard, processing is done character by
      character, so shouldn't the result be "%83%41"?

      That said, I doubt this causes trouble in practice.

    • The encoding of the generated result is the encoding of the
      argument, but aren't there cases where that is a problem?
      When the argument is ASCII-incompatible, such as UTF-16BE, it is
      plainly wrong: the generated % is not a valid character in that
      encoding.

      I doubt this causes trouble in practice either.

  • URI.decode_www_component

    • The method name does not indicate that it is meant for forms, so I
      do not think it is a good name. It should contain the word "form";
      how about URI.decode_www_form_component, for example?

    • URI.decode_www_component("%20") returns an empty string.
      % bin/ruby -ruri -e 'p URI.decode_www_component("%20")'
      ""

    • Shouldn't it be possible to specify the encoding as a second
      argument?
      application/x-www-form-urlencoded data carries no character
      encoding information, so with the current URI.decode_www_component
      the only way to attach the correct encoding is to call
      force_encoding on the return value.
      However, force_encoding is something that should generally be
      avoided, so wouldn't it be better for URI.decode_www_component
      itself to take the encoding as an argument and call force_encoding
      internally?

      I think the default should be either ASCII-8BIT or UTF-8.

  • URI.encode_www_form

  • URI.decode_www_form

    • I expected this to exist as a matter of course, but it does not
      seem to. Why is there only URI.encode_www_form and not this one?
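
The two encoding behaviors discussed above can be compared with a small sketch. The helper names encode_bytewise and encode_charwise are hypothetical and are not the URI library's implementation; they only illustrate zero-padded %HH output and the difference between judging each byte on its own and escaping a multibyte character as a whole.

```ruby
# Bytes that are passed through unchanged (hypothetical constant name).
SAFE_BYTES = ("*-._0123456789" \
              "ABCDEFGHIJKLMNOPQRSTUVWXYZ" \
              "abcdefghijklmnopqrstuvwxyz").bytes.freeze

# Turn a list of bytes into zero-padded %HH escapes.
def hex_escape(bytes)
  bytes.map { |b| format("%%%02X", b) }.join
end

# Byte-wise: each byte is judged on its own, so an ASCII-valued
# trailing byte of a multibyte character stays as ASCII.
def encode_bytewise(str)
  str.bytes.map { |b|
    if b == 0x20
      "+"
    elsif SAFE_BYTES.include?(b)
      b.chr
    else
      hex_escape([b])
    end
  }.join
end

# Character-wise: a multibyte character is escaped as a whole.
def encode_charwise(str)
  str.each_char.map { |ch|
    bytes = ch.bytes
    bytes.size == 1 ? encode_bytewise(ch) : hex_escape(bytes)
  }.join
end

sjis = "\x83\x41".dup.force_encoding(Encoding::Shift_JIS)
p encode_bytewise("\x00")   # "%00"
p encode_bytewise(sjis)     # "%83A"
p encode_charwise(sjis)     # "%83%41"
```

With zero padding, "\x00" is written as "%00" in both variants; the two differ only for multibyte characters whose trailing byte falls in the ASCII range.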

This is Naruse.

(2010/03/16 15:48), Tanaka A. wrote:

I have a few comments on URI.encode_www_form and the other methods that
Naruse-san recently added for handling application/x-www-form-urlencoded.

  • URI.encode_www_component

    • The method name does not indicate that it is meant for forms, so I
      do not think it is a good name. It should contain the word "form";
      how about URI.encode_www_form_component, for example?

Hmm, I will change it.

  • "\x00" becomes "%0".
    % bin/ruby -ruri -e 'p URI.encode_www_component("\x00")'
    "%0"

I will fix it.

 Processing is done character by character, so shouldn't the result be
 "%83%41"?

 That said, I doubt this causes trouble in practice.

Oops, I overlooked that. I will fix it.

  • The encoding of the generated result is the encoding of the
    argument, but aren't there cases where that is a problem?
    When the argument is ASCII-incompatible, such as UTF-16BE, it is
    plainly wrong: the generated % is not a valid character in that
    encoding.

    I doubt this causes trouble in practice either.

I will fix this too.

  • URI.decode_www_component

    • The method name does not indicate that it is meant for forms, so I
      do not think it is a good name. It should contain the word "form";
      how about URI.decode_www_form_component, for example?

I will change it.

  • URI.decode_www_component("%20") returns an empty string.
    % bin/ruby -ruri -e 'p URI.decode_www_component("%20")'
    ""

I will fix it.

  • Shouldn't it be possible to specify the encoding as a second
    argument?
    application/x-www-form-urlencoded data carries no character encoding
    information, so with the current URI.decode_www_component the only
    way to attach the correct encoding is to call force_encoding on the
    return value.
    However, force_encoding is something that should generally be
    avoided, so wouldn't it be better for URI.decode_www_component itself
    to take the encoding as an argument and call force_encoding
    internally?

    I think the default should be either ASCII-8BIT or UTF-8.

Right. Shall we make UTF-8 the default?

  • URI.encode_www_form

I considered letting the separator be specified as a second argument,
but I will leave that out for now.

  • URI.decode_www_form

    • I expected this to exist as a matter of course, but it does not
      seem to. Why is there only URI.encode_www_form and not this one?

The reason was that I could not decide on the return value, but I will
add it with an Array as the return value.
I will explain in the rdoc why it is an Array and how to work with it
conveniently.
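
As an illustration of why an Array is a reasonable return value, the following sketch uses URI.decode_www_form as it exists in current Ruby releases (the thread predates its addition, so this reflects later behavior rather than anything decided in this mail). An Array of pairs preserves both repeated keys and their order, which a Hash would lose.

```ruby
require "uri"

pairs = URI.decode_www_form("a=1&a=2&b=3")
p pairs             # [["a", "1"], ["a", "2"], ["b", "3"]]

# Array#assoc finds the first pair for a key.
p pairs.assoc("a")  # ["a", "1"]

# Converting to a Hash keeps only the last value for a repeated key.
p pairs.to_h        # {"a"=>"2", "b"=>"3"} (inspect format varies by Ruby version)
```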

This is Naruse.

(2010/03/21 0:36), Tanaka A. wrote:

  • Percent-encode all bytes, as is done for ISO-2022-JP and the like
  • Raise an exception

Note that I think the current behavior of percent-encoding all bytes for
ASCII-incompatible encodings other than Unicode, such as ISO-2022-JP,
does not need to change.

On the principle of following existing implementations when in doubt, I
decided to convert to UTF-8.
Also, regarding the Shift_JIS encoding pointed out in the earlier mail,
I reverted it so that "\x83\x41" becomes "%83A".

  • URI.decode_www_component
    • Shouldn't it be possible to specify the encoding as a second
      argument?

Right. Shall we make UTF-8 the default?

It is not a default; it is forced.

Oops, you are right. Fixed.

Also, having TBLDECWWWCOMP_['+'] = ' ' if i == 0x20 inside the loop is
wasteful.

Fixed.

This is Naruse.

(2010/03/21 21:39), NARUSE, Yui wrote:

   I doubt this causes trouble in practice either.
  • Automatically convert to UTF-8
  • Percent-encode all bytes, as is done for ISO-2022-JP and the like
  • Raise an exception

Note that I think the current behavior of percent-encoding all bytes for
ASCII-incompatible encodings other than Unicode, such as ISO-2022-JP,
does not need to change.

On the principle of following existing implementations when in doubt, I
decided to convert to UTF-8.
Also, regarding the Shift_JIS encoding pointed out in the earlier mail,
I reverted it so that "\x83\x41" becomes "%83A".

On this point: the issue raised has been reflected in the latest HTML5,
which now specifies encoding byte by byte.
http://www.whatwg.org/specs/web-apps/current-work/multipage/association-of-controls-and-forms.html#url-encoded-form-data

2010/03/17 4:45, NARUSE, Yui [email protected]:

  • URI.encode_www_component
    • The encoding of the generated result is the encoding of the
      argument, but aren't there cases where that is a problem?
      When the argument is ASCII-incompatible, such as UTF-16BE, it is
      plainly wrong: the generated % is not a valid character in that
      encoding.

      I doubt this causes trouble in practice either.

I will fix this too.

I thought about this, and it does not seem to work well.

% ./ruby -ruri -e '
v = URI.encode_www_form_component("aあ".encode("UTF-16BE"))
puts v.dump, v.encoding'
"\x00a%30%42"
US-ASCII

In this result, the first 2 bytes are the character a in UTF-16BE, and
what follows is the 6 characters %30%42 in US-ASCII.
UTF-16BE and US-ASCII are mixed within a single string, which is plainly
wrong.
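
The byte layout behind this observation can be checked with a short sketch (the variable names are illustrative only). It shows that the UTF-16BE form of "aあ" is the four bytes 00 61 30 42, and that a percent sign in UTF-16BE is two bytes, so a single ASCII % byte appended to such a string is not a UTF-16BE character.

```ruby
utf16 = "a\u3042".encode(Encoding::UTF_16BE)   # "aあ" in UTF-16BE

# Each character occupies two bytes in UTF-16BE.
p utf16.bytes.map { |b| format("%02X", b) }   # ["00", "61", "30", "42"]

# UTF-16BE is not ASCII-compatible.
p utf16.encoding.ascii_compatible?            # false

# "%" is one byte in US-ASCII but two bytes in UTF-16BE.
p "%".encode(Encoding::US_ASCII).bytes        # [37]
p "%".encode(Encoding::UTF_16BE).bytes        # [0, 37]
```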

The HTML Standard says:

  1. For each character in the entry's name and value, apply the
     following subsubsteps:

     1. If the character isn't in the range U+0020, U+002A, U+002D,
        U+002E, U+0030 to U+0039, U+0041 to U+005A, U+005F, U+0061 to
        U+007A then replace the character with a string formed as
        follows: Start with the empty string, and then, taking each byte
        of the character when expressed in the selected character
        encoding in turn, append to the string a U+0025 PERCENT SIGN
        character (%) followed by two characters in the ranges U+0030
        DIGIT ZERO (0) to U+0039 DIGIT NINE (9) and U+0041 LATIN CAPITAL
        LETTER A to U+0046 LATIN CAPITAL LETTER F representing the
        hexadecimal value of the byte (zero-padded if necessary).

     2. If the character is a U+0020 SPACE character, replace it with a
        single U+002B PLUS SIGN character (+).

Reading this part, the question is what the selected character encoding
is.
A little earlier in the same document it says that some ASCII-compatible
character encoding is chosen as the selected character encoding.

UTF-16BE is plainly not an ASCII-compatible character encoding, but
suppose we ignore that and select it anyway. Then, for each character of
the string "aあ" ("a" and "あ"):
"a" is U+0061, which is in the list, so it is left as is;
"あ" is U+3042, which is not in the list, so each byte of the character
as expressed in the selected character encoding (UTF-16BE here), 0x30
and 0x42, is written in %HH form, giving "%30%42".
Putting these together, the result is "a%30%42".
(There is no U+0020 in the string, so nothing becomes +.)

Now consider RFC 3986 (URI). The ASCII characters corresponding to %30
and %42 are unreserved, so whether or not they are percent-encoded makes
no difference in meaning.
That is, %30 is equivalent to 0 and %42 is equivalent to B.
Consequently, decoding the string "a%30%42" can only yield "a0B".
In other words, the original string "aあ" is not conveyed.
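
The loss of information described here can be reproduced with URI.decode_www_form_component as it exists in current Ruby releases. This is only a sketch of the argument, with illustrative variable names.

```ruby
require "uri"

wire = "a%30%42"   # what the UTF-16BE selection would put on the wire
decoded = URI.decode_www_form_component(wire)

p decoded                 # "a0B"
p decoded == "a\u3042"    # false: the original "aあ" cannot be recovered
```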

So there is a reason why an ASCII-compatible character encoding is to be
selected.
Ignoring that requirement is not a good idea.

The countermeasures that come to mind are roughly the following.

  • Automatically convert to UTF-8
  • Percent-encode all bytes, as is done for ISO-2022-JP and the like
  • Raise an exception

Note that I think the current behavior of percent-encoding all bytes for
ASCII-incompatible encodings other than Unicode, such as ISO-2022-JP,
does not need to change.

  • URI.decode_www_component
    • Shouldn't it be possible to specify the encoding as a second
      argument?

Right. Shall we make UTF-8 the default?

It is not a default; it is forced.

% ./ruby -ruri -e '
v = URI.decode_www_form_component("%A1%A2", "EUC-JP")
p v
p v.encoding
'
"\xA1\xA2"
#<Encoding:UTF-8>

It goes to the trouble of accepting an enc argument, but does not use it.
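
A minimal sketch of a decoder that does honor its encoding argument might look like the following. The name form_decode_sketch is hypothetical and this is not the library's code; it assumes well-formed input and simply tags the decoded bytes with the requested encoding, defaulting to UTF-8.

```ruby
# Hypothetical helper: decode "+" and %HH, then tag the bytes with enc.
def form_decode_sketch(str, enc = Encoding::UTF_8)
  raw = str.b.gsub(/\+|%\h\h/) do |m|
    m == "+" ? " " : m[1, 2].hex.chr
  end
  raw.force_encoding(enc)
end

v = form_decode_sketch("%A1%A2", Encoding::EUC_JP)
p v.encoding          # #<Encoding:EUC-JP>
p v.valid_encoding?   # true
p form_decode_sketch("%20")   # " "
```

Doing the force_encoding inside the method keeps callers from having to call it on the return value themselves.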

Also, having TBLDECWWWCOMP_['+'] = ' ' if i == 0x20 inside the loop is
wasteful.
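
One way to build such a table with the '+' entry assigned once, outside the loop, is sketched below. DECODE_TABLE is a hypothetical name standing in for the library's table, and mixed-case escapes such as "%aB" are left out for brevity.

```ruby
# Map "%HH" (upper- and lower-case hex) to the byte it represents.
DECODE_TABLE = {}
256.times do |i|
  byte = i.chr
  DECODE_TABLE[format("%%%02X", i)] = byte
  DECODE_TABLE[format("%%%02x", i)] = byte
end

# The "+" entry does not depend on i, so it is set once after the loop.
DECODE_TABLE["+"] = " "
DECODE_TABLE.freeze

p DECODE_TABLE["%20"]   # " "
p DECODE_TABLE["+"]     # " "
```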