[Feature:1.9] pack format 'm' based on RFC 4648

This is Endoh.

Python's base64.py conforms to RFC 3548, but the m directive of Ruby's pack currently conforms to RFC 2045, which is somewhat dated.
How about making it conform to RFC 4648 instead?

As far as I can tell from reading the RFCs, there are three changes
from RFC 2045 to RFC 4648:

  1. Encoded output is not line-wrapped (MUST)

  2. Input containing characters outside the alphabet (including line feeds)
    is rejected (MUST)

  3. A URL- and filename-safe variant is defined (it uses -_
    instead of +/)
    (it is only defined; supporting it is not required)

Whenever I have used pack's m so far,
[str].pack("m").gsub("\n", "") has been
practically an idiom, so I think 1)
would be very convenient.

3) is not mandatory, but it also looks useful. base64.py
provides it as well.
Deciding what form to provide it in is the hard part, though.
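
The newline-stripping idiom mentioned in this message can be sketched as follows. The 60-byte sample string is an assumption, chosen only because it is long enough for pack("m") to wrap the output.

```ruby
# 60 input bytes exceed the 45 bytes that pack("m") encodes per output line,
# so the encoded text is split across two lines.
data = "x" * 60

wrapped = [data].pack("m")
# The idiom: delete every line feed to get a single-line Base64 string.
single_line = wrapped.gsub("\n", "")

puts wrapped.count("\n")          # line feeds inserted by pack("m")
puts single_line.include?("\n")   # false once the idiom has been applied
puts single_line.unpack("m").first == data
```

Point 1) of the proposal would make the gsub step unnecessary.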

Ticket #471 has been updated. (by Yuki S.)

Now that base64.rb is gone, refusing line feeds would be far too harsh on programs that process mail and the like.

So, while we do see the point of this proposal, we concluded that if it is adopted, reintroducing base64.rb as a high-level wrapper is mandatory. (ko1,
akr, yugui at akihabara)

http://redmine.ruby-lang.org/issues/show/471


This is Endoh.

2008/09/22 16:24 Yuki S. [email protected]:

> I discussed this with akr and Sasada-san, with Matz joining in.
>
>   • Nobody benefits from changing the pack format at the cost of compatibility. That is not acceptable.
>   • We would not object if the Base64 class were revived and a method conforming to the new RFC were added to it.
>   • Base64.rfc4648 is an awful name, so the name needs to be worked out.
>
> So, Endoh-san, please take it from here.

I just noticed that base64.py, which claims to conform to RFC 3548,
ignores line feeds, of all things.
I think that is a bug.

Incidentally, RFC 4648 contains the following passage.

(12. Security Considerations)

If non-alphabet characters are ignored, instead of causing rejection
of the entire encoding (as recommended), a covert channel that can be
used to "leak" information is made possible. The ignored characters
could also be used for other nefarious purposes, such as to avoid a
string equality comparison or to trigger implementation bugs. The
implications of ignoring non-alphabet characters should be understood
in applications that do not follow the recommended practice.

It looks hard to turn this into an actual security issue,
but if the current behavior is kept,
it may be worth adding a note about it to the documentation.
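
The "avoid a string equality comparison" concern in the RFC passage can be illustrated with the lenient decoder. This is only a sketch; the three sample strings are assumptions picked for the illustration.

```ruby
# Three different texts: canonical Base64 for "abc", and two copies with
# characters outside the Base64 alphabet (a line feed and a space) mixed in.
variants = ["YWJj", "YW\nJj", "YW Jj"]

# unpack("m") skips non-alphabet characters, so all three decode identically.
decoded = variants.map { |text| text.unpack("m").first }

puts decoded.uniq.inspect   # a single distinct decoded value
puts variants.uniq.size     # yet three distinct encoded texts
```

A check that compares encoded texts for equality would treat these as different inputs even though they carry the same payload.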

That aside:
it would be a pity for the old RFC to be built in while the latest RFC
is available only through a library,
so how about conforming to RFC 4648 only when m0 is given?

That is, pack("m0") would emit no line feeds, and unpack("m0")
would raise an exception
if the input contains a line feed.
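
A minimal sketch of the m0 behavior proposed here, assuming a Ruby in which m0 has been implemented as described (no line feeds on output, ArgumentError on a line feed in the input). The sample data is an assumption.

```ruby
data = "x" * 60

encoded = [data].pack("m0")
puts encoded.include?("\n")             # no line feed is emitted
puts encoded.unpack("m0").first == data # strict round trip

# A line feed in the input is outside the alphabet, so strict decoding fails.
rejected = begin
  (encoded + "\n").unpack("m0")
  false
rescue ArgumentError
  true
end
puts rejected
```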

Just to be clear, a count already has a meaning today: m8, for example, makes the output
width 8.
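
The effect of a non-zero count can be checked with a short sketch. The 12-byte sample string is an assumption.

```ruby
# With m8, each output line carries 6 input bytes, i.e. 8 Base64 characters.
encoded = ["abcdefghijkl"].pack("m8")
lines = encoded.split("\n")

puts lines.inspect
puts lines.map(&:length).inspect
```

This is why a count of 0 was free to take on a new meaning: it had no useful interpretation as a width.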

In article
[email protected],
"Yusuke ENDOH" [email protected] writes:

> That is, pack("m0") would emit no line feeds, and unpack("m0") would raise
> an exception if the input contains a line feed.

I see. In that case, it looks like we can have both compatibility and
RFC 4648 behavior without reviving base64.rb.

base64.rb was considered necessary because we assumed compatibility would be lost;
if compatibility is preserved, this approach may be fine too.

But what about the "URL- and filename-safe variant"
described in [ruby-dev:35904]?

If we want to provide that as well, telling the variants apart by nothing
but the integer count given to pack/unpack gets difficult after all, and
won't we end up wanting base64.rb anyway?
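
One way to layer the URL-safe variant on top of m0 without another pack count is a character translation, sketched below. The helper names are hypothetical and exist only for this illustration, and the sketch assumes m0 behaves as proposed in this thread.

```ruby
# Hypothetical helper names, used only for this sketch.
def urlsafe_encode_sketch(bin)
  [bin].pack("m0").tr("+/", "-_")
end

def urlsafe_decode_sketch(str)
  str.tr("-_", "+/").unpack("m0").first
end

raw = "\xff\xef".b
standard = [raw].pack("m0")
urlsafe  = urlsafe_encode_sketch(raw)

puts standard
puts urlsafe
puts urlsafe_decode_sketch(urlsafe) == raw
```

Keeping the alphabet swap in library code leaves the pack count with a single job, which is the concern raised in this message.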

Ticket #471 has been updated. (by Yuki S.)

Assignee changed from Yukihiro M. to Yusuke E.

I discussed this with akr and Sasada-san, with Matz joining in.

  • Nobody benefits from changing the pack format at the cost of compatibility. That is not acceptable.
  • We would not object if the Base64 class were revived and a method conforming to the new RFC were added to it.
  • Base64.rfc4648 is an awful name, so the name needs to be worked out.

So, Endoh-san, please take it from here.

http://redmine.ruby-lang.org/issues/show/471

This is Endoh.

2008/09/24 1:57 Tanaka A. [email protected]:

> In article [email protected],
> "Yusuke ENDOH" [email protected] writes:
>
>> That is, pack("m0") would emit no line feeds, and unpack("m0") would raise
>> an exception if the input contains a line feed.
>
> I see. In that case, it looks like we can have both compatibility and
> RFC 4648 behavior without reviving base64.rb.

I believe this was originally Nakada-san's idea.

> But what about the "URL- and filename-safe variant"
> described in [ruby-dev:35904]?
>
> If we want to provide that as well, telling the variants apart by nothing
> but the integer count given to pack/unpack gets difficult after all, and
> won't we end up wanting base64.rb anyway?

Hmm. base64.rb contains some ominous methods such as decode_b and b64encode,
so personally I would rather not revive it.

For now, I have written a patch that implements m0 and
revives base64.rb with the following methods
added.

  • Base64.standard_encode64 : encoding that (supposedly) conforms
    to RFC 4648
  • Base64.standard_decode64 : decoding that (supposedly) conforms
    to RFC 4648
  • Base64.urlsafe_encode64 : encoding for the URL-safe variant
  • Base64.urlsafe_decode64 : decoding for the URL-safe variant

The standard_ prefix follows base64.py.
decode_b uses Kconv, but since I do not understand M17N well,
I have left it as it is.

Index: pack.c
===================================================================
--- pack.c (revision 19494)
+++ pack.c (working copy)
@@ -362,7 +362,7 @@
 #endif
 static const char toofew[] = "too few arguments";
 
-static void encodes(VALUE,const char*,long,int);
+static void encodes(VALUE,const char*,long,int,int);
 static void qpencode(VALUE,VALUE,long);
 
 static unsigned long utf8_to_uv(const char*,long*);
@@ -887,6 +887,11 @@
             ptr = RSTRING_PTR(from);
             plen = RSTRING_LEN(from);
 
+            if (len == 0) {
+                encodes(res, ptr, plen, type, 0);
+                ptr += plen;
+                break;
+            }
             if (len <= 2)
                 len = 45;
             else
@@ -898,7 +903,7 @@
                     todo = len;
                 else
                     todo = plen;
-                encodes(res, ptr, todo, type);
+                encodes(res, ptr, todo, type, 1);
                 plen -= todo;
                 ptr += todo;
             }
@@ -1007,7 +1012,7 @@
 "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
 
 static void
-encodes(VALUE str, const char *s, long len, int type)
+encodes(VALUE str, const char *s, long len, int type, int tail_lf)
 {
     char buff[4096];
     long i = 0;
@@ -1048,7 +1053,7 @@
         buff[i++] = padding;
         buff[i++] = padding;
     }
-    buff[i++] = '\n';
+    if (tail_lf) buff[i++] = '\n';
     rb_str_buf_cat(str, buff, i);
 }
 
@@ -1793,7 +1798,7 @@
             {
                 VALUE buf = infected_str_new(0, (send - s)*3/4, str);
                 char *ptr = RSTRING_PTR(buf);
-                int a = -1,b = -1,c = 0,d;
+                int a = -1,b = -1,c = 0,d = 0;
                 static signed char b64_xtable[256];
 
                 if (b64_xtable['/'] <= 0) {
@@ -1806,32 +1811,64 @@
                         b64_xtable[(unsigned char)b64_table[i]] = i;
                     }
                 }
-                while (s < send) {
-                    a = b = c = d = -1;
-                    while ((a = b64_xtable[(unsigned char)*s]) == -1 && s < send) {s++;}
-                    if (s >= send) break;
-                    s++;
-                    while ((b = b64_xtable[(unsigned char)*s]) == -1 && s < send) {s++;}
-                    if (s >= send) break;
-                    s++;
-                    while ((c = b64_xtable[(unsigned char)*s]) == -1 && s < send) {if (*s == '=') break; s++;}
-                    if (*s == '=' || s >= send) break;
-                    s++;
-                    while ((d = b64_xtable[(unsigned char)*s]) == -1 && s < send) {if (*s == '=') break; s++;}
-                    if (*s == '=' || s >= send) break;
-                    s++;
-                    *ptr++ = a << 2 | b >> 4;
-                    *ptr++ = b << 4 | c >> 2;
-                    *ptr++ = c << 6 | d;
-                }
-                if (a != -1 && b != -1) {
-                    if (c == -1 && *s == '=')
+                if (len == 0) {
+                    while (s < send) {
+                        a = b = c = d = -1;
+                        a = b64_xtable[(unsigned char)*s++];
+                        if (s >= send || a == -1) rb_raise(rb_eArgError, "invalid base64");
+                        b = b64_xtable[(unsigned char)*s++];
+                        if (s >= send || b == -1) rb_raise(rb_eArgError, "invalid base64");
+                        if (*s == '=') {
+                            if (s + 2 == send && *(s + 1) == '=') break;
+                            rb_raise(rb_eArgError, "invalid base64");
+                        }
+                        c = b64_xtable[(unsigned char)*s++];
+                        if (s >= send || c == -1) rb_raise(rb_eArgError, "invalid base64");
+                        if (s + 1 == send && *s == '=') break;
+                        d = b64_xtable[(unsigned char)*s++];
+                        if (d == -1) rb_raise(rb_eArgError, "invalid base64");
                         *ptr++ = a << 2 | b >> 4;
-                    else if (c != -1 && *s == '=') {
+                        *ptr++ = b << 4 | c >> 2;
+                        *ptr++ = c << 6 | d;
+                    }
+                    if (c == -1) {
                         *ptr++ = a << 2 | b >> 4;
+                        if (b & 0xf) rb_raise(rb_eArgError, "invalid base64");
+                    }
+                    else if (d == -1) {
+                        *ptr++ = a << 2 | b >> 4;
                         *ptr++ = b << 4 | c >> 2;
+                        if (c & 0x3) rb_raise(rb_eArgError, "invalid base64");
                     }
                 }
+                else {
+                    while (s < send) {
+                        a = b = c = d = -1;
+                        while ((a = b64_xtable[(unsigned char)*s]) == -1 && s < send) {s++;}
+                        if (s >= send) break;
+                        s++;
+                        while ((b = b64_xtable[(unsigned char)*s]) == -1 && s < send) {s++;}
+                        if (s >= send) break;
+                        s++;
+                        while ((c = b64_xtable[(unsigned char)*s]) == -1 && s < send) {if (*s == '=') break; s++;}
+                        if (*s == '=' || s >= send) break;
+                        s++;
+                        while ((d = b64_xtable[(unsigned char)*s]) == -1 && s < send) {if (*s == '=') break; s++;}
+                        if (*s == '=' || s >= send) break;
+                        s++;
+                        *ptr++ = a << 2 | b >> 4;
+                        *ptr++ = b << 4 | c >> 2;
+                        *ptr++ = c << 6 | d;
+                    }
+                    if (a != -1 && b != -1) {
+                        if (c == -1 && *s == '=')
+                            *ptr++ = a << 2 | b >> 4;
+                        else if (c != -1 && *s == '=') {
+                            *ptr++ = a << 2 | b >> 4;
+                            *ptr++ = b << 4 | c >> 2;
+                        }
+                    }
+                }
                 rb_str_set_len(buf, ptr - RSTRING_PTR(buf));
                 UNPACK_PUSH(buf);
             }
Index: lib/base64.rb
===================================================================
--- lib/base64.rb (revision 19466)
+++ lib/base64.rb (working copy)
@@ -40,6 +40,22 @@
 module Base64
   module_function
 
+  def standard_decode64(str)
+    str.unpack("m0")[0]
+  end
+
+  def standard_encode64(bin)
+    [bin].pack("m0")
+  end
+
+  def urlsafe_decode64(str)
+    standard_decode64(str.tr("-_", "+/"))
+  end
+
+  def urlsafe_encode64(bin)
+    standard_encode64(bin).tr("+/", "-_")
+  end
+
   # Returns the Base64-decoded version of +str+.
   #
   #   require 'base64'
Index: test/ruby/test_pack.rb
===================================================================
--- test/ruby/test_pack.rb (revision 19494)
+++ test/ruby/test_pack.rb (working copy)
@@ -379,6 +379,36 @@
     assert_equal(["\377\377\377"], "////\n".unpack("m"))
   end
 
+  def test_pack_unpack_m0
+    assert_equal("", [""].pack("m0"))
+    assert_equal("AA==", ["\0"].pack("m0"))
+    assert_equal("AAA=", ["\0\0"].pack("m0"))
+    assert_equal("AAAA", ["\0\0\0"].pack("m0"))
+    assert_equal("/w==", ["\377"].pack("m0"))
+    assert_equal("//8=", ["\377\377"].pack("m0"))
+    assert_equal("////", ["\377\377\377"].pack("m0"))
+
+    assert_equal([""], "".unpack("m0"))
+    assert_equal(["\0"], "AA==".unpack("m0"))
+    assert_equal(["\0\0"], "AAA=".unpack("m0"))
+    assert_equal(["\0\0\0"], "AAAA".unpack("m0"))
+    assert_equal(["\377"], "/w==".unpack("m0"))
+    assert_equal(["\377\377"], "//8=".unpack("m0"))
+    assert_equal(["\377\377\377"], "////".unpack("m0"))
+
+    assert_raise(ArgumentError) { "^".unpack("m0") }
+    assert_raise(ArgumentError) { "A".unpack("m0") }
+    assert_raise(ArgumentError) { "A^".unpack("m0") }
+    assert_raise(ArgumentError) { "AA".unpack("m0") }
+    assert_raise(ArgumentError) { "AA=".unpack("m0") }
+    assert_raise(ArgumentError) { "AA===".unpack("m0") }
+    assert_raise(ArgumentError) { "AA=x".unpack("m0") }
+    assert_raise(ArgumentError) { "AAA".unpack("m0") }
+    assert_raise(ArgumentError) { "AAA^".unpack("m0") }
+    assert_raise(ArgumentError) { "AB==".unpack("m0") }
+    assert_raise(ArgumentError) { "AAB=".unpack("m0") }
+  end
+
   def test_pack_unpack_M
     assert_equal("a b c\td =\n\ne=\n", ["a b c\td \ne"].pack("M"))
     assert_equal(["a b c\td \ne"], "a b c\td =\n\ne=\n".unpack("M"))

In article
[email protected],
"Yusuke ENDOH" [email protected] writes:

> Hmm. base64.rb contains some ominous methods such as decode_b and b64encode,
> so personally I would rather not revive it.

I think it is fine to simply delete those.

This is Naruse.

Yusuke ENDOH wrote:

> decode_b uses Kconv, but since I do not understand M17N well,
> I have left it as it is.

It would need to be fixed to handle languages other than Japanese.

But to begin with, this is not a Base64 decode at all;
it is a MIME decode.
I do not think the method belongs here.

It might make sense if something like mime.rb were newly created, though.

In article
[email protected],
"Yusuke ENDOH" [email protected] writes:

> After deleting those along with the deprecated code, things became much cleaner.

I see the pack/unpack documentation has not been touched.

This is Endoh.

2008/09/24 7:13 Tanaka A. [email protected]:

> In article [email protected],
> "Yusuke ENDOH" [email protected] writes:
>
>> Hmm. base64.rb contains some ominous methods such as decode_b and b64encode,
>> so personally I would rather not revive it.
>
> I think it is fine to simply delete those.

After deleting those along with the deprecated
code, things became much cleaner.

module Base64 now has the following six module_functions.

  • Base64.encode64 : RFC 2045-compliant encoding
    (inserts line feeds)
  • Base64.decode64 : RFC 2045-compliant decoding
    (ignores line feeds and other non-alphabet characters)
  • Base64.strict_encode64 : RFC 4648-compliant encoding
    (inserts no line feeds)
  • Base64.strict_decode64 : RFC 4648-compliant decoding (line feeds or
    missing = padding raise an exception)
  • Base64.urlsafe_encode64 : encoding for the URL-safe variant
  • Base64.urlsafe_decode64 : decoding for the URL-safe variant
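
The strict decoding rules in the list above can be sketched directly with unpack("m0"), assuming a Ruby that includes this patch. The inputs are taken from the cases exercised in the patch's tests, plus one with a trailing line feed.

```ruby
# Inputs the strict decoder should refuse: missing padding, a trailing
# line feed, and non-zero bits hidden under the padding.
suspicious = ["AA", "AA==\n", "AB==", "AAB="]

rejected = suspicious.select do |text|
  begin
    text.unpack("m0")
    false
  rescue ArgumentError
    true
  end
end

puts rejected.size == suspicious.size
puts "AA==".unpack("m0").first == "\0".b   # well-formed input still decodes
```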

standard_encode64 struck me as a name that is too
preoccupied with the RFC,
so I tried strict instead, which should be easier for ordinary users to understand.
If standard is preferable, I will change it back.

If this looks fine I would like to commit it. What do you think?

Index: pack.c
===================================================================
--- pack.c (revision 19526)
+++ pack.c (working copy)
@@ -362,7 +362,7 @@
 #endif
 static const char toofew[] = "too few arguments";
 
-static void encodes(VALUE,const char*,long,int);
+static void encodes(VALUE,const char*,long,int,int);
 static void qpencode(VALUE,VALUE,long);
 
 static unsigned long utf8_to_uv(const char*,long*);
@@ -887,6 +887,11 @@
             ptr = RSTRING_PTR(from);
             plen = RSTRING_LEN(from);
 
+            if (len == 0) {
+                encodes(res, ptr, plen, type, 0);
+                ptr += plen;
+                break;
+            }
             if (len <= 2)
                 len = 45;
             else
@@ -898,7 +903,7 @@
                     todo = len;
                 else
                     todo = plen;
-                encodes(res, ptr, todo, type);
+                encodes(res, ptr, todo, type, 1);
                 plen -= todo;
                 ptr += todo;
             }
@@ -1007,7 +1012,7 @@
 "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
 
 static void
-encodes(VALUE str, const char *s, long len, int type)
+encodes(VALUE str, const char *s, long len, int type, int tail_lf)
 {
     char buff[4096];
     long i = 0;
@@ -1048,7 +1053,7 @@
         buff[i++] = padding;
         buff[i++] = padding;
     }
-    buff[i++] = '\n';
+    if (tail_lf) buff[i++] = '\n';
     rb_str_buf_cat(str, buff, i);
 }
 
@@ -1793,7 +1798,7 @@
             {
                 VALUE buf = infected_str_new(0, (send - s)*3/4, str);
                 char *ptr = RSTRING_PTR(buf);
-                int a = -1,b = -1,c = 0,d;
+                int a = -1,b = -1,c = 0,d = 0;
                 static signed char b64_xtable[256];
 
                 if (b64_xtable['/'] <= 0) {
@@ -1806,32 +1811,64 @@
                         b64_xtable[(unsigned char)b64_table[i]] = i;
                     }
                 }
-                while (s < send) {
-                    a = b = c = d = -1;
-                    while ((a = b64_xtable[(unsigned char)*s]) == -1 && s < send) {s++;}
-                    if (s >= send) break;
-                    s++;
-                    while ((b = b64_xtable[(unsigned char)*s]) == -1 && s < send) {s++;}
-                    if (s >= send) break;
-                    s++;
-                    while ((c = b64_xtable[(unsigned char)*s]) == -1 && s < send) {if (*s == '=') break; s++;}
-                    if (*s == '=' || s >= send) break;
-                    s++;
-                    while ((d = b64_xtable[(unsigned char)*s]) == -1 && s < send) {if (*s == '=') break; s++;}
-                    if (*s == '=' || s >= send) break;
-                    s++;
-                    *ptr++ = a << 2 | b >> 4;
-                    *ptr++ = b << 4 | c >> 2;
-                    *ptr++ = c << 6 | d;
-                }
-                if (a != -1 && b != -1) {
-                    if (c == -1 && *s == '=')
+                if (len == 0) {
+                    while (s < send) {
+                        a = b = c = d = -1;
+                        a = b64_xtable[(unsigned char)*s++];
+                        if (s >= send || a == -1) rb_raise(rb_eArgError, "invalid base64");
+                        b = b64_xtable[(unsigned char)*s++];
+                        if (s >= send || b == -1) rb_raise(rb_eArgError, "invalid base64");
+                        if (*s == '=') {
+                            if (s + 2 == send && *(s + 1) == '=') break;
+                            rb_raise(rb_eArgError, "invalid base64");
+                        }
+                        c = b64_xtable[(unsigned char)*s++];
+                        if (s >= send || c == -1) rb_raise(rb_eArgError, "invalid base64");
+                        if (s + 1 == send && *s == '=') break;
+                        d = b64_xtable[(unsigned char)*s++];
+                        if (d == -1) rb_raise(rb_eArgError, "invalid base64");
                         *ptr++ = a << 2 | b >> 4;
-                    else if (c != -1 && *s == '=') {
+                        *ptr++ = b << 4 | c >> 2;
+                        *ptr++ = c << 6 | d;
+                    }
+                    if (c == -1) {
                         *ptr++ = a << 2 | b >> 4;
+                        if (b & 0xf) rb_raise(rb_eArgError, "invalid base64");
+                    }
+                    else if (d == -1) {
+                        *ptr++ = a << 2 | b >> 4;
                         *ptr++ = b << 4 | c >> 2;
+                        if (c & 0x3) rb_raise(rb_eArgError, "invalid base64");
                     }
                 }
+                else {
+                    while (s < send) {
+                        a = b = c = d = -1;
+                        while ((a = b64_xtable[(unsigned char)*s]) == -1 && s < send) {s++;}
+                        if (s >= send) break;
+                        s++;
+                        while ((b = b64_xtable[(unsigned char)*s]) == -1 && s < send) {s++;}
+                        if (s >= send) break;
+                        s++;
+                        while ((c = b64_xtable[(unsigned char)*s]) == -1 && s < send) {if (*s == '=') break; s++;}
+                        if (*s == '=' || s >= send) break;
+                        s++;
+                        while ((d = b64_xtable[(unsigned char)*s]) == -1 && s < send) {if (*s == '=') break; s++;}
+                        if (*s == '=' || s >= send) break;
+                        s++;
+                        *ptr++ = a << 2 | b >> 4;
+                        *ptr++ = b << 4 | c >> 2;
+                        *ptr++ = c << 6 | d;
+                    }
+                    if (a != -1 && b != -1) {
+                        if (c == -1 && *s == '=')
+                            *ptr++ = a << 2 | b >> 4;
+                        else if (c != -1 && *s == '=') {
+                            *ptr++ = a << 2 | b >> 4;
+                            *ptr++ = b << 4 | c >> 2;
+                        }
+                    }
+                }
                 rb_str_set_len(buf, ptr - RSTRING_PTR(buf));
                 UNPACK_PUSH(buf);
             }
Index: lib/base64.rb
===================================================================
--- lib/base64.rb (revision 0)
+++ lib/base64.rb (revision 0)
@@ -0,0 +1,91 @@
+#
+# = base64.rb: methods for base64-encoding and -decoding strings
+#
+
+# The Base64 module provides for the encoding (#encode64, #strict_encode64,
+# #urlsafe_encode64) and decoding (#decode64, #strict_decode64,
+# #urlsafe_decode64) of binary data using a Base64 representation.
+#
+# == Example
+#
+# A simple encoding and decoding.
+#
+#     require "base64"
+#
+#     enc   = Base64.encode64('Send reinforcements')
+#                         # -> "U2VuZCByZWluZm9yY2VtZW50cw==\n"
+#     plain = Base64.decode64(enc)
+#                         # -> "Send reinforcements"
+#
+# The purpose of using base64 to encode data is that it translates any
+# binary data into purely printable characters.
+
+module Base64
+  module_function
+
+  # Returns the Base64-encoded version of +bin+.
+  # This method complies with RFC 2045.
+  # Line feeds are added to every 60 encoded characters.
+  #
+  #    require 'base64'
+  #    Base64.encode64("Now is the time for all good coders\nto learn Ruby")
+  #
+  # Generates:
+  #
+  #    Tm93IGlzIHRoZSB0aW1lIGZvciBhbGwgZ29vZCBjb2RlcnMKdG8gbGVhcm4g
+  #    UnVieQ==
+  def encode64(bin)
+    [bin].pack("m")
+  end
+
+  # Returns the Base64-decoded version of +str+.
+  # This method complies with RFC 2045.
+  # Characters outside the base alphabet are ignored.
+  #
+  #   require 'base64'
+  #   str = 'VGhpcyBpcyBsaW5lIG9uZQpUaGlzIG' +
+  #         'lzIGxpbmUgdHdvClRoaXMgaXMgbGlu' +
+  #         'ZSB0aHJlZQpBbmQgc28gb24uLi4K'
+  #   puts Base64.decode64(str)
+  #
+  # Generates:
+  #
+  #    This is line one
+  #    This is line two
+  #    This is line three
+  #    And so on...
+  def decode64(str)
+    str.unpack("m").first
+  end
+
+  # Returns the Base64-encoded version of +bin+.
+  # This method complies with RFC 4648.
+  # No line feeds are added.
+  def strict_encode64(bin)
+    [bin].pack("m0")
+  end
+
+  # Returns the Base64-decoded version of +str+.
+  # This method complies with RFC 4648.
+  # ArgumentError is raised if +str+ is incorrectly padded or contains
+  # non-alphabet characters.  Note that CR or LF are also rejected.
+  def strict_decode64(str)
+    str.unpack("m0").first
+  end
+
+  # Returns the Base64-encoded version of +bin+.
+  # This method complies with ``Base 64 Encoding with URL and Filename Safe
+  # Alphabet'' in RFC 4648.
+  # The alphabet uses '-' instead of '+' and '_' instead of '/'.
+  def urlsafe_encode64(bin)
+    strict_encode64(bin).tr("+/", "-_")
+  end
+
+  # Returns the Base64-decoded version of +str+.
+  # This method complies with ``Base 64 Encoding with URL and Filename Safe
+  # Alphabet'' in RFC 4648.
+  # The alphabet uses '-' instead of '+' and '_' instead of '/'.
+  def urlsafe_decode64(str)
+    strict_decode64(str.tr("-_", "+/"))
+  end
+end
Index: test/ruby/test_pack.rb
===================================================================
--- test/ruby/test_pack.rb (revision 19526)
+++ test/ruby/test_pack.rb (working copy)
@@ -379,6 +379,36 @@
     assert_equal(["\377\377\377"], "////\n".unpack("m"))
   end
 
+  def test_pack_unpack_m0
+    assert_equal("", [""].pack("m0"))
+    assert_equal("AA==", ["\0"].pack("m0"))
+    assert_equal("AAA=", ["\0\0"].pack("m0"))
+    assert_equal("AAAA", ["\0\0\0"].pack("m0"))
+    assert_equal("/w==", ["\377"].pack("m0"))
+    assert_equal("//8=", ["\377\377"].pack("m0"))
+    assert_equal("////", ["\377\377\377"].pack("m0"))
+
+    assert_equal([""], "".unpack("m0"))
+    assert_equal(["\0"], "AA==".unpack("m0"))
+    assert_equal(["\0\0"], "AAA=".unpack("m0"))
+    assert_equal(["\0\0\0"], "AAAA".unpack("m0"))
+    assert_equal(["\377"], "/w==".unpack("m0"))
+    assert_equal(["\377\377"], "//8=".unpack("m0"))
+    assert_equal(["\377\377\377"], "////".unpack("m0"))
+
+    assert_raise(ArgumentError) { "^".unpack("m0") }
+    assert_raise(ArgumentError) { "A".unpack("m0") }
+    assert_raise(ArgumentError) { "A^".unpack("m0") }
+    assert_raise(ArgumentError) { "AA".unpack("m0") }
+    assert_raise(ArgumentError) { "AA=".unpack("m0") }
+    assert_raise(ArgumentError) { "AA===".unpack("m0") }
+    assert_raise(ArgumentError) { "AA=x".unpack("m0") }
+    assert_raise(ArgumentError) { "AAA".unpack("m0") }
+    assert_raise(ArgumentError) { "AAA^".unpack("m0") }
+    assert_raise(ArgumentError) { "AB==".unpack("m0") }
+    assert_raise(ArgumentError) { "AAB=".unpack("m0") }
+  end
+
   def test_pack_unpack_M
     assert_equal("a b c\td =\n\ne=\n", ["a b c\td \ne"].pack("M"))
     assert_equal(["a b c\td \ne"], "a b c\td =\n\ne=\n".unpack("M"))
Index: test/base64/test_base64.rb
===================================================================
--- test/base64/test_base64.rb (revision 0)
+++ test/base64/test_base64.rb (revision 0)
@@ -0,0 +1,99 @@
+require "test/unit"
+require "base64"
+
+class TestBase64 < Test::Unit::TestCase
+  def test_sample
+    assert_equal("U2VuZCByZWluZm9yY2VtZW50cw==\n", Base64.encode64('Send reinforcements'))
+    assert_equal('Send reinforcements', Base64.decode64("U2VuZCByZWluZm9yY2VtZW50cw==\n"))
+    assert_equal(
+      "Tm93IGlzIHRoZSB0aW1lIGZvciBhbGwgZ29vZCBjb2RlcnMKdG8gbGVhcm4g\nUnVieQ==\n",
+      Base64.encode64("Now is the time for all good coders\nto learn Ruby"))
+    assert_equal(
+      "Now is the time for all good coders\nto learn Ruby",
+      Base64.decode64("Tm93IGlzIHRoZSB0aW1lIGZvciBhbGwgZ29vZCBjb2RlcnMKdG8gbGVhcm4g\nUnVieQ==\n"))
+    assert_equal(
+      "VGhpcyBpcyBsaW5lIG9uZQpUaGlzIGlzIGxpbmUgdHdvClRoaXMgaXMgbGlu\nZSB0aHJlZQpBbmQgc28gb24uLi4K\n",
+      Base64.encode64("This is line one\nThis is line two\nThis is line three\nAnd so on...\n"))
+    assert_equal(
+      "This is line one\nThis is line two\nThis is line three\nAnd so on...\n",
+      Base64.decode64("VGhpcyBpcyBsaW5lIG9uZQpUaGlzIGlzIGxpbmUgdHdvClRoaXMgaXMgbGluZSB0aHJlZQpBbmQgc28gb24uLi4K"))
+  end
+
+  def test_encode64
+    assert_equal("", Base64.encode64(""))
+    assert_equal("AA==\n", Base64.encode64("\0"))
+    assert_equal("AAA=\n", Base64.encode64("\0\0"))
+    assert_equal("AAAA\n", Base64.encode64("\0\0\0"))
+    assert_equal("/w==\n", Base64.encode64("\377"))
+    assert_equal("//8=\n", Base64.encode64("\377\377"))
+    assert_equal("////\n", Base64.encode64("\377\377\377"))
+    assert_equal("/+8=\n", Base64.encode64("\xff\xef"))
+  end
+
+  def test_decode64
+    assert_equal("", Base64.decode64(""))
+    assert_equal("\0", Base64.decode64("AA==\n"))
+    assert_equal("\0\0", Base64.decode64("AAA=\n"))
+    assert_equal("\0\0\0", Base64.decode64("AAAA\n"))
+    assert_equal("\377", Base64.decode64("/w==\n"))
+    assert_equal("\377\377", Base64.decode64("//8=\n"))
+    assert_equal("\377\377\377", Base64.decode64("////\n"))
+    assert_equal("\xff\xef", Base64.decode64("/+8=\n"))
+  end
+
+  def test_strict_encode64
+    assert_equal("", Base64.strict_encode64(""))
+    assert_equal("AA==", Base64.strict_encode64("\0"))
+    assert_equal("AAA=", Base64.strict_encode64("\0\0"))
+    assert_equal("AAAA", Base64.strict_encode64("\0\0\0"))
+    assert_equal("/w==", Base64.strict_encode64("\377"))
+    assert_equal("//8=", Base64.strict_encode64("\377\377"))
+    assert_equal("////", Base64.strict_encode64("\377\377\377"))
+    assert_equal("/+8=", Base64.strict_encode64("\xff\xef"))
+  end
+
+  def test_strict_decode64
+    assert_equal("", Base64.strict_decode64(""))
+    assert_equal("\0", Base64.strict_decode64("AA=="))
+    assert_equal("\0\0", Base64.strict_decode64("AAA="))
+    assert_equal("\0\0\0", Base64.strict_decode64("AAAA"))
+    assert_equal("\377", Base64.strict_decode64("/w=="))
+    assert_equal("\377\377", Base64.strict_decode64("//8="))
+    assert_equal("\377\377\377", Base64.strict_decode64("////"))
+    assert_equal("\xff\xef", Base64.strict_decode64("/+8="))
+
+    assert_raise(ArgumentError) { Base64.strict_decode64("^") }
+    assert_raise(ArgumentError) { Base64.strict_decode64("A") }
+    assert_raise(ArgumentError) { Base64.strict_decode64("A^") }
+    assert_raise(ArgumentError) { Base64.strict_decode64("AA") }
+    assert_raise(ArgumentError) { Base64.strict_decode64("AA=") }
+    assert_raise(ArgumentError) { Base64.strict_decode64("AA===") }
+    assert_raise(ArgumentError) { Base64.strict_decode64("AA=x") }
+    assert_raise(ArgumentError) { Base64.strict_decode64("AAA") }
+    assert_raise(ArgumentError) { Base64.strict_decode64("AAA^") }
+    assert_raise(ArgumentError) { Base64.strict_decode64("AB==") }
+    assert_raise(ArgumentError) { Base64.strict_decode64("AAB=") }
+  end
+
+  def test_urlsafe_encode64
+    assert_equal("", Base64.urlsafe_encode64(""))
+    assert_equal("AA==", Base64.urlsafe_encode64("\0"))
+    assert_equal("AAA=", Base64.urlsafe_encode64("\0\0"))
+    assert_equal("AAAA", Base64.urlsafe_encode64("\0\0\0"))
+    assert_equal("_w==", Base64.urlsafe_encode64("\377"))
+    assert_equal("__8=", Base64.urlsafe_encode64("\377\377"))
+    assert_equal("____", Base64.urlsafe_encode64("\377\377\377"))
+    assert_equal("_-8=", Base64.urlsafe_encode64("\xff\xef"))
+  end
+
+  def test_urlsafe_decode64
+    assert_equal("", Base64.urlsafe_decode64(""))
+    assert_equal("\0", Base64.urlsafe_decode64("AA=="))
+    assert_equal("\0\0", Base64.urlsafe_decode64("AAA="))
+    assert_equal("\0\0\0", Base64.urlsafe_decode64("AAAA"))
+    assert_equal("\377", Base64.urlsafe_decode64("_w=="))
+    assert_equal("\377\377", Base64.urlsafe_decode64("__8="))
+    assert_equal("\377\377\377", Base64.urlsafe_decode64("____"))
+    assert_equal("\xff\xef", Base64.urlsafe_decode64("_+8="))
+  end
+end

This is Endoh.

2008/09/25 0:40 Tanaka A. [email protected]:

> In article [email protected],
> "Yusuke ENDOH" [email protected] writes:
>
>> After deleting those along with the deprecated code, things became much cleaner.
>
> I see the pack/unpack documentation has not been touched.

I forgot about that. It is hard to keep the wording as short as the other entries,
but how about something like this?

@@ -414,7 +414,8 @@
  *   L     |  Unsigned long
  *   l     |  Long
  *   M     |  Quoted printable, MIME encoding (see RFC2045)
- *   m     |  Base64 encoded string
+ *   m     |  Base64 encoded string (see RFC 2045, count is width)
+ *         |  (no line feed are added if count is 0, see RFC 4648)
  *   N     |  Long, network (big-endian) byte order
  *   n     |  Short, network (big-endian) byte-order
  *   P     |  Pointer to a structure (fixed-length string)

@@ -1242,7 +1248,8 @@
  * -------+---------+-----------------------------------------
  *   M    | String  | quoted-printable
  * -------+---------+-----------------------------------------
- *   m    | String  | base64-encoded
+ *   m    | String  | base64-encoded (RFC 2045) (default)
+ *        |         | base64-encoded (RFC 4648) if followed by 0
  * -------+---------+-----------------------------------------
  *   N    | Integer | treat four characters as an unsigned
  *        |         | long in network byte order


This is Yukihiro Matsumoto.

In message "Re: [ruby-dev:36525] Re: [Feature #471] pack format 'm'
based on RFC 4648"
on Thu, 25 Sep 2008 01:14:00 +0900, "Yusuke ENDOH" [email protected]
writes:

|2008/09/25 0:40 Tanaka A. [email protected]:
|> In article [email protected],
|> "Yusuke ENDOH" [email protected] writes:
|>
|>> After deleting those along with the deprecated code, things became much cleaner.
|>
|> I see the pack/unpack documentation has not been touched.
|
|I forgot about that. It is hard to keep the wording as short as the other entries,
|but how about something like this?

Please commit it, including the new base64.rb.

This is Endoh.

2008/09/25 1:19 Yukihiro M. [email protected]:

> |
> |I forgot about that. It is hard to keep the wording as short as the other entries,
> |but how about something like this?
>
> Please commit it, including the new base64.rb.

Thank you. I have committed it.