[BUG:trunk] r20625 dumps core and many strings associated with wrong encoding

This is Yugui.

As of r20625, default_external, like default_internal, returns
(rb_encoding_t*)NULL when it has not been set. As a result, running
ruby -e 1 on trunk now dumps core.

For now, I have tried fixing it. In doing so, I found that process_options
creates several objects that depend on default_external.

  • In ruby_process_options(), rb_progname set via ruby_script():
    it is reset later anyway, so it is built provisionally as ASCII-8BIT.

  • In process_options(), rb_progname set via ruby_script():
    the locale encoding is associated right afterwards, so it is
    ASCII-8BIT at construction time.

  • LOAD_PATH set by rb_init_load_path_safe

  • LOAD_PATH set by -I
    default_external appears to be the intended encoding here, so I made
    the association happen after default_external has been determined.

  • The IO opened at gem_prelude.rb:291
    I have not done anything about it, but it does not seem to be much of
    a problem?

  • LOAD_PATH entries set up by that code
    I associated these as well, together with the others, once
    default_external has been determined. Is that OK?
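
The deferred association described in the list above can be sketched at the Ruby level. This is only an illustration, assuming that strings built early are kept as ASCII-8BIT and relabelled later; the path names and the variable names are made up, and the real change lives in the C startup code.

```ruby
# Paths created before the default external encoding is known are kept as
# binary (ASCII-8BIT) strings. The path names are illustrative only.
early_paths = ["/usr/lib/ruby", "/opt/site_ruby"].map do |path|
  path.dup.force_encoding(Encoding::ASCII_8BIT)
end

# Later, once option processing has settled the encoding, tag every string.
# force_encoding only changes the label; the bytes are left untouched.
decided = Encoding.default_external
early_paths.each { |path| path.force_encoding(decided) }

p early_paths.map { |path| [path, path.encoding] }
```

Because only the label changes, doing the association late costs nothing beyond one pass over the strings.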

Yugui (Yuki S.) wrote:

For now, I have tried fixing it. In doing so, I found that process_options
creates several objects that depend on default_external.

I forgot to attach the patch.

Is this the right approach for the fix?

  • What should the encoding of LOAD_PATH be?
    default_external with -E taken into account, or should the locale
    encoding be forced?

  • How much IO is acceptable inside prelude?

  • Why is prelude executed at such an early point?

  • rb_progname was being passed to rb_gc_register_mark_object, but
    rb_progname is overwritten later. Does the new value also need
    rb_gc_register_mark_object?

This is Yukihiro Matsumoto.

In message “Re: [ruby-dev:37390] [BUG:trunk] r20625 dumps core and many
strings associated with wrong encoding”
on Thu, 11 Dec 2008 23:10:52 +0900, “Yugui (Yuki S.)”
[email protected] writes:

|As of r20625, default_external, like default_internal, returns
|(rb_encoding_t*)NULL when it has not been set. As a result, running
|ruby -e 1 on trunk now dumps core.

Oh dear. For default_external, couldn't it simply return locale_encoding
when it has not been set? Or is that not the real issue?

This is Yugui.

Yukihiro M. wrote:

This is Yukihiro Matsumoto.
Oh dear. For default_external, couldn't it simply return locale_encoding
when it has not been set? Or is that not the real issue?

It does have to return some encoding, and returning the locale encoding
looks better than hard-coding ASCII-8BIT as my patch does.
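
The fallback agreed on here can be sketched in Ruby. This is a minimal illustration rather than the interpreter's implementation: effective_external is a hypothetical helper, and nil stands in for the unset state that the C code represents as NULL.

```ruby
# Hypothetical helper: choose the external encoding to use, falling back to
# the locale encoding when nothing has been configured (nil means "unset").
def effective_external(configured)
  configured || Encoding.find("locale")
end

p effective_external(nil)                  # the locale encoding of this process
p effective_external(Encoding::Shift_JIS)  # an explicit setting wins
```

Callers then always receive a usable Encoding object, which is the property whose absence led to the core dump.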

This is Yukihiro Matsumoto.

In message “Re: [ruby-dev:37391] Re: [BUG:trunk] r20625 dumps core and
many strings associated with wrong encoding”
on Thu, 11 Dec 2008 23:18:43 +0900, “Yugui (Yuki S.)”
[email protected] writes:

|Is this the right approach for the fix?
|* What should the encoding of LOAD_PATH be?
|  default_external with -E taken into account, or should the locale
|  encoding be forced?

-E should be taken into account, but I do not think option processing
needs to be applied retroactively. That is, the contents of PATH and any
-I given before -E could, for example, be treated as locale.
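
One way to read this suggestion is sketched below as a toy model. It is not Ruby's actual option parser: label_include_paths is a hypothetical function, it only understands the attached form -Ename and the detached form -I path, and the locale is passed in as a parameter so the example is reproducible.

```ruby
# Toy model of the rule suggested above: each -I path is labelled with
# whatever encoding is in effect at the moment the option is seen.
def label_include_paths(args, locale: Encoding.find("locale"))
  current = locale
  labelled = []
  i = 0
  while i < args.length
    arg = args[i]
    if arg.start_with?("-E") && arg.length > 2
      # Only the attached form "-Ename" is handled, to keep the sketch short.
      current = Encoding.find(arg[2..-1])
    elsif arg == "-I" && args[i + 1]
      labelled << args[i + 1].dup.force_encoding(current)
      i += 1
    end
    i += 1
  end
  labelled
end

paths = label_include_paths(%w[-I before -EShift_JIS -I after],
                            locale: Encoding::UTF_8)
p paths.map { |path| [path, path.encoding] }
```

Under this reading nothing is revisited: a path keeps the label it received when it was parsed, so the order of -I and -E on the command line matters.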

|* How much IO is acceptable inside prelude?

I think it is acceptable. That said, a heavy prelude is hard to recommend
(which is why I actually dislike gem_prelude quite a bit).

|* Why is prelude executed at such an early point?

I do not really know. I feel that just before the program would be
preferable, though.

|* rb_progname was being passed to rb_gc_register_mark_object, but
|  rb_progname is overwritten later. Does the new value also need
|  rb_gc_register_mark_object?

What is passed to rb_gc_register_mark_object() is rb_argv0, isn't it?
rb_progname itself is referenced from the VM, so I do not think marking
is needed.

                            Yukihiro Matsumoto /:|)

Yukihiro M. wrote:

This is Yukihiro Matsumoto.
|Is this the right approach for the fix?
|* What should the encoding of LOAD_PATH be?
|  default_external with -E taken into account, or should the locale
|  encoding be forced?

-E should be taken into account, but I do not think option processing
needs to be applied retroactively. That is, the contents of PATH and any
-I given before -E could, for example, be treated as locale.

Ah, rb_file_systemencoding is another option, isn't it. How path names
should carry their encoding has been a troublesome question in many ways.
What should we do? For now I will make it locale. (r20656)

before:
LANG=ja_JP.UTF-8 ./ruby -Ecp932 -I tmp -e 'p $:.map{|x| [x, x.encoding]};
require "date"; p $LOADED_FEATURES.map{|x| [x, x.encoding]}'
[["/Users/yugui/src/ruby/mri/build/O0/tmp", #<Encoding:UTF-8>],
 ["/Users/yugui/varyrubies/trunk-O0/lib/ruby/gems/1.9.1/gems/evil-ruby-0.1.0/lib", #<Encoding:US-ASCII>],
 ["/Users/yugui/varyrubies/trunk-O0/lib/ruby/site_ruby/1.9.1", #<Encoding:ASCII-8BIT>],
 ["/Users/yugui/varyrubies/trunk-O0/lib/ruby/site_ruby/1.9.1/i386-darwin9.5.0", #<Encoding:ASCII-8BIT>],
 ["/Users/yugui/varyrubies/trunk-O0/lib/ruby/site_ruby", #<Encoding:ASCII-8BIT>],
 ["/Users/yugui/varyrubies/trunk-O0/lib/ruby/vendor_ruby/1.9.1", #<Encoding:ASCII-8BIT>],
 ["/Users/yugui/varyrubies/trunk-O0/lib/ruby/vendor_ruby/1.9.1/i386-darwin9.5.0", #<Encoding:ASCII-8BIT>],
 ["/Users/yugui/varyrubies/trunk-O0/lib/ruby/vendor_ruby", #<Encoding:ASCII-8BIT>],
 ["/Users/yugui/varyrubies/trunk-O0/lib/ruby/1.9.1", #<Encoding:ASCII-8BIT>],
 ["/Users/yugui/varyrubies/trunk-O0/lib/ruby/1.9.1/i386-darwin9.5.0", #<Encoding:ASCII-8BIT>],
 [".", #<Encoding:ASCII-8BIT>]]

[["enumerator.so", #<Encoding:ASCII-8BIT>],
 ["/Users/yugui/varyrubies/trunk-O0/lib/ruby/1.9.1/i386-darwin9.5.0/enc/encdb.bundle", #<Encoding:US-ASCII>],
 ["/Users/yugui/varyrubies/trunk-O0/lib/ruby/1.9.1/i386-darwin9.5.0/enc/trans/transdb.bundle", #<Encoding:US-ASCII>],
 ["/Users/yugui/varyrubies/trunk-O0/lib/ruby/1.9.1/rubygems.rb", #<Encoding:US-ASCII>],
 ["/Users/yugui/varyrubies/trunk-O0/lib/ruby/1.9.1/i386-darwin9.5.0/enc/shift_jis.bundle", #<Encoding:US-ASCII>],
 ["/Users/yugui/varyrubies/trunk-O0/lib/ruby/1.9.1/date/format.rb", #<Encoding:US-ASCII>],
 ["/Users/yugui/varyrubies/trunk-O0/lib/ruby/1.9.1/date.rb", #<Encoding:US-ASCII>]]

after:
[["/Users/yugui/src/ruby/mri/build/O0/tmp", #<Encoding:UTF-8>],
 ["/Users/yugui/varyrubies/trunk-O0/lib/ruby/gems/1.9.1/gems/evil-ruby-0.1.0/lib", #<Encoding:UTF-8>],
 ["/Users/yugui/varyrubies/trunk-O0/lib/ruby/site_ruby/1.9.1", #<Encoding:UTF-8>],
 ["/Users/yugui/varyrubies/trunk-O0/lib/ruby/site_ruby/1.9.1/i386-darwin9.5.0", #<Encoding:UTF-8>],
 ["/Users/yugui/varyrubies/trunk-O0/lib/ruby/site_ruby", #<Encoding:UTF-8>],
 ["/Users/yugui/varyrubies/trunk-O0/lib/ruby/vendor_ruby/1.9.1", #<Encoding:UTF-8>],
 ["/Users/yugui/varyrubies/trunk-O0/lib/ruby/vendor_ruby/1.9.1/i386-darwin9.5.0", #<Encoding:UTF-8>],
 ["/Users/yugui/varyrubies/trunk-O0/lib/ruby/vendor_ruby", #<Encoding:UTF-8>],
 ["/Users/yugui/varyrubies/trunk-O0/lib/ruby/1.9.1", #<Encoding:UTF-8>],
 ["/Users/yugui/varyrubies/trunk-O0/lib/ruby/1.9.1/i386-darwin9.5.0", #<Encoding:UTF-8>],
 [".", #<Encoding:UTF-8>]]

[["enumerator.so", #<Encoding:ASCII-8BIT>],
 ["/Users/yugui/varyrubies/trunk-O0/lib/ruby/1.9.1/i386-darwin9.5.0/enc/encdb.bundle", #<Encoding:US-ASCII>],
 ["/Users/yugui/varyrubies/trunk-O0/lib/ruby/1.9.1/i386-darwin9.5.0/enc/trans/transdb.bundle", #<Encoding:US-ASCII>],
 ["/Users/yugui/varyrubies/trunk-O0/lib/ruby/1.9.1/rubygems.rb", #<Encoding:UTF-8>],
 ["/Users/yugui/varyrubies/trunk-O0/lib/ruby/1.9.1/i386-darwin9.5.0/enc/shift_jis.bundle", #<Encoding:US-ASCII>],
 ["/Users/yugui/varyrubies/trunk-O0/lib/ruby/1.9.1/date/format.rb", #<Encoding:US-ASCII>],
 ["/Users/yugui/varyrubies/trunk-O0/lib/ruby/1.9.1/date.rb", #<Encoding:US-ASCII>]]
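
For a quicker read of output like the above, the entries can be grouped by the encoding they carry. This is just a convenience sketch using core Ruby methods; by_encoding is an illustrative variable name, and the result depends entirely on the interpreter and environment it runs in.

```ruby
# Group $LOAD_PATH entries by the encoding each string is labelled with.
# to_s guards against non-String entries such as Pathname objects.
by_encoding = $LOAD_PATH.group_by { |entry| entry.to_s.encoding }

by_encoding.each do |encoding, entries|
  puts "#{encoding}: #{entries.size} entries"
end
```

A load path that is consistently labelled shows up as a single group, which makes mixed results such as the "before" listing easy to spot.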

Ticket #858 has been updated. (by Yuki S.)

Status changed from Open to Closed
% Done changed from 0 to 100

Applied in changeset r20656.

http://redmine.ruby-lang.org/issues/show/858