[ruby-trunk - Bug #7646][Open] invalid byte sequence in String#each_line

Issue #7646 has been reported by yoshidam (Yoshida Masato).


Bug #7646: invalid byte sequence in String#each_line

Author: yoshidam (Yoshida Masato)
Status: Open
Priority: Normal
Assignee:
Category:
Target version:
ruby -v: ruby 2.0.0dev (2013-01-02 trunk 38676) [i686-linux]

=begin
When a separator is passed to String#each_line, an "invalid byte sequence" error is raised for strings containing non-ASCII characters.

  $ ruby -ve '"\n\u0100".each_line("\n") {|l| p l }'
  ruby 2.0.0dev (2013-01-02 trunk 38676) [i686-linux]
  "\n"
  -e:1:in `each_line': invalid byte sequence in UTF-8 (ArgumentError)
          from -e:1:in

This appears to be a bug introduced by the changes around r38616.

  --- string.c.org 2012-12-27 21:57:07.000000000 +0900
  +++ string.c 2013-01-02 23:36:47.000000000 +0900
  @@ -6199,14 +6199,14 @@
           if (c == newline &&
               (rslen <= 1 ||
                (pend - p >= rslen && memcmp(RSTRING_PTR(rs), p, rslen) == 0))) {
  -            p += (rslen ? rslen : n);
  -            line = rb_str_subseq(str, s - ptr, p - s);
  +            const char *pp = p + (rslen ? rslen : n);
  +            line = rb_str_subseq(str, s - ptr, pp - s);
               if (wantarray)
                   rb_ary_push(ary, line);
               else
                   rb_yield(line);
               str_mod_check(str, ptr, len);
  -            s = p;
  +            s = pp;
           }
           p += n;
       }

=end
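
To see why the patch helps, the separator branch can be modeled in plain Ruby. This is only a simplified sketch under stated assumptions, not the code in string.c: it assumes a non-empty separator (no paragraph mode), assumes `newline` is the first character of the separator, and assumes each loop iteration decodes one character at `p` to obtain `c` and `n`. The names `codepoint_len`, `scan_lines`, `pos`, `line_start`, and `after_sep` are hypothetical; the last three stand in for `p`, `s`, and `pp` in the diff.

```ruby
# Decode the character starting at byte offset pos.
# Returns [codepoint, byte_length]; raises if pos is not at a character boundary.
def codepoint_len(str, pos)
  (1..4).each do |n|
    chunk = str.byteslice(pos, n)
    return [chunk.ord, n] if chunk.bytesize == n && chunk.valid_encoding?
  end
  raise ArgumentError, "invalid byte sequence in #{str.encoding}"
end

# Simplified model of the scan loop for a non-empty separator rs.
# buggy: true  models the removed lines (p is advanced past the separator,
#              then advanced again by n at the bottom of the loop).
# buggy: false models the added lines (a separate pp marks the line end).
def scan_lines(str, rs, buggy:)
  pend = str.bytesize
  rslen = rs.bytesize
  newline = rs[0].ord          # assumption: first character of the separator
  pos = line_start = 0         # p and s in the C code
  while pos < pend
    c, n = codepoint_len(str, pos)
    if c == newline && (rslen <= 1 || str.byteslice(pos, rslen) == rs)
      if buggy
        pos += rslen
        yield str.byteslice(line_start, pos - line_start)
        line_start = pos
      else
        after_sep = pos + rslen  # pp in the patch
        yield str.byteslice(line_start, after_sep - line_start)
        line_start = after_sep
      end
    end
    pos += n                   # in buggy mode this skips past the line start
  end
  yield str.byteslice(line_start, pend - line_start) if line_start < pend
end

text = "\n\u0100"
scan_lines(text, "\n", buggy: false) { |line| p line }
begin
  scan_lines(text, "\n", buggy: true) { |line| p line }
rescue ArgumentError => e
  puts "buggy scan raised: #{e.message}"
end
```

In the buggy variant, the position is moved past "\n" and then moved one more byte, so the next decode starts on the second byte of U+0100 and fails after the first line has already been yielded. With a separate end-of-line pointer, the scan position only advances by `n` and stays on a character boundary.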

Issue #7646 has been updated by kosaki (Motohiro KOSAKI).

Category set to core
Status changed from Open to Assigned
Assignee set to nobu (Nobuyoshi N.)
Priority changed from Normal to High
Target version set to 2.0.0

This looks like a regression no matter how you look at it.
I'll tag it for 2.0.0.

Bug #7646: invalid byte sequence in String#each_line

Author: yoshidam (Yoshida Masato)
Status: Assigned
Priority: High
Assignee: nobu (Nobuyoshi N.)
Category: core
Target version: 2.0.0
ruby -v: ruby 2.0.0dev (2013-01-02 trunk 38676) [i686-linux]
