[Open] String#codepoints doesn't respect BOM on UTF-{16,32} pseudo encodings

Issue #9415 has been reported by Nobuyoshi N.


Bug #9415: String#codepoints doesn't respect BOM on UTF-{16,32} pseudo
encodings
https://bugs.ruby-lang.org/issues/9415

  • Author: Nobuyoshi N.
  • Status: Open
  • Priority: Normal
  • Assignee: Yui NARUSE
  • Category: M17N
  • Target version: current: 2.2.0
  • ruby -v: r44601
  • Backport: 1.9.3: UNKNOWN, 2.0.0: UNKNOWN, 2.1: REQUIRED

String#codepoints does not take the BOM into account for the UTF-16 and UTF-32 encodings.

$ ruby -e 'puts "%x" % "\u{feff}".encode("UTF-16BE").force_encoding("UTF-16").codepoints'
feff
$ ruby -e 'puts "%x" % "\u{feff}".encode("UTF-16LE").force_encoding("UTF-16").codepoints'
fffe

The same applies to String#ord and similar methods.

$ ruby -e 'printf "%x\n", "\u{feff}".encode("UTF-16BE").force_encoding("UTF-16").ord'
feff
$ ruby -e 'printf "%x\n", "\u{feff}".encode("UTF-16LE").force_encoding("UTF-16").ord'
fffe
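
For illustration only, here is a rough sketch in plain Ruby of what BOM-aware behaviour could look like. bom_aware_codepoints is a hypothetical helper, not an existing method and not the proposed fix. Treating a missing BOM as big-endian is an assumption made for this sketch.

```ruby
# Hypothetical helper (not part of Ruby): for a string tagged with the
# UTF-16 or UTF-32 pseudo encoding, pick a concrete byte order from the
# leading BOM and return the codepoints decoded in that byte order.
def bom_aware_codepoints(str)
  bytes = str.b  # binary copy, used only to inspect the leading bytes

  concrete =
    case str.encoding
    when Encoding::UTF_16
      # FF FE means little-endian; FE FF or no BOM is treated as
      # big-endian (assumption of this sketch).
      bytes.start_with?("\xFF\xFE".b) ? Encoding::UTF_16LE : Encoding::UTF_16BE
    when Encoding::UTF_32
      # FF FE 00 00 means little-endian; anything else is treated as
      # big-endian (assumption of this sketch).
      bytes.start_with?("\xFF\xFE\x00\x00".b) ? Encoding::UTF_32LE : Encoding::UTF_32BE
    else
      str.encoding  # other encodings are left as they are
    end

  # Re-tag a copy so the caller's string is not modified.
  str.dup.force_encoding(concrete).codepoints
end

le = "\u{feff}".encode("UTF-16LE").force_encoding("UTF-16")
be = "\u{feff}".encode("UTF-16BE").force_encoding("UTF-16")
puts "%x" % bom_aware_codepoints(le)
puts "%x" % bom_aware_codepoints(be)
```

The BOM itself is kept in the result as U+FEFF, matching the output expected in the examples above; dropping it would be a separate design decision.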
