String#codepoints doesn't respect BOM on UTF-{16, 32} pseudo encodings

Issue #9415 has been updated by Yui NARUSE.

Backport changed from 1.9.3: REQUIRED, 2.0.0: REQUIRED, 2.1: REQUIRED to
1.9.3: REQUIRED, 2.0.0: REQUIRED, 2.1: DONE

r45074


Bug #9415: String#codepoints doesn’t respect BOM on UTF-{16,32} pseudo
encodings

  • Author: Nobuyoshi N.
  • Status: Closed
  • Priority: Normal
  • Assignee: Yui NARUSE
  • Category: M17N
  • Target version: current: 2.2.0
  • ruby -v: -
  • Backport: 1.9.3: REQUIRED, 2.0.0: REQUIRED, 2.1: DONE

String#codepoints does not take the BOM into account for UTF-16 and UTF-32.

$ ruby -e 'puts "%x" % "\u{feff}".encode("UTF-16BE").force_encoding("UTF-16").codepoints'
feff
$ ruby -e 'puts "%x" % "\u{feff}".encode("UTF-16LE").force_encoding("UTF-16").codepoints'
fffe

The same applies to String#ord and similar methods.

$ ruby -e 'printf "%x\n", "\u{feff}".encode("UTF-16BE").force_encoding("UTF-16").ord'
feff
$ ruby -e 'printf "%x\n", "\u{feff}".encode("UTF-16LE").force_encoding("UTF-16").ord'
fffe
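
For illustration only: in the little-endian cases above, the bytes FF FE are
read as big-endian and reported as fffe, where a BOM-aware reading would give
feff. The sketch below shows one way a caller might get BOM-aware code points
without relying on how a given Ruby version treats the UTF-16/UTF-32 pseudo
encodings. The helper name bom_aware_codepoints is hypothetical and is not the
change made in r45074; it inspects the leading bytes itself and re-tags a
binary copy of the string with the concrete byte order. Keeping the BOM as the
first code point and raising when no BOM is present are choices made for this
sketch.

```ruby
# Hypothetical helper (not part of Ruby): return code points for a string
# tagged with the UTF-16 or UTF-32 pseudo encoding, choosing the byte order
# from the leading BOM bytes.
def bom_aware_codepoints(str)
  candidates =
    case str.encoding
    when Encoding::UTF_16
      { "\xFE\xFF".b => Encoding::UTF_16BE,
        "\xFF\xFE".b => Encoding::UTF_16LE }
    when Encoding::UTF_32
      { "\x00\x00\xFE\xFF".b => Encoding::UTF_32BE,
        "\xFF\xFE\x00\x00".b => Encoding::UTF_32LE }
    else
      # Concrete encodings need no BOM detection.
      return str.codepoints
    end

  bytes = str.b # binary copy, so the caller's string is left untouched
  _bom, actual = candidates.find { |prefix, _| bytes.start_with?(prefix) }
  # Assumption for this sketch: a missing BOM is treated as an error.
  raise ArgumentError, "no BOM found for #{str.encoding}" unless actual

  # The BOM itself is kept as the first code point (U+FEFF).
  bytes.force_encoding(actual).codepoints
end

le = "\u{feff}".encode("UTF-16LE").force_encoding("UTF-16")
be = "\u{feff}".encode("UTF-16BE").force_encoding("UTF-16")
printf "%x\n", bom_aware_codepoints(le).first
printf "%x\n", bom_aware_codepoints(be).first
```

An ord-style lookup can be written as bom_aware_codepoints(str).first, as in
the last two lines.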