Ruby 1.9 - ArgumentError: incompatible encoding regexp match (US-ASCII regexp with ISO-2022-JP string)

Hiya all,

I am testing TMail against the latest Ruby (1.9.0 downloaded last
night), and have come up against this problem:

ArgumentError: incompatible encoding regexp match (US-ASCII regexp
with ISO-2022-JP string)
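
For anyone who wants to see this without TMail installed, the sketch below seems to trigger the same kind of failure in isolation. The byte sequence (a single ISO-2022-JP character) and the helper name regexp_match_error are made up for illustration and are not part of TMail. Newer Rubies, as far as I know, raise Encoding::CompatibilityError here rather than ArgumentError, so the helper rescues StandardError and just returns the message.

```ruby
# Bytes for ESC $ B $ " ESC ( B, i.e. one hiragana character in ISO-2022-JP.
jis = [0x1b, 0x24, 0x42, 0x24, 0x22, 0x1b, 0x28, 0x42].pack('C*')
jis.force_encoding('ISO-2022-JP')

# Hypothetical helper: returns the exception message, or nil if the match ran.
def regexp_match_error(str, pattern)
  str =~ pattern
  nil
rescue StandardError => e
  e.message
end

# A plain ASCII-only regexp literal, matched against the tagged string.
tagged_message = regexp_match_error(jis, /[()<>]/)

# The same bytes tagged as binary instead of ISO-2022-JP.
binary_message = regexp_match_error(jis.dup.force_encoding(Encoding::BINARY), /[()<>]/)

puts tagged_message.inspect
puts binary_message.inspect
```

The point is only that the bytes are identical in both calls; what changes is the encoding the string is tagged with.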

Now, I can think of a couple of ways to do this, but has anyone else
run into this problem and found a nice, elegant solution?

I don't really want to set the regexp to UTF-8 or something and then
transliterate the match strings. I don't think that will scale when
you are talking about emails, which can have almost anything in them,
and making a regexp for every encoding type isn't the solution either.

This only comes up in the 1.9.0 from last night; the 1.9.0 from about
January does not have this issue.

The method that is failing is:

def encode_value( str )
  str.gsub(TOKEN_UNSAFE) {|s| '%%%02x' % s[0] }
end

And TOKEN_UNSAFE is defined as:

tspecial     = %Q|()<>[];:\\,"/?=|
lwsp         = %Q| \t\r\n|
control      = %Q|\x00-\x1f\x7f-\xff|

TOKEN_UNSAFE  = /[#{Regexp.quote tspecial}#{control}#{lwsp}]/n

Which already has the ‘n’ switch…
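
One of the ways I can think of is to treat the value as raw bytes before matching. The sketch below is only that, a sketch: encode_value_binary is a hypothetical name, not TMail's actual fix, and the regexp is my hand-written literal form of tspecial + control + lwsp from the definitions above. It uses s.ord because in 1.9 s[0] returns a one-character string rather than an integer.

```ruby
# Hypothetical variant of encode_value; an assumption, not TMail's code.
def encode_value_binary(str)
  # tspecial, control characters, and linear whitespace as one byte class.
  unsafe = /[()<>\[\];:\\,"\/?=\x00-\x20\x7f-\xff]/n
  # Work on a byte-oriented copy so the declared encoding no longer matters.
  bytes = str.dup.force_encoding(Encoding::BINARY)
  # Each match is a one-byte string; String#ord gives its byte value.
  bytes.gsub(unsafe) { |s| '%%%02x' % s.ord }
end

# Same illustrative ISO-2022-JP bytes: ESC $ B $ " ESC ( B
jis = [0x1b, 0x24, 0x42, 0x24, 0x22, 0x1b, 0x28, 0x42].pack('C*')
jis.force_encoding('ISO-2022-JP')

puts encode_value_binary(jis)
puts encode_value_binary('a b')
```

The obvious cost is that the result comes back tagged as binary, so the caller has to decide what encoding to put back on it.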

And the failing test is in test_encode.rb (for anyone with TMail
installed) and looks like this:

def test_s_encode
  SRCS.each_index do |i|
    assert_equal crlf(OK[i]),
                 TMail::Encoder.encode(NKF.nkf('-j', SRCS[i]))
  end
end

def crlf( str )
  str.gsub(/\n|\r\n|\r/) { "\r\n" }
end

Which is using the string:

SRCS = ["a cde \343\201\202\343\201\204\343\201\206\343\201\210\343\201\212\343\201\202\343\201\204\343\201\206\343\201\210\343\201\212\343\201\202\343\201\204\343\201\206\343\201\210\343\201\212\343\201\202\343\201\204\343\201\206\343\201\210\343\201\212\343\201\202\343\201\204\343\201\206\343\201\210\343\201\212"]

To match against:

OK = [
  "a cde =?iso-2022-jp?B?GyRCJCIkJCQmJCgkKiQiJCQkJiQoJCokIiQkJCYkKCQqJCIbKEI=?=\n\t=?iso-2022-jp?B?GyRCJCQkJiQoJCokIiQkJCYkKCQqGyhC?=", #1
]

Regards

Mikel