Forum: Ruby Character encoding in 1.9

Andrew S. (andrew_s)
on 2012-12-24 21:34
I have some code in 1.8 that would strip certain special characters out
of a string (escape sequences in Telnet specifically, such as

I've moved to 1.9 and now that code fails with "invalid multibyte

I've looked at some of the tutorials regarding encodings in 1.9 and to
be honest I find them very intimidating.  I'm hoping someone can suggest
a quick fix?

Here's the code:

[3,4,5,6,7].each do |n|
    line.gsub!(/#{('\377\37' + n.to_s + '\001')}/,"")
end

Many thanks in advance!

 - Andrew
Nathan Beyer (Guest)
on 2012-12-25 05:29
(Received via mailing list)
Check out the Regexp documentation, specifically the 'new'/'compile'
method. I think you'll need to use Regexp.compile to create an
"encoding-less" regular expression.

I would guess something like this -

[3,4,5,6,7].each do |n|
    line.gsub!(Regexp.compile('\377\37' + n.to_s + '\001', nil, 'n'),"")
end

You'll have to fiddle with it, as I didn't have a chance to test it out
to see if this actually works. I'd have to know the encoding of the 'line'
variable to accurately test it anyway.
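
For anyone trying the "encoding-less" idea on a more recent Ruby, here is a
minimal sketch. It passes the Regexp::NOENCODING flag as the second argument
in place of the 'n' third argument, and it works on a binary copy of the
string made with String#b, so the encoding of 'line' no longer matters. The
helper name strip_negotiation and the sample bytes are my own assumptions,
not something from the original code.

```ruby
# Sketch only: strip_negotiation is a hypothetical helper name, and the
# sample bytes are assumed from the patterns in the question.
def strip_negotiation(line)
  binary = line.b # copy tagged ASCII-8BIT, so invalid UTF-8 bytes are fine
  (3..7).each do |n|
    # Single quotes keep the backslashes literal, so the regexp engine
    # (not the string parser) interprets the \x escapes.
    source = '\xFF' + format('\x%02X', 0xF8 + n) + '\x01'
    binary.gsub!(Regexp.new(source, Regexp::NOENCODING), "")
  end
  binary
end

sample = "login:\xFF\xFB\x01 ok\xFF\xFD\x01"
result = strip_negotiation(sample)
p result
```

The result is still tagged ASCII-8BIT; if the remaining bytes are known to be
valid text, force_encoding can relabel it afterwards.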
7stud -- (7stud)
on 2012-12-25 09:15
line = "hello\377\373\001 \377\374\001 \377\375\001world"

(3..7).each do |n|
  pattern = "\377" << (0370 + n).chr << "\001"
  line.gsub! pattern, " "
  p line
end

"hello  \377\374\001 \377\375\001world"
"hello    \377\375\001world"
"hello     world"
"hello     world"
"hello     world"
7stud -- (7stud)
on 2012-12-25 10:09
line = "hello\377\373\001 \377\374\001 \377\375\001world"
p line

regex = /\377[\373-\377]\001/

line.gsub! regex, ''
p line

"hello\xFF\xFB\x01 \xFF\xFC\x01 \xFF\xFD\x01world"
"hello  world"
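
A single-pass variant that should also work when the source file is UTF-8
(where, as far as I can tell, a regexp literal containing raw \377 escapes is
rejected with "invalid multibyte escape"): mark the literal with the /n flag
and match it against a binary copy of the string. This is only a sketch; the
constant name TELNET_OPTION is an assumption, and the sample line is the one
used in the posts above.

```ruby
# Sketch only: TELNET_OPTION is an illustrative name, not from the thread.
# The /n flag makes the regexp ASCII-8BIT, so \xFF is a plain byte.
TELNET_OPTION = /\xFF[\xFB-\xFF]\x01/n

line = "hello\xFF\xFB\x01 \xFF\xFC\x01 \xFF\xFD\x01world"

# String#b returns a binary (ASCII-8BIT) copy, so the invalid UTF-8 bytes
# in line do not raise during matching; line itself is left untouched.
clean = line.b.gsub(TELNET_OPTION, "")
p clean # prints "hello  world"
```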