Gsub problem

lalawawa · January 25, 2010, 5:50pm

I’m trying to use gsub to put escapes before single quotes.

irb(main):001:0> a = "Ain't Can't"
=> "Ain't Can't"
irb(main):002:0> a.gsub(/(')/, '\1')
=> "Ain't Can't"
irb(main):003:0> a.gsub(/(')/, '\\\1')
=> "Ain\\1t Can\\1t"
irb(main):004:0> a.gsub(/(')/, '\\\\1')
=> "Ain\\1t Can\\1t"
irb(main):005:0> a.gsub(/(')/, '\\\\\1')
=> "Ain\\'t Can\\'t"
irb(main):006:0>

How does one do it?

lalawawa · January 25, 2010, 6:03pm

Actually, I oversimplified my example. I want to escape single quotes
and question marks.

irb(main):001:0> a = "Can't we?"
=> "Can't we?"
irb(main):002:0> a.gsub(/(['?])/, '\1')
=> "Can't we?"
irb(main):003:0> a.gsub(/(['?])/, '\\\1')
=> "Can\\1t we\\1"
irb(main):004:0> a.gsub(/(['?])/, '\\\\1')
=> "Can\\1t we\\1"
irb(main):005:0> a.gsub(/(['?])/, '\\\\\1')
=> "Can\\'t we\\?"
irb(main):006:0>

lalawawa · January 25, 2010, 6:30pm

On 2010-01-25, lalawawa wrote:

irb(main):005:0> a.gsub(/(['?])/, '\\\\\1')
=> "Can\\'t we\\?"

This is probably what you want. Keep in mind that irb prints the result
in its inspect form, so each single backslash in the string is displayed
as an escaped (doubled) backslash.
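
A minimal sketch of that difference, assuming a reasonably recent Ruby (the variable names are only illustrative): inspect is what irb prints after "=>", while puts writes the characters the string really contains.

```ruby
# A string holding exactly two characters: one backslash and one single quote.
s = '\\' + "'"

inspected = s.inspect   # the form irb prints after "=>"
puts inspected          # "\\'"   (the backslash is shown doubled)
puts s                  # \'      (the actual contents)
puts s.length           # 2
```

So a doubled backslash in irb's "=>" line does not mean the string holds two backslashes.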

-s

lalawawa · January 25, 2010, 6:52pm

Seebs wrote:

On 2010-01-25, lalawawa wrote:

irb(main):005:0> a.gsub(/(['?])/, '\\\\\1')
=> "Can\\'t we\\?"

This is probably what you want – keep in mind that irb will show
you escaped backslashes if there’s a single backslash in the string.

-s

You’re right.

irb(main):009:0> puts a.gsub(/(['?])/, '\\\\\1')
Can\'t we\?
=> nil
irb(main):010:0>

Thanks

lalawawa · January 25, 2010, 7:03pm

On Jan 25, 2010, at 12:50 PM, lalawawa wrote:

irb(main):009:0> puts a.gsub(/(['?])/, '\\\\\1')
Can\'t we\?
=> nil
irb(main):010:0>

Thanks

You can get rid of a few backslashes if you like:

$ irb --simple-prompt
>> a = "Can't we?"
=> "Can't we?"
>> puts a.gsub(/['?]/) { |c| "\\#{c}" }
Can\'t we\?
=> nil

Mike

–

Mike S.
http://www.stok.ca/~mike/

The “`Stok’ disclaimers” apply.

lalawawa · January 25, 2010, 7:35pm

On Jan 25, 2010, at 12:00 PM, lalawawa wrote:

=> "Can\\1t we\\1"
irb(main):005:0> a.gsub(/(['?])/, '\\\\\1')
=> "Can\\'t we\\?"

(Well, Mike’s answer steals my thunder a bit, but I’ll put this out
there for anyone else that finds this thread.)

You just have to remember that there are two levels of confusion, uh, I
mean escaping, going on here.

Within single-quoted strings, the only backslash-escape is \' for a
literal ' and, so that you can still have a literal \, you also need to
escape the escape character itself with \\:

'\1' means a backslash and a digit (because the character following
the \ is not ' or \, the \ doesn't have its special meaning)
'\\1' means an escaped backslash and a digit (yes, the same as above,
but this time the first \ escapes the second \ so that it is
interpreted literally)
'\\\1' means an escaped backslash, a backslash, and a digit (you see
where this is going, right?)
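
The three cases above can be checked directly. This is just a sketch with made-up variable names; it only looks at what each single-quoted literal contains.

```ruby
# Each literal is written with a different number of backslashes;
# length and chars reveal what the string really holds.
one   = '\1'     # \ is not followed by ' or \, so it stays a backslash
two   = '\\1'    # \\ collapses to a single backslash
three = '\\\1'   # \\ collapses to one backslash, then \1 stays as it is

p one.length     # 2
p two.length     # 2
p one == two     # true
p three.length   # 3
p three.chars    # ["\\", "\\", "1"]  (inspect doubles each backslash)
```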

Now, within the replacement string for a gsub, a backslash followed by
a digit n means "the n-th parenthesized group".
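
As a small illustration of group references on their own, with no extra backslashes involved (the name and pattern here are invented for the example):

```ruby
name = "Jane Doe"

# \1 is the first parenthesized group, \2 the second.
swapped = name.gsub(/(\w+) (\w+)/, '\2, \1')
puts swapped   # Doe, Jane
```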

So you want to replace with "a backslash and a backslash-digit for the
first group". You need the final interpretation of the replacement,
the one gsub sees, to end up as \\ \1 (no quotes here to confuse
things).

'\\' is a literal \
'\\\\' is a literal \\ so gsub will see a real \
'\1' is \1 because 1 isn't a special character
SO...
'\\\\\1' is seen as \\\1 (as the argument) and gsub interprets it as
"literal backslash, first group"

'\\\\\\1' is seen the same way; the fifth backslash escapes the sixth,
then the 1
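
Here is a sketch that checks both levels of this walk-through, assuming the same example string as earlier in the thread (the names five, six and escaped are made up for the example):

```ruby
a = "Can't we?"

five = '\\\\\1'    # five backslashes in the source
six  = '\\\\\\1'   # six backslashes in the source

# Level 1: both literals produce the same four characters: \ \ \ 1
p five == six      # true
p five.length      # 4

# Level 2: gsub reads \\ as a literal backslash and \1 as the first group.
escaped = a.gsub(/(['?])/, five)
puts escaped       # Can\'t we\?
```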

You might consider using the block form of gsub replacement.

irb> a = "Can't we?"
=> "Can't we?"
irb> a.gsub(/(['?])/, '\\\\\1')
=> "Can\\'t we\\?"
irb> puts _
Can\'t we\?
=> nil
irb> a.gsub(/['?]/) {|m| '\\' + m}
=> "Can\\'t we\\?"
irb> puts _
Can\'t we\?
=> nil

In your case, the clarity that results from the simplification of the
replacement should be obvious.

-Rob

Rob B. http://agileconsultingllc.com

lalawawa · January 27, 2010, 2:05am

Most excellent and informative posts, Rob and Mike. I’ve saved them
both to my “best of Ruby” folder.