Gsub! and quoting question

I thought this would be a trivial task, but it seems to be more
difficult than expected:

I have a variable, data, containing some string. I would like to
prepend a backslash to every single apostrophe in this string. Here is
my solution:

data.gsub!(/'/,%q(\'))

Strangely, this does not work. I tested it with data containing a string
consisting solely of a single quote, and surrounded the code with "puts"
like this:

data=%q(')
puts ":"+data+" replace quotes by "+%q(\')
data.gsub!(/'/,%q(\'))
puts "data length now #{data.length}"

This produced as output:

:' replace quotes by \'
data length now 0

From this I conclude that the gsub! had shortened the string to length
zero.
Any explanation for this? How do I solve my problem in a proper way?

Ronald

2007/9/10, Ronald F. [email protected]:

From this I conclude that the gsub! had shortened the string to length
zero.
Any explanation for this? How do I solve my problem in a proper way?

This comes up frequently. You need to be aware that there are several
levels of interpretation involved and so you need multiple levels of
escaping. This is sometimes obscured by the fact that '\1' works
although it should read '\\1'. These levels are: 1. string escaping,
2. regexp replacement string escaping. All these variants do work:

str.gsub(/'/, '\\\\\\&')
str.gsub(/'/, '\\\\\'')
str.gsub(/'/, "\\\\'")
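
As a quick check of the variants above, here is a minimal sketch. The sample string and the variable names a, b and c are only illustrative; each call is expected to put exactly one literal backslash in front of the apostrophe.

```ruby
str = "it's"

# Replacement text after string parsing is \\\& : literal backslash, then the match.
a = str.gsub(/'/, '\\\\\\&')
# Replacement text after string parsing is \\' : literal backslash, then a quote.
b = str.gsub(/'/, '\\\\\'')
# Same replacement text as above, written with double quotes.
c = str.gsub(/'/, "\\\\'")

puts a, b, c   # all three lines read it\'s
```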

Kind regards

robert

This comes up frequently.

I’m sorry if I posted a question which has been answered already :-(

You need to be aware that there are several
levels of interpretation involved and so you need multiple levels of
escaping. This is sometimes obscured by the fact that '\1' works
although it should read '\\1'. These levels are: 1. string escaping,
2. regexp replacement string escaping.

Hmmm… I would understand your argument if I had a digit following
a backslash (which would mean a backreference to the n-th group in
the pattern), but here we have \', and this is not a back reference,
is it? At least the discussion of String#gsub at
http://www.ruby-doc.org/core/
mentions only \1,\2,… etc. as back references.

Ronald

No. You want the backslash to be interpreted literally by the regexp
engine, i.e. you want to prevent the regexp engine from identifying
the backslash as a meta character.

OK

This is independent from the
character after the backslash.

Ah, that’s the point I was missing. Now things are clear!

Btw, not only digits are meaningful in replacement patterns, but
single quote is as well (I believe it’s the postmatch):

I already suspected this too (knowing this concept from Perl), but
didn’t find anything in the Ruby online docs where this would be
discussed. But this explains why my string was replaced by the empty
string…
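
A minimal sketch of what happened in the original test, plus one way to sidestep the issue: when gsub is given a block, the block's return value is inserted as-is and no backslash sequences are interpreted in it. The variable names emptied and escaped are only illustrative.

```ruby
data = %q(')                       # a string holding a single apostrophe

# %q(\') is the two characters \' which gsub reads as "text after the match".
emptied = data.gsub(/'/, %q(\'))
puts emptied.length                # 0, because nothing follows the match

# Block form: the returned string is used literally.
escaped = data.gsub(/'/) { "\\'" }
puts escaped                       # \'
```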

Ronald

On Sep 10, 7:55 am, “Robert K.” [email protected]
wrote:

All these variants do work:

str.gsub(/'/, '\\\\\\&')
str.gsub(/'/, '\\\\\'')
str.gsub(/'/, "\\\\'")

str.gsub(/(?=')/, "\\")
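
A short sketch of why the lookahead variant works, with an illustrative sample string: (?=') matches the empty position in front of each apostrophe without consuming it, so the replacement only has to supply the backslash. This relies on a lone trailing backslash in the replacement string being kept literally.

```ruby
str = "don't say 'hi'"

# The lookahead consumes nothing, so each apostrophe stays where it is
# and a backslash is inserted at the matched position before it.
result = str.gsub(/(?=')/, "\\")
puts result   # don\'t say \'hi\'
```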

2007/9/10, Ronald F. [email protected]:

Hmmm… I would understand your argument if I had a digit following
a backslash (which would mean a backreference to the n-th group in
the pattern), but here we have \', and this is not a back reference,
is it? At least the discussion of String#gsub at
http://www.ruby-doc.org/core/
mentions only \1,\2,… etc. as back references.

No. You want the backslash to be interpreted literally by the regexp
engine, i.e. you want to prevent the regexp engine from identifying
the backslash as a meta character. This is independent from the
character after the backslash.

First, you need to escape every backslash in a string so it is
inserted literally (makes 2 backslashes). Then you need to escape a
backslash because otherwise the regexp engine will interpret it (makes
2*2=4 backslashes). Finally you need an additional backslash to ensure
the single quote is inserted into the string.
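
The counting above can be traced level by level. This is only a sketch; the name literal and the sample input are illustrative.

```ruby
literal = '\\\\\''    # typed in the source: 5 backslashes followed by a quote

# Level 1: the Ruby parser turns each \\ into \ and \' into '
puts literal          # \\'
puts literal.length   # 3 (backslash, backslash, quote)

# Level 2: gsub reads \\ in the replacement as one literal backslash
result = "a'b".gsub(/'/, literal)
puts result           # a\'b
```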

Btw, not only digits are meaningful in replacement patterns, but
single quote is as well (I believe it’s the postmatch):

irb(main):012:0> puts "abc".gsub(/b/, '\\\'')
acc
=> nil
irb(main):013:0> puts "abc".gsub(/b/, '\\`')
aac
=> nil
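
For comparison, a small sketch of the non-digit sequences side by side. They mirror Perl's $`, $& and $'; the variable names are only illustrative.

```ruby
s = "abc"

pre   = s.gsub(/b/, '\\`')    # \` inserts the text before the match
match = s.gsub(/b/, '\\&')    # \& inserts the match itself
post  = s.gsub(/b/, '\\\'')   # \' inserts the text after the match
slash = s.gsub(/b/, '\\\\')   # \\ inserts one literal backslash

puts pre, match, post, slash  # aac, abc, acc, a\c
```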

Cheers

robert