Strings, postgresql and gsub

Hi,

I’m somewhat stumped at the moment due to problems with inserting
strings in postgresql containing 's.

Normally, I’d just add a \ to escape the ’ and everything would be just
fine.
But for some reason, Ruby won’t let me do this.

ruby 1.8.6 (2007-09-24 patchlevel 111) [i686-linux]

irb(main):033:0* s = “blah’blah”
=> “blah’blah”
irb(main):034:0> s.gsub(/’/, “\’”)
=> “blahblahblah”
irb(main):035:0> s.gsub("’", “\’”)
=> “blahblahblah”

I’ve seen myself blind on this one …and I still can’t see where the
problem is. Any clues ?

On May 29, 2008, at 10:47, J-H Johansen wrote:

irb(main):033:0* s = “blah’blah”
=> “blah’blah”
irb(main):034:0> s.gsub(/’/, “\’”)
=> “blahblahblah”

String#gsub interprets the backslashes (for use with e.g. \1). In this
case it’s replaced with everything after the '. Escape it doubly to ge
t what you want:

s = “blah’blah”
=> “blah’blah”

puts s.gsub(/’/, “\\’”)
blah’blah

On Thu, May 29, 2008 at 10:47 AM, J-H Johansen [email protected]
wrote:

irb(main):033:0* s = “blah’blah”
=> “blah’blah”
irb(main):034:0> s.gsub(/‘/, "\’“)
=> “blahblahblah”
irb(main):035:0> s.gsub(”‘", "\’")
=> “blahblahblah”

I’ve seen myself blind on this one …and I still can’t see where the
problem is. Any clues ?

This use of gsub processes the escaping of the sub string twice,
the Ruby one for the literal string and a second one for backreferences,
so you’ll need:

irb(main):001:0> a = “blah’blah”
irb(main):002:0> a.gsub(“'”, “\\'”)
=> “blah\'blah”
irb(main):005:0> a.gsub(“'”, “\\'”).length
=> 10

Hope this helps,

Jesus.

From: J-H Johansen [mailto:[email protected]]

irb(main):034:0> s.gsub(/‘/, "\’")

=> “blahblahblah”

faq. try the block form for less headache.

irb(main):006:0> s.gsub(/‘/){"\’"}
=> “blah\'blah”

kind regards -botp

On Thursday 29 May 2008 03:47:59 J-H Johansen wrote:

Hi,

I’m somewhat stumped at the moment due to problems with inserting
strings in postgresql containing 's.

Somewhat offtopic, but you shouldn’t have to do this. First hit off
Google for
Postgres Ruby bindings shows support for bind values, if not prepared
statements.

So, if you’re doing this:

questionable_string.gsub!(… #try to escape stuff
connection.query “INSERT INTO my_table (some_column) VALUES
(#{questionable_string})”

Do this instead:

connection.query ‘INSERT INTO my_table (some_column) VALUES (?)’,
questionable_string

That way, either a library will do the substitution for you, or there’s
actually going to be some protocol used in the communication with the
Postgres server which avoids parsing the bind values as code.

My own SQL may be a little rusty, but the concept is the same.

Thanks to Jesús, Mikael, botp and David for clearing up a few things.
I also realized that my version of ruby postgres lib could use an
update.

=)

Cheers

2008/5/29 Peña, Botp [email protected]:

From: J-H Johansen [mailto:[email protected]]

irb(main):034:0> s.gsub(/‘/, "\’")

=> “blahblahblah”

faq. try the block form for less headache.

… and less performance. Proper escaping is the way to go.

irb(main):006:0> s.gsub(/‘/){"\’"}
=> “blah\'blah”

But in this case I agree with David: bind variables are much better
(also safe against SQL injection).

Cheers

robert

From: Robert K. [mailto:[email protected]]

2008/5/29 Peña, Botp [email protected]:

> faq. try the block form for less headache.

… and less performance.

by how much?

Benchmark.realtime{9999.times{s.gsub(/‘/){"\’“}}}
#=> 0.139999866485596
Benchmark.realtime{9999.times{s.gsub(/'/,”\\'")}}
#=> 0.108999967575073

… Proper escaping is the way to go.

yeh, while losing flexibility/easeofuse :frowning: that’s what’s giving me the
headache (especially now that i have four eyes and my head’s getting
heavier. “Proper” is relative, pick one that works best for you :slight_smile:

puts “\\'”
\’
#=> nil
puts “\'”
'

i’d wish ruby2 would do

s.gsub(/‘/)("\’")
#=> “blah\'blah”

ie, backreferences and company should use the more flexible block form

s.gsub(/‘/){"\’"}
#=> “blah\'blah”

kind regards -botp

On 30.05.2008 03:16, Peña wrote:

Benchmark.realtime{9999.times{s.gsub(/’/,"\\’")}}
#=> 0.108999967575073

robert@fussel ~
$ ruby -r benchmark <<XXX

Benchmark.bmbm do |r|
r.report “block” do
1_000_000.times { “foobarbaz”.gsub(/o/){‘X’} }
end
r.report “str” do
1_000_000.times { “foobarbaz”.gsub(/o/,‘X’) }
end
end
XXX
Rehearsal -----------------------------------------
block 10.438000 0.000000 10.438000 ( 10.583000)
str 5.109000 0.000000 5.109000 ( 5.310000)
------------------------------- total: 15.547000sec

         user     system      total        real

block 10.578000 0.000000 10.578000 ( 10.816000)
str 5.219000 0.000000 5.219000 ( 5.326000)

robert@fussel ~
$

Similar results for 1.8 and 1.9.

… Proper escaping is the way to go.

yeh, while losing flexibility/easeofuse :frowning: that’s what’s giving me the headache (especially now that i have four eyes and my head’s getting heavier. “Proper” is relative, pick one that works best for you :slight_smile:

My reasoning is as follows: you need to use the block form if you need
to calculate the replacement string for every match. If the
replacement follows a particular pattern or is constant use the string
form.

Kind regards

robert