Unintuitive language feature (exclamation functions)

nicken · August 20, 2008, 9:48pm

I was surprised to discover that the code

astring.sub!(/hi/, 'bye')

behaves subtly differently from

astring = astring.sub(/hi/, 'bye')

Intuitively, to me, these should be identical. Perhaps the documentation
should make mention of this difference? A note about this unexpected
behavior would have saved me a lot of frustration, and would likely do
the same for many others new to Ruby.

To be honest, I’m still trying to find out exactly why these do
different things. The difference does not manifest itself with trivial
cases in irb; rather it shows up when I’m getting a string from cgi,
modifying it, then inserting it into a database. When using sub!, the
database ends up containing the pre-sub’d value of astring, even though
astring appears to contain the modified version when printed with a
debug statement immediately preceding my database insert.

I’m willing to accept the criticism that my intuition is perverse in
some way, but when I started writing in Ruby I was really hoping it
would be a language one could use without having to understand how the C
underneath it all worked (defeating part of the purpose of “high level”
languages).

So what do you think? Would warnings in the documentation on exclamation
functions be useful or pointless?

nicken · August 20, 2008, 10:00pm

On 20 August 2008 at 21:45, Nick B. wrote:

When using sub!, the
database ends up containing the pre-sub’d value of astring, even though
astring appears to contain the modified version when printed with a
debug statement immediately preceding my database insert.

Please provide some code to demonstrate this. I’m willing to bet
there’s another, subtler, step that’s misleading you.

Fred

nicken · August 20, 2008, 10:06pm

On Wed, Aug 20, 2008 at 3:45 PM, Nick B.
[email protected] wrote:

behavior would have saved me a lot of frustration, and would likely do
the same for many others new to Ruby.

If the two were identical, why would we have both sub and sub! methods?
The extra punctuation would be useless if it existed ‘just for fun’.

To be honest, I’m still trying to find out exactly why these do
different things. The difference does not manifest itself with trivial
cases in irb; rather it shows up when I’m getting a string from cgi,
modifying it, then inserting it into a database. When using sub!, the
database ends up containing the pre-sub’d value of astring, even though
astring appears to contain the modified version when printed with a
debug statement immediately preceding my database insert.

The documentation for String#sub! is:

"Performs the substitutions of String#sub in place, returning str, or
nil if no substitutions were performed. "

I’m willing to accept the criticism that my intuition is perverse in
some way, but when I started writing in Ruby I was really hoping it
would be a language one could use without having to understand how the C
underneath it all worked (defeating part of the purpose of “high level”
languages).

This has nothing to do with C. It has to do with interface design,
and is meant to make things more intuitive, not less.
Admittedly there is nothing inherently intuitive about some_method!,
except that it might make you feel like you should pay more attention,
like… Caution!

Once learned, this convention can be very helpful.

So what do you think? Would warnings in the documentation on exclamation
functions be useful or pointless?

Since exclamation points are a convention and are not enforced in any
way by Ruby itself, every ! method should come with its own
documentation.
A ! does not necessarily mean ‘modify the receiver in place’, so
further explanation is usually needed. Just remember that when you
see foo and foo!, the latter is the one that the developer of the
library you are using has flagged as requiring more attention, or as
being more specialized.
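
To illustrate the point that a ! need not mean in-place modification, here is a minimal sketch. The Note class and its save/save! pair are made up for this example and do not come from any library discussed in this thread. Neither method modifies the receiver; the bang version is simply the one that raises instead of returning false.

```ruby
# Hypothetical class, for illustration only.
class Note
  attr_reader :text

  def initialize(text)
    @text = text
  end

  def valid?
    !@text.strip.empty?
  end

  # Plain version: reports failure through the return value.
  def save
    valid?
  end

  # Bang version: same check, but raises instead of returning false.
  def save!
    raise ArgumentError, 'note is empty' unless save
    true
  end
end

p Note.new('hello').save    # true
p Note.new('   ').save      # false
begin
  Note.new('   ').save!
rescue ArgumentError => e
  puts "save! raised: #{e.message}"
end
```

Which of the two is the "dangerous" one is the library author's call, which is why each bang method needs its own documentation.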

If you’re still not convinced, I recommend checking out a post by
David Black on this topic, as it clearly explains the value of the
convention:
http://dablog.rubypal.com/2007/8/15/bang-methods-or-danger-will-rubyist

-greg

nicken · August 20, 2008, 10:22pm

On Wednesday 20 August 2008, Nick B. wrote:

behavior would have saved me a lot of frustration, and would likely do
I’m willing to accept the criticism that my intuition is perverse in
some way, but when I started writing in Ruby I was really hoping it
would be a language one could use without having to understand how the C
underneath it all worked (defeating part of the purpose of “high level”
languages).

So what do you think? Would warnings in the documentation on exclamation
functions be useful or pointless?

Unless I misunderstood you, you’re asking why two different methods
(String#sub and String#sub!) work differently. The answer is simple:
because they’re different. It’s like asking why String#upcase and
String#downcase work differently.

The documentation does speak of this difference:

ri String#sub gives:

------------------------------------------------------------- String#sub
str.sub(pattern, replacement) => new_str
str.sub(pattern) {|match| block } => new_str

 Returns a copy of _str_ with the _first_ occurrence of _pattern_
 replaced with either _replacement_ or the value of the block. [...]

while ri String#sub! gives:

------------------------------------------------------------ String#sub!
str.sub!(pattern, replacement) => str or nil
str.sub!(pattern) {|match| block } => str or nil

 Performs the substitutions of +String#sub+ in place, returning
 _str_, or +nil+ if no substitutions were performed.

You don’t need to know about the C implementation of class String, of
String#sub or of String#sub! to understand how these methods work. The
documentation says that sub returns a copy of the string with the
replacement done, which means a different object that has nothing to do
with the original. In the case of sub!, instead, the substitution is
done in place, that is, the receiver itself (str) is modified, not a
copy of it.

As for the claim that the difference doesn’t show up in irb, that is
not true. Look at this:

irb(main):001:0> str = "this is a test string"
=> "this is a test string"
irb(main):002:0> str1 = str.sub "h", "H"
=> "tHis is a test string"
irb(main):003:0> str
=> "this is a test string"

The above lines show that str is not changed by sub.

irb(main):004:0> str.sub "k", "K"
=> "this is a test string"
irb(main):005:0> str.sub! "k", "K"
=> nil

This shows the different behavior concerning the return value when
there’s nothing to replace: sub returns a copy of the string without
modifications, while sub! returns nil.
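
The nil return value matters most when the result of sub! is used directly. A minimal sketch, with two made-up helper names (greet_swap and risky_swap are hypothetical, not part of any library):

```ruby
# Hypothetical helper: replace the first "hi" and always return a String.
def greet_swap(text)
  copy = text.dup            # leave the caller's string untouched
  copy.sub!(/hi/, 'bye')     # returns nil when nothing matched
  copy                       # so return the string itself, not sub!'s result
end

# Chaining directly on sub! breaks when there is no match.
def risky_swap(text)
  text.dup.sub!(/hi/, 'bye').upcase   # NoMethodError if sub! returned nil
end

puts greet_swap('hi there')   # bye there
puts greet_swap('hello')      # hello
begin
  risky_swap('hello')
rescue NoMethodError => e
  puts "chaining failed: #{e.class}"
end
```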

irb(main):006:0> str.sub! "a", "A"
=> "this is A test string"
irb(main):007:0> str
=> "this is A test string"
irb(main):008:0>

Here you can see that sub!, unlike sub, changes the original string.

In short, here’s the difference between sub and sub!:

sub creates a new string with the same contents as the original but
independent of it, then replaces the pattern with the replacement text
in the copy. The original is not altered in any way. sub always returns
the copy, and you can tell whether a replacement was made by comparing
the original with the copy.

sub! performs the replacement on the string itself, thus changing it.
Obviously, you can’t compare the ‘new’ and the ‘original’ strings to
see whether a replacement was made (there is no ‘new’ string, and the
original has been changed), so you have to look at the return value: if
it is nil, nothing was changed; if it is the string itself, a
replacement was made.
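
The copy-versus-in-place distinction becomes visible as soon as a second variable refers to the same string. A small sketch (the method names are made up for the example):

```ruby
# Two variables that refer to the same String object.
def shared_after_bang
  original = String.new('hi there')
  other = original                   # second reference, same object
  original.sub!(/hi/, 'bye')         # mutates the one shared object
  [original, other, original.equal?(other)]
end

def shared_after_reassign
  original = String.new('hi there')
  other = original
  original = original.sub(/hi/, 'bye')   # rebinds the variable to a new object
  [original, other, original.equal?(other)]
end

p shared_after_bang       # ["bye there", "bye there", true]
p shared_after_reassign   # ["bye there", "hi there", false]
```

So astring.sub!(...) and astring = astring.sub(...) leave astring with the same contents, but only the first one is seen by everything else that holds a reference to the original object.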

I hope this helps

Stefano

nicken · August 20, 2008, 10:55pm

On Wednesday 20 August 2008, Nick B. wrote:

sql = “insert into example (aval) values (?)”
cgi session. You will get the output:

Inserting value a=bye into the database.
What was actually inserted into the database: hi

Since other responders seem to think I expect sub to behave the same as
sub!, I don’t. I expect str.sub! to modify str, and I expect str.sub to
return a modified copy of the str. This is not the same behavior.

If you had stated more clearly in your first post what you expected
and what you got instead, we wouldn’t have misunderstood your needs.
After all, the only (or, at least, main) difference between sub and
sub! is the one I spoke of in my other answer. However, I can’t try
your code, as I don’t have the sqlite gem/library. Would you please
post what you get using sub and what you get using sub!?

The line

puts "Inserting value a=#{a} into the database.\n"

displays the correct value (a=bye). If I understand you correctly, the
surprising behavior comes from inserting it in the database. Posting
what you get from the other puts will also enable those who don’t have
sqlite to help you.

(By the way, you don’t need to put the \n at the end of the string with
puts).

Stefano

nicken · August 20, 2008, 11:00pm

On 20 August 2008 at 22:25, Nick B. wrote:

F. Senault wrote:

Please provide some code to demonstrate this.

Don’t ask me why (yet), but change

a = cgi['a']

to

a = cgi['a'].dup

and…

22:47 fred@balvenie:~/> ruby test.rb
(offline mode: enter name=value pairs on standard input)
a=hi
Inserting value a=bye into the database.
What was actually inserted into the database: bye

…

It seems that CGI does horrible, horrible things to its strings:

require 'cgi'
cgi = CGI.new('html4')

a = cgi['a'] #.dup
b = cgi['a'].dup

puts "A :"
a.sub!(/hi/, 'bye')
puts a.to_s
puts a.inspect
puts a.class

puts "B :"
b.sub!(/hi/, 'bye')
puts b.to_s
puts b.inspect
puts b.class

Gives:

22:53 fred@balvenie:~> ruby test.rb
(offline mode: enter name=value pairs on standard input)
a=hi
A :
hi
"bye"
String
B :
bye
"bye"
String

Ugh!
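
One possible reason .dup changes the outcome, assuming the strings handed back by cgi[] carry per-object extensions: Object#dup does not copy an object's singleton class, whereas Object#clone does. The sketch below uses a made-up Tagged module as a stand-in; it is not the real library code.

```ruby
# Hypothetical stand-in for whatever a library mixes into its strings.
module Tagged
  def tag
    'extended'
  end
end

# Builds a String that has been extended on a per-object basis.
def extended_string
  s = String.new('hi')
  s.extend(Tagged)
  s
end

s = extended_string
p s.respond_to?(:tag)         # true
p s.clone.respond_to?(:tag)   # true: clone copies the singleton class
p s.dup.respond_to?(:tag)     # false: dup does not
p s.dup.class                 # String
```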

Fred

nicken · August 20, 2008, 11:01pm

I would agree with Stefano. It doesn’t look like an issue with sub and
sub! to me.
I ran into something similar with my webapp. For me, it was because I
didn’t call database.commit() after my update statement.

nicken · August 20, 2008, 10:28pm

F. Senault wrote:

Please provide some code to demonstrate this.

#!/usr/bin/env ruby

require 'sqlite3'
db = SQLite3::Database.new('test.sqlite')
db.execute('drop table if exists example') # clean up in case of multiple runs
db.execute('create table example (aval)')

require 'cgi'
cgi = CGI.new('html4')

a = cgi['a']

a.sub!(/hi/, 'bye')
# to see expected behavior, replace the above with: a = a.sub(/hi/, 'bye')

puts "Inserting value a=#{a} into the database.\n"
sql = "insert into example (aval) values (?)"
db.execute(sql, a)

sql = "select aval from example"
val = db.get_first_value(sql)
puts "What was actually inserted into the database: #{val}\n"

########---------- end of code

To run this, type “a=hi”[enter][ctrl-d] to simulate the behavior of a
cgi session. You will get the output:

Inserting value a=bye into the database.
What was actually inserted into the database: hi

Since other responders seem to think I expect sub to behave the same as
sub!, I don’t. I expect str.sub! to modify str, and I expect str.sub to
return a modified copy of the str. This is not the same behavior.

nicken · August 20, 2008, 11:04pm

I’m starting to wonder if this is actually a bug in Ruby? The
documentation of sub! says it should modify the string in place.

The code I posted does something different. After the a.sub!, executing
“puts #{a}” outputs the modified version of a, but inserting that exact
same string object into a database puts an UNMODIFIED version of the
string into the DB. It’s as if db.execute looks back in time to before
the sub! when it gets the value of a. Something unexplained is going on
here (unless the database module includes a time machine).

nicken · August 20, 2008, 11:03pm

On Wed, Aug 20, 2008 at 4:25 PM, Nick B.
[email protected] wrote:

F. Senault wrote:

Please provide some code to demonstrate this.

a = cgi[‘a’]

Internal to the CGI object, it appears that “a” in the @params hash is
an array of strings not a string:

irb(main):007:0> cgi = CGI.new('html4')
(offline mode: enter name=value pairs on standard input)
a=foohibyebar
=> #<CGI:0xb7c998cc @params={"a"=>["foohibyebar"]}, @multipart=false,
@output_cookies=nil, @output_hidden=nil, @cookies={}>

a.sub!(/hi/, 'bye')
# to see expected behavior, replace the above with: a = a.sub(/hi/, 'bye')

To run this, type “a=hi”[enter][ctrl-d] to simulate the behavior of a
cgi session. You will get the output:

Inserting value a=bye into the database.
What was actually inserted into the database: hi

cat z.rb
#!/usr/bin/env ruby

require 'rubygems'
require 'sqlite3'

db = SQLite3::Database.new('test.sqlite')
db.execute ('drop table if exists example') # clean up in case of multiple runs
db.execute('create table example (aval)')

require 'cgi'
cgi = CGI.new('html4')

a = cgi['a'][0]

a.sub!(/hi/, 'bye')
# to see expected behavior, replace the above with: a = a.sub(/hi/, 'bye')

puts "Inserting value a=#{a} into the database.\n"
sql = "insert into example (aval) values (?)"
db.execute(sql, a)

sql = "select aval from example"
val = db.get_first_value(sql)
puts "What was actually inserted into the database: #{val}\n"

ruby z.rb
z.rb:7: warning: don't put space before argument parentheses
(offline mode: enter name=value pairs on standard input)
a=foohibyebar
z.rb:13:CAUTION! cgi['key'] == cgi.params['key'][0]; if want Array,
use cgi.params['key']
Inserting value a=foobyebyebar into the database.
What was actually inserted into the database: foobyebyebar

nicken · August 20, 2008, 11:06pm

On Wed, Aug 20, 2008 at 5:03 PM, [email protected] wrote:

irb(main):007:0> cgi = CGI.new('html4')
(offline mode: enter name=value pairs on standard input)
a=foohibyebar
=> #<CGI:0xb7c998cc @params={"a"=>["foohibyebar"]}, @multipart=false,
@output_cookies=nil, @output_hidden=nil, @cookies={}>

ruby z.rb
z.rb:7: warning: don't put space before argument parentheses
(offline mode: enter name=value pairs on standard input)
a=foohibybar
a=zizzlesticks
z.rb:13:CAUTION! cgi['key'] == cgi.params['key'][0]; if want Array,
use cgi.params['key']
Inserting value a=foobyebybar into the database.
What was actually inserted into the database: foobyebybar

nicken · August 20, 2008, 11:36pm

So the problem doesn’t seem to be with sub! at all. It’s with cgi.

If I get the variable “a” using the cgi code above, then I create string
“b”:

a = cgi['a']
b = String.new

a.class
=> String
b.class
=> String

So they should have the same methods since they are the same class.
However, “a” seems to have extra methods.
a.first
=> "hi"
b.first
NoMethodError: undefined method `first' for "":String

It seems the cgi object is returning strings that aren’t really strings.
If that’s the case, isn’t it a bug that cgi[‘a’].class returns “String”
when it is really something else?

nicken · August 20, 2008, 11:39pm

On Wed, Aug 20, 2008 at 5:36 PM, [email protected] wrote:

p *bind_vars
bind_vars.flatten.each do |var|
p var

irb(main)> cgi = CGI.new('html4')
(offline mode: enter name=value pairs on standard input)
a=hi
=> #<CGI:0xb7c367b8 @params={"a"=>["hi"]}, @multipart=false,
@output_cookies=nil, @output_hidden=nil, @cookies={}>

irb(main)> a = cgi['a']
=> "hi"

irb(main)> a.sub!(/hi/, 'bye')
=> "bye"

irb(main)> a
=> "bye"

irb(main)> [a]
=> ["bye"]

irb(main)> [a].flatten
=> ["hi"]

nicken · August 20, 2008, 11:38pm

On Wed, Aug 20, 2008 at 5:01 PM, Nick B.
[email protected] wrote:

“puts #{a}” outputs the modified version of a, but inserting that exact
same string object into a database puts an UNMODIFIED version of the
string into the DB. It’s as if db.execute looks back in time to before
the sub! when it gets the value of a. Something unexplained is going on
here (unless the database module includes a time machine).

#!/usr/bin/env ruby

require 'rubygems'
require 'sqlite3'

# Reopen SQLite3::Statement to print what bind_params actually receives.
module SQLite3
  class Statement
    def bind_params( *bind_vars )
      index = 1
      p self.class, "bind_params()"
      p bind_vars
      p *bind_vars
      bind_vars.flatten.each do |var|
        p var
        if Hash === var
          var.each { |key, val| bind_param key, val }
        else
          bind_param index, var
          index += 1
        end
      end
    end
  end
end

db = SQLite3::Database.new('test.sqlite')
db.execute('drop table if exists example') # clean up in case of multiple runs
db.execute('create table example (aval)')

require 'cgi'
cgi = CGI.new('html4')

a = cgi['a']

a.sub!(/hi/, 'bye')
# to see expected behavior, replace the above with: a = a.sub(/hi/, 'bye')

puts "Inserting value a=#{a} into the database.\n"
sql = "insert into example (aval) values (?)"
db.execute(sql, a)

sql = "select aval from example"
val = db.get_first_value(sql)
puts "What was actually inserted into the database: #{val}\n"

(offline mode: enter name=value pairs on standard input)
a=hi
Inserting value a=bye into the database.
SQLite3::Statement
"bind_params()"
["bye"]
"bye"
"hi"
What was actually inserted into the database: hi

nicken · August 20, 2008, 11:40pm

On Wed, Aug 20, 2008 at 5:38 PM, [email protected] wrote:

=> #<CGI:0xb7c367b8 @params={“a”=>[“hi”]}, @multipart=false,

irb(main)> [a]
=> ["bye"]

irb(main)> [a].flatten
=> ["hi"]

irb(main):061:0> b = "hi"
=> "hi"
irb(main):062:0> b.sub!(/hi/, 'bye')
=> "bye"
irb(main):063:0> [b]
=> ["bye"]
irb(main):064:0> [b].flatten
=> ["bye"]

nicken · August 20, 2008, 11:54pm

And just for comparison:

require 'cgi'
cgi = CGI.new('html4')

a = cgi['a']

a.sub!(/hi/, 'bye')

p a
p [a]
p [a].flatten

ruby:
(offline mode: enter name=value pairs on standard input)
a=hi
"bye"
["bye"]
["hi"]

jruby:
(offline mode: enter name=value pairs on standard input)
a=hi
"bye"
["bye"]
["hi"]

oh, and also:
irb(main):001:0> a = "hi"
=> "hi"
irb(main):002:0> a.sub!(/hi/, "bye")
=> "bye"
irb(main):003:0> a
=> "bye"
irb(main):004:0> [a]
=> ["bye"]
irb(main):005:0> [a].flatten
=> ["bye"]

nicken · August 20, 2008, 11:47pm

On Wednesday 20 August 2008, Nick B. wrote:

b.class
It seems the cgi object is returning strings that aren’t really strings.
If that’s the case, isn’t it a bug that cgi[‘a’].class returns “String”
when it is really something else?

The strings returned by CGI#[] are extended with the
CGI::QueryExtension::Value module, an intentionally undocumented module
defined in cgi.rb that adds methods like to_ary, first and last, and
modifies others (like []). I don’t know whether this behavior is
documented anywhere else, since I’ve never used this library. This
fact, however, doesn’t explain (at least I don’t think so) the
weird behaviors which have been reported in this thread.
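
For what it is worth, a to_ary method on a string can be enough to produce the [a].flatten result shown elsewhere in this thread, because Array#flatten calls to_ary on any element that responds to it. The sketch below is a simulation under assumptions: FakeValue, remember and fake_lookup are made-up names, and the idea that the lookup returns an extended copy holding a reference to the original parameter array is a guess, not a reading of the real cgi.rb source.

```ruby
# Hypothetical stand-in for the extension module; NOT the real cgi.rb code.
module FakeValue
  def remember(params)
    @params = params
  end

  # Array#flatten calls to_ary on each element that responds to it.
  def to_ary
    @params
  end
end

# Mimics a lookup that hands back an extended copy of the stored string.
def fake_lookup(params)
  str = params[0].dup
  str.extend(FakeValue)
  str.remember(params)
  str
end

params = ['hi']
a = fake_lookup(params)
a.sub!(/hi/, 'bye')      # changes the copy, not the string inside params

p a                      # "bye"
p [a]                    # ["bye"]
p [a].flatten            # ["hi"]
p [a.dup].flatten        # ["bye"]
```

Under these assumptions sub! works exactly as documented. The stale value comes back only when something asks the string to behave like an array.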

Stefano

nicken · August 21, 2008, 12:47am

On Wed, Aug 20, 2008 at 5:33 PM, Nick B.
[email protected] wrote:

b.class
=> String

So they should have the same methods since they are the same class.

a = "foo"
=> "foo"
def a.definitely_not
  "The same"
end
=> nil
a.definitely_not
=> "The same"
a.class
=> String

b = "bar"
=> "bar"
b.definitely_not
NoMethodError: undefined method `definitely_not' for "bar":String
from (irb):7
from :0
b.class
=> String

-greg

nicken · August 21, 2008, 12:31am

Hi –

On Thu, 21 Aug 2008, Nick B. wrote:

b.class
=> String

So they should have the same methods since they are the same class.

Not necessarily. Objects do what they do. Classes are mainly a way to
launch objects into object-space, after which they may or may not
continue to behave the way they did when they were first created.
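
A quick way to see what an individual object has gained beyond its class is to ask it for its singleton methods. A minimal sketch; the Extras module and the per_object_methods helper are made up for the example:

```ruby
# Hypothetical module standing in for a library's per-object extension.
module Extras
  def first_value
    self
  end
end

# Lists the methods an individual object has on top of its class.
def per_object_methods(obj)
  obj.singleton_methods.sort
end

plain = String.new('hi')
extended = String.new('hi').extend(Extras)

p plain.class == extended.class               # true
p per_object_methods(plain)                   # []
p per_object_methods(extended)                # [:first_value]
p extended.singleton_class.include?(Extras)   # true
```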

However, “a” seems to have extra methods.
a.first
=> "hi"
b.first
NoMethodError: undefined method `first' for "":String

It seems the cgi object is returning strings that aren’t really strings.
If that’s the case, isn’t it a bug that cgi[‘a’].class returns “String”
when it is really something else?

No, as long as the API of the object is correctly documented.

David

nicken · August 21, 2008, 6:20am

Nick B. wrote:

behavior would have saved me a lot of frustration, and would likely do
I’m willing to accept the criticism that my intuition is perverse in
some way, but when I started writing in Ruby I was really hoping it
would be a language one could use without having to understand how the C
underneath it all worked (defeating part of the purpose of “high level”
languages).

So what do you think? Would warnings in the documentation on exclamation
functions be useful or pointless?

This works. Doing it this way returns the actual string, not some
string extended by undocumented modules that’s actually an array in
disguise or something like that.

require 'cgi'

cgi = CGI.new('html4')
a = cgi.params['a'][0]

a.sub!(/hi/, 'bye')
puts a
