Strings combine

Shouldn’t the following be a syntax error?

' 3' '4'
=> " 34"

?
-rp

On Feb 18, 2010, at 12:54 PM, Roger P. wrote:

Shouldn’t the following be a syntax error?

' 3' '4'
=> " 34"

Rarely seen, but string literal “implicit” concatenation has always been
a feature.

Gary W.

On Thu, Feb 18, 2010 at 11:34 AM, Gary W. wrote:

Rarely seen, but string literal “implicit” concatenation has always been a
feature.

I remember at one point I tried Python-style triple quotes, all “cool,
Ruby supports that too!”

Except it doesn’t… :confused:
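A minimal sketch of why triple quotes seem to work at first: Ruby appears to read the three quote characters as an empty literal followed by the start of another literal, and adjacent literals are joined. The method names `triple_quoted` and `heredoc_with_quotes` are made up for this illustration.

```ruby
# Ruby has no triple-quote syntax. The expression below is read as three
# adjacent literals: "" then "cool" then "", which are joined together.
def triple_quoted
  """cool"""
end

# A heredoc is the usual way to write a literal that contains double
# quotes; an embedded quote inside the form above would end a literal early.
def heredoc_with_quotes
  <<~TEXT
    She said "cool"
  TEXT
end

puts triple_quoted.inspect
puts heredoc_with_quotes.inspect
```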

Shouldn’t the following be a syntax error?

' 3' '4'
=> " 34"

Rarely seen, but string literal “implicit” concatenation has always been
a feature.

Yeah–for me it has almost always represented a bug (like a missing ,
between parameters or what not).
Ahh well.
-rp
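The missing-comma case can be sketched as below. `full_name` and `call_without_comma` are hypothetical helpers, not anything from the thread; the point is only that the two literals merge into one argument, so the slip surfaces as an ArgumentError at call time rather than as a syntax error.

```ruby
# Hypothetical two-argument method used only for illustration.
def full_name(first, last)
  "#{first} #{last}"
end

# With the comma, two arguments are passed.
puts full_name('Roger', 'P.')

# Without it, the two literals merge into the single argument 'RogerP.',
# so the call fails with an ArgumentError instead of a syntax error.
def call_without_comma
  full_name('Roger' 'P.')
rescue ArgumentError => e
  e.class
end

p call_without_comma
```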

This doesn’t work if you assign the strings to variables though:

'hi ' 'there'
=> "hi there"

hi = 'hi '
there = 'there'
hi there
NoMethodError: undefined method `hi' for main:Object
from (irb):9
from /opt/local/bin/irb:12:in

Which is a little counterintuitive.
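For strings held in variables, a sketch of the usual alternatives, next to what the adjacent form actually does. The method names are illustrative only.

```ruby
# Joining strings held in variables needs an explicit operation.
def join_variables
  hi = 'hi '
  there = 'there'
  [hi + there, "#{hi}#{there}"]
end

# Placing the variables side by side is read as the method call hi(there),
# so it fails at runtime because no method named hi exists.
def adjacent_variables
  hi = 'hi '
  there = 'there'
  hi there
rescue NoMethodError => e
  e.name
end

p join_variables
p adjacent_variables
```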

On 02/18/2010 11:32 PM, Roger P. wrote:

Ahh well.
I can’t remember having used it but it can be useful if you want to
create a longer string and do not want to use a here document (because
of indentation issues for example).

Kind regards

robert
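A small sketch of that use: a longer string split over several source lines with a trailing backslash, so the code can be indented freely without the indentation ending up in the string. `usage_message` and its text are invented for the example.

```ruby
# Each line is its own literal; the trailing backslash continues the
# expression, and the adjacent literals are joined into one string.
# The indentation before each literal is not part of the result.
def usage_message
  "Usage: tool [options] FILE\n" \
    "  -v  print more detail\n" \
    "  -h  show this help\n"
end

puts usage_message
```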

On Fri, Feb 19, 2010 at 10:07 PM, Robert K. wrote:

from (irb):9
from /opt/local/bin/irb:12:in `’

Which is a little counter intuitive.

Not for me. The concatenation is done at parse time. Your
“whitespace concatenation” is done at runtime.

it works as long as each piece is a quoted string literal, even an
interpolated one. try it like this:

x="robert"
=> "robert"
y="raul"
=> "raul"
def m
  "botp"
end
=> nil
"#{x}" \
?> "#{y}" "#{m}"
=> "robertraulbotp"

i think i remember a use case for this when i tried modifying a source
without using an editor, …but i still have to search for that script
yet --if it’s still on my newer disks…

best regards -botp

2010/2/19 Raul J.:

from /opt/local/bin/irb:12:in `’

Which is a little counter intuitive.

Not for me. The concatenation is done at parse time. Your
“whitespace concatenation” is done at runtime.
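One way to look at this, assuming the reference implementation (MRI), is to dump the compiled instructions. `RubyVM::InstructionSequence` is MRI-specific, and `bytecode_for` is a helper name made up here; the exact instruction names can differ between Ruby versions.

```ruby
# RubyVM::InstructionSequence exists only in MRI, so this is a sketch for
# that implementation rather than something every Ruby will run.
def bytecode_for(source)
  RubyVM::InstructionSequence.compile(source).disasm
end

literals  = bytecode_for(%q{'hi ' 'there'})
variables = bytecode_for(%q{hi = 'hi '; there = 'there'; hi + there})

# Adjacent literals are already a single string in the compiled code.
puts literals.include?('"hi there"')

# Variables are joined by a + call that only happens when the code runs.
puts variables.include?('opt_plus')
```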

Kind regards

robert

Robert K. wrote:

2010/2/19 Raul J.:

from /opt/local/bin/irb:12:in `’

Which is a little counter intuitive.

Not for me. The concatenation is done at parse time. Your
“whitespace concatenation” is done at runtime.

Kind regards

robert

I guess I find it counterintuitive to have such different behaviors at
parse time vs. run time. If it isn’t supposed to be a behavior of
strings that you can stick one next to another and have them
concatenate, then my brain has a hard time understanding why the parser
should treat them as though they do have that behavior?

Robert@babelfish ~
$

There is no way the parser can disambiguate concatenation of strings
referenced through variables and method invocations whereas a
concatenation of string literals is easily detectable.

Apart from that, I believe it is quite common in programming languages
to allow concatenation of string literals although that feature might be
rarely used.

Kind regards

robert

I think you misunderstand what I was saying. I was not asking what the
differences between the parser and the runtime are. I was asking for a
rationale. If Matz believed that strings, when placed together without
any punctuation between them should be concatenated, why didn’t he make
it standard operating behavior? I can’t imagine it would have been so
difficult for when the parser sees two variables in a row to treat it as
equivalent to a concatenation call on the first variable with the second
variable as an argument. If Matz didn’t think that two strings should
be concatenated, why program that behavior into the parser?

It seems to me like the rationale behind not having a standard method
call (like concatenation) for two variables appearing next to each other
is that you feel there is some value in having the parser register that
syntax as an error (as opposed to having the runtime display a no method
error). But if you feel that two things next to each other should
trigger alarm bells, why not also with two strings.

["one" "two", "three"]

seems like it should really throw an error to me. If I really wanted
["onetwo", "three"] I would have written "onetwo" as a single string.

I have a really hard time coming up with cases where writing two strings
next to each other in code is preferable to writing a single longer
string. The single string uses fewer characters, even. I have a
remarkably easy time coming up with cases where someone forgets to type
a comma.
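The forgotten-comma case in an array can be sketched like this; `intended` and `comma_dropped` are illustrative names. Nothing fails here, the array simply ends up one element short.

```ruby
# What was meant: three elements.
def intended
  ["one", "two", "three"]
end

# The same line with one comma dropped: the first two literals merge.
def comma_dropped
  ["one" "two", "three"]
end

p intended
p comma_dropped
```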

On 19.02.2010 17:40, Raul J. wrote:

I guess I find it counterintuitive to have such different behaviors at
parse time vs. run time.

Why? Parsing and executing are two fundamentally different things.

If it isn’t supposed to be a behavior of
strings that you can stick one next to another and have them
concatenate, then my brain has a hard time understanding why the parser
should treat them as though they do have that behavior?

At least because of ambiguity: this line is a valid method invocation:

Robert@babelfish ~
$ ruby19 -ce 'foo bar'
Syntax OK

Robert@babelfish ~
$ ruby19 -e 'foo bar'
-e:1:in `<main>': undefined local variable or method `bar' for
main:Object (NameError)

Robert@babelfish ~
$

There is no way the parser can disambiguate concatenation of strings
referenced through variables and method invocations whereas a
concatenation of string literals is easily detectable.

Apart from that, I believe it is quite common in programming languages
to allow concatenation of string literals although that feature might be
rarely used.

Kind regards

robert

I think you misunderstand what I was saying. I was not asking what the
differences between the parser and the runtime are. I was asking for a
rationale. If Matz believed that strings, when placed together without
any punctuation between them should be concatenated, why didn’t he make
it standard operating behavior? I can’t imagine it would have been so
difficult for when the parser sees two variables in a row to treat it as
equivalent to a concatenation call on the first variable with the second
variable as an argument. If Matz didn’t think that two strings should
be concatenated, why program that behavior into the parser?

Yeah it ends up being pretty hard to generate a parser that will do
both.
-rp

Roger P. wrote:

I think you misunderstand what I was saying. I was not asking what the
differences between the parser and the runtime are. I was asking for a
rationale. If Matz believed that strings, when placed together without
any punctuation between them should be concatenated, why didn’t he make
it standard operating behavior? I can’t imagine it would have been so
difficult for when the parser sees two variables in a row to treat it as
equivalent to a concatenation call on the first variable with the second
variable as an argument. If Matz didn’t think that two strings should
be concatenated, why program that behavior into the parser?

Yeah it ends up being pretty hard to generate a parser that will do
both.
-rp

I’m not sure why, though. The parser is able to turn

x + y

into

x.+(y)

What would be so hard about having the parser turn:

x y

into

x.+(y)

?

Or if you only think strings should work that way, and not numbers:

x.concat(y)
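For reference, a sketch of the two existing spellings mentioned here. `x + y` is an ordinary method call on x, and `concat` differs from `+` in that it changes the receiver. The method names are invented for the example.

```ruby
# x + y is sugar for calling the method named + on x.
def plus_forms
  x = 'foo'
  y = 'bar'
  [x + y, x.+(y), x.public_send(:+, y)]
end

# concat appends to the receiver itself and returns that same object.
def concat_mutates
  x = 'foo'.dup   # dup so this also works when string literals are frozen
  y = 'bar'
  result = x.concat(y)
  [result, x, result.equal?(x)]
end

p plus_forms
p concat_mutates
```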

On 19 February 2010, Raul J. wrote:

["one" "two", "three"]

seems like it should really throw an error to me.

+1, it should at least raise a warning. I never saw any use case of
this.
Better to probably remove it, no?

Regards,
B.D.

On 19.02.2010 21:24, Justin C. wrote:

variable as an argument. If Matz didn’t think that two strings should

x y

already means

x(y)

Exactly my point.

robert

Raul J. wrote:

be concatenated, why program that behavior into the parser?
into
x.+(y)

?

Or if you only think strings should work that way, and not numbers:

x.concat(y)

Because

x y

already means

x(y)

-Justin

Justin C. wrote:

Raul J. wrote:

be concatenated, why program that behavior into the parser?
into
x.+(y)

?

Or if you only think strings should work that way, and not numbers:

x.concat(y)

Because

x y

already means

x(y)

-Justin

But the parser knows what’s a method and what’s a variable, right? (I’m
asking. I assume so but I really don’t know.) If it knows what is a
variable and what is a method, it can define x y as x(y) when x is a
method, and x y as x.+(y) when x is a variable. Right?

I’m not trying to prove a point, or anything. I’m just trying to
understand if there is a rationale behind this decision that I’m not
seeing.

On Sat, Feb 20, 2010 at 09:59:54AM +0900, Rick DeNatale wrote:

Second, variables in Ruby are untyped references to objects, and not
objects themselves.

I think this is really the key here. With this code, we have two
literal objects:

'foo' 'bar'

With this code, however, we have references to objects:

a = 'foo'
b = 'bar'
a b

. . . which would be why they behave differently in terms of the
principles of the Ruby language’s design.

That . . . or it’s an accident of implementation. I guess the question
then is whether the fact that references to string objects behave
differently from literal string objects in this case is the result of a
conscious decision or just an emergent property of the implementation.

On Fri, Feb 19, 2010 at 5:02 PM, Raul J. wrote:

But the parser knows what’s a method and what’s a variable, right? (I’m
asking. I assume so but I really don’t know.) If it knows what is a
variable and what is a method, it can define x y as x(y) when x is a
method, and x y as x.+(y) when x is a variable. Right?

No.

First -

In the expression

x y

It sees x as a method because of the form of the expression. Just as
it does for the expressions

x(y)

or

self.x y

and if we have:

a = 'x'
b = 'y'

a b

is the same as:

self.a b

not

'x' b

def a(arg)
  "My arg is #{arg}"
end

a = 'string 1'
b = 'string 2'

a b # => "My arg is string 2"

Second, variables in Ruby are untyped references to objects, and not
objects themselves.

So what would the compiler do with

def two_args(x, y)
  x y
end

two_args('a', 'b')

two_args(1, 2)
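As things stand, a sketch of what that method does today: the body is read as the call x(y) regardless of what the arguments refer to, so both calls fail the same way. The rescue clause is added here only to make the outcome visible.

```ruby
# The body is parsed as the method call x(y) before any argument values
# exist, so strings and integers behave identically.
def two_args(x, y)
  x y
rescue NoMethodError => e
  "no method named #{e.name}"
end

p two_args('a', 'b')
p two_args(1, 2)
```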

Third, the interpretation of

'a' 'b'

as 'ab' is not really a parser thing, it’s a lexical analyser thing.

It’s just a rarely used form of string literal which allows for quoted
strings separated by whitespace (or escaped newlines) to be coalesced
before the parser even sees them.


Rick DeNatale


Rick DeNatale wrote:

In the expression

x y

It sees x as a method because of the form of the expression.

And to prove this:

irb(main):001:0> x y
NameError: undefined local variable or method `y' for main:Object
from (irb):1

That is: ruby has already decided from parsing that x could only be a
method name. Actually x is neither a method nor a local variable right
now, but when parsing it doesn’t know this. So it’s parsed as x(y), and
the first thing it errors on when trying to execute that expression is
that y is undefined, as you evaluate all the args before performing the
method call.

irb(main):002:0> y = nil
=> nil
irb(main):003:0> x y
NoMethodError: undefined method `x' for main:Object
from (irb):3
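The two irb steps above can be reproduced in a script roughly as follows. `frobnicate` and `widget` are placeholder names assumed not to be defined anywhere.

```ruby
# Neither name is defined: the argument is evaluated first, so the error
# reported is about widget, not about frobnicate.
def undefined_arg_first
  frobnicate widget
rescue NameError => e
  [e.class, e.name]
end

# Once the argument is a local variable, evaluation gets as far as the
# call itself, and the missing method is what gets reported.
def undefined_method_second
  widget = nil
  frobnicate widget
rescue NameError => e   # NoMethodError is a subclass of NameError
  [e.class, e.name]
end

p undefined_arg_first
p undefined_method_second
```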

So:

x y is unambiguously saying x is a method, y is arg
self.x is unambiguously saying that x is a method
x() is unambiguously saying that x is a method

The only time where ruby has to probe further is an expression like
this:

x

Then it looks to see if there has been a previous expression of the form
x = … in the current scope. If so, x is a local variable. If not, x is
a method.
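A sketch of that rule for a bare identifier, using a throwaway method named `x` and a hypothetical `bare_identifier_demo` wrapper:

```ruby
def x
  'method'
end

def bare_identifier_demo
  before = x     # no assignment to x seen yet in this scope: method call
  x = 'local'
  after = x      # an assignment came earlier: local variable
  forced = x()   # parentheses always mean a method call
  [before, after, forced]
end

p bare_identifier_demo
```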

So the current rules let us know unambiguously that

puts "hello"

is always a method call, without having to think any further. Even the
following oddball example works in a fairly understandable way:

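puts = "abc"    # puts is now also a local variable
puts "hello"    # still a method call: prints hello
puts puts       # method call with the local variable as argument: prints abc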

If you decided to change the parsing just to allow concatenation of
adjacent items it would become horrendously complicated to read ruby
source. For example, the line

puts "hello"

might be a method call, or it might be concatenation, depending on
whether puts was a local variable or not. That means you couldn’t
understand any source without continually looking back at the previous
lines in the method.

In any case, just look at the C language for an example where
concatenating adjacent string literals is implemented, but it would
make no sense at all to concatenate two adjacent expressions, because
the language doesn’t even have a string concatenation operator.

char *c0 = "abc" "def"; // 7 bytes, abcdef\0

I suspect this is where Ruby inherited this feature from.