Ruby Whitespace Semantics


#1

Can someone please explain the semantics behind the following:

irb(main):001:0> a = ( 4 + 5 )
=> 9
irb(main):002:0> a = ( 4
irb(main):003:1> + 5 )
=> 5
irb(main):004:0> a = ( 4 +
irb(main):005:1* 5 )
=> 9

The first and last statements make sense to me, but why is the second
one returning 5?

I find semantics like this troubling, and no documentation sheds light
as to what would cause this behavior.

Thanks,
Almann


#2

Almann G. wrote:

The first and last statements make sense to me, but why is the second one
returning 5?

I find semantics like this troubling, and no documentation sheds light as
to what would cause this behavior.

I understand your concern. Let me try to clarify.

Expressions in Ruby can stand alone as statements. A statement is
terminated by a semicolon or by a newline. If a statement is
incomplete at the end of a line, it is understood to go on to the next
line; if it is complete, the newline acts just like a semicolon.

Therefore:

a = (4
+5)

is the same as

a = (4;
+5)

or even

a = (4; +5)

That is, it evaluates a “4” and then evaluates a “+5” (which then is the
resultant value, as it was the last evaluated).

But with

a = (4+
5)

the parser can see that the expression is not complete, so it treats
the next line as a continuation.
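
For anyone who wants to try this outside irb, here is a small sketch of
the three cases side by side. The variable names a, b and c are only
for illustration.

```ruby
# The newline after a complete expression ends the statement, so the
# parentheses hold two statements: `4`, then `+5` (a unary plus).
a = (4
+5)

# A trailing binary operator leaves the expression incomplete, so the
# parser keeps reading on the next line.
b = (4 +
5)

# The explicit-semicolon spelling of the first case.
c = (4; +5)

puts a  # 5
puts b  # 9
puts c  # 5
```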


#3

Almann G. wrote:

I’m no expert, but I think it has to do with both the first and third
having the + operator on the first line and the second one having the +
operator on the second line. Note also that irb understands that the
third case is a continuation of the previous line (the * in the
prompt), but the second case is treated almost as 2 separate statements
(no * in the prompt).

I’m sure someone else will have a better idea
Kev


#4

The first and last statements make sense to me, but why is the second one
returning 5?

I find semantics like this troubling, and no documentation sheds light as
to what would cause this behavior.

This behaviour is documented, at least here:
http://www.rubycentral.com/book/language.html

and probably in other places too (but I wouldn’t know about them, as I
have been learning Ruby for just over a week).

Hope it helps,
Alex


#5

Thanks, the semantics are clearer now, even though I think they are
very clumsy.

As an aside, it is interesting that Ruby allows suites of statements to
be used in a grouping context (the parentheses). Most other languages
that are whitespace sensitive don’t allow this, since it gets confusing
with expressions (as my example shows).

-Almann


#6

Quoting Almann G.:

The first and last statements make sense to me, but why is the
second one returning 5?

a = ( 4 + 5 )
a = ( 4 ; + 5 )
a = ( 4 + 5 )

A line break begins a new statement unless there is a dangling
binary operator or line continuation.
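
A quick sketch of that rule outside parentheses, in case it helps. The
names sum, ok and first are made up for the example.

```ruby
# Each line ends with a dangling binary operator, so the statement
# carries on to the next line.
sum = 1 +
  2 +
  3

ok = sum > 5 &&
  sum < 10

# Here the first line is already complete, so the line break starts a
# new statement; `+2` is evaluated on its own and thrown away.
first = 1
+2

puts sum    # 6
puts ok     # true
puts first  # 1
```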

-mental


#7

This behavior actually isn’t well documented: neither that
documentation nor the Ruby Manual makes clear what will happen when the
grouping operator is used in this manner.

In a nutshell, a newline delimits statements except when it doesn’t
(when the line trails with a binary operator, for instance). Not a
really good semantic, but I’ll add that to the list of gotchas I need
to deal with in Ruby.

The 1.4 English Ruby Manual says:
Each expression are delimited by semicolons(;) or newlines.

The current Japanese Ruby Manual essentially says the same thing (my
Japanese isn’t as sharp as it used to be):
式と式の間はセミコロン(;)または改行で区切ります
(shiki to shiki no aida wa semikoron(;) mata wa kaigyou de kugirimasu)

-Almann


#8

It is generally good coding style in any programming language, when
continuing an expression across more than one line, to split the lines
after an operator or other punctuation symbol which indicates that the
expression is unfinished. I.e.

somefunction( ...very long argument list,
                     more arguments)

or

(4 +
5)

This gives the reader a visual hint that the expression is incomplete,
and continues onto the next line. Ruby just uses the same heuristic
that a human reader would use.
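
As a rough illustration of the argument-list case, here is a runnable
version. The method somefunction and its body are hypothetical, chosen
only so there is something to call.

```ruby
# Hypothetical helper: adds up whatever arguments it receives.
def somefunction(*args)
  args.sum
end

# The first line ends with a comma inside an open argument list, so
# both the reader and the parser know more arguments follow.
result = somefunction(1, 2,
                      3, 4)

puts result  # 10
```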

~Avdi


#9

On 2/27/06, Mark W. wrote:

is the same as

a = (4;
+5)

But (4 is clearly an incomplete statement.

Actually, notice the semicolon. What Hal is demonstrating is that a
parenthetical group can contain multiple expressions. The newline
terminates the expression ‘4’, which is a complete expression, while
the statement (which happens to include that expression) is continued
on the next line.

Jacob F.


#10

“Hal F.” wrote:

is the same as

a = (4;
+5)

But (4 is clearly an incomplete statement.


#11

You’re not alone. I’ve adopted this practice in C programs ever since I
read about it in a book called Human Factors and Typography for More
Readable Programs, by Ronald M. Baecker and Aaron Marcus, ACM Press,
1990. The authors recommend breaking long lines at operators “of
relatively low precedence” and placing the operator at the beginning of
the second line because “it emphasizes at the beginning of the
continuation that the second line is a continuation.”


#12

Quoting Anthony DeRobertis:

Well, when I work in languages other than Ruby (not in Ruby, of
course, because it’s a syntax error), I’ve always found:

foo = a + b + c …
+ z

more readable myself, but I’m probably just a weirdo.

It works fine in Ruby if you use a backslash to continue the
statement:

foo = a + b + c \
+ z
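
Here is a self-contained version of the backslash form that can be
pasted into a file. The values assigned to a, b, c and z are made up
for the example.

```ruby
a, b, c, z = 1, 2, 3, 4

# The backslash must be the very last character on the line; it tells
# the parser that the statement continues, so the operator can lead
# the next line.
foo = a + b + c \
  + z

puts foo  # 10
```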

-mental


#13

Ah. I thought a semicolon terminated a statement.


#14

Avdi G. wrote:

(4 +
5)

This gives the reader a visual hint that the expression is incomplete,
and continues onto the next line. Ruby just uses the same heuristic
that a human reader would use.

Well, when I work in languages other than Ruby (not in Ruby, of course,
because it’s a syntax error), I’ve always found:

foo = a + b + c …
+ z

more readable myself, but I’m probably just a weirdo.


#15

It does. 4 is a valid statement in Ruby. Ruby doesn’t have the same
sharp division as many other languages. 4 is a statement, 4 + 5 is a
statement, and +5 is a statement. The return value of a series of
statements is the value of the final statement. As a result, the value
of ‘4; +5’ is 5.
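
A short sketch of the same point in a few other contexts, since the
rule is not specific to parentheses. The method name last_value and the
variables x and y are only illustrative.

```ruby
# A method body is a series of statements; the method returns the
# value of the final one.
def last_value
  4
  +5
end

# A begin/end block behaves the same way.
x = begin
  4
  +5
end

# So do parentheses.
y = (4; +5)

puts last_value  # 5
puts x           # 5
puts y           # 5
```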


#16

Quoting an earlier message:

Ah. I thought a semicolon terminated a statement.

It does; it’s just that the current Ruby grammar permits multiple
statements within the same set of parentheses (separated by
semicolons or newlines).

-mental


#17

On Tue, Feb 28, 2006 at 04:21:02AM +0900, Anthony DeRobertis wrote:

(…)
Well, when I work in languages other than Ruby (not in Ruby, of course,
because it’s a syntax error), I’ve always found:

foo = a + b + c …
+ z

If this is M(atlab) then the three dots serve to signal
that the line is continued, i.e. serving the same purpose
as does <nl> in ruby. I.e. you’re explicitly stating already
that the expression continues on next line (just as you would
by adding a <nl> or ).

-m


#18

Quoting Timothy G.:

It does. 4 is a valid statement in Ruby. Ruby doesn’t have the
same sharp division as many other languages.

Well… not really. There’s the same distinction in Ruby too (see
expr versus compstmt in parse.y).

It’s more that Ruby’s grammar permits statement lists in places where
many other languages only permit expressions (like in between
parentheses).
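
For instance, something like the following is accepted. This is only a
sketch, and the names area, width, height, label and n are invented for
it.

```ruby
# Several statements inside one pair of parentheses, separated by
# semicolons; the group's value is that of the last statement.
area = (width = 3; height = 4; width * height)

# The same thing with a newline as the separator.
label = (n = 7
         n.odd? ? "odd" : "even")

puts area   # 12
puts label  # odd
```

The assignments inside the parentheses are ordinary ones, so width,
height and n are still defined afterwards.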

-mental


#19

Martin S. Weber wrote:

On Tue, Feb 28, 2006 at 04:21:02AM +0900, Anthony DeRobertis wrote:

foo = a + b + c …
+ z

If this is M(atlab) then the three dots serve to signal
that the line is continued, i.e. serving the same purpose
as does <nl> in ruby.

No, that’s pseudo-code and the three dots are an ellipsis indicating
elided material. Yeah, sure, it should have been U+2026 (…), but I’m
lazy.


#20

On Wed, Mar 01, 2006 at 12:18:20AM +0900, Anthony DeRobertis wrote:

No, that’s pseudo-code and the three dots are an ellipsis indicating
elided material. Yeah, sure, it should have been U+2026 (…), but I’m
lazy.

*snore* Sorry :)

-Martin