Ruby Whitespace Semantics

Almann G. (Guest)
on 2006-02-27 09:45
(Received via mailing list)
Can someone please explain the semantics behind the following:

irb(main):001:0> a = ( 4 + 5 )
=> 9
irb(main):002:0> a = ( 4
irb(main):003:1>       + 5 )
=> 5
irb(main):004:0> a = ( 4 +
irb(main):005:1*       5 )
=> 9

The first and last statements make sense to me, but why is the second
one returning 5?

I find semantics like this troubling, and no documentation sheds light
as to what would cause this behavior.

Thanks,
Almann
Hal F. (Guest)
on 2006-02-27 09:55
(Received via mailing list)
Almann G. wrote:
>
> The first and last statements make sense to me, but why is the second one
> returning 5?
>
> I find semantics like this troubling, and no documentation sheds light as
> to what would cause this behavior.

I understand your concern. Let me try to clarify.

Expressions in Ruby can be like standalone statements. Statements are
terminated with an optional semicolon or with a newline. If a statement
is incomplete, it is understood to go on to the next line; if it is
complete, it is just as if terminated with a semicolon.

Therefore:

   a = (4
        +5)

is the same as

   a = (4;
        +5)

or even

   a = (4; +5)

That is, it evaluates a "4" and then evaluates a "+5" (which then is the
resultant value, as it was the last evaluated).

But with

   a = (4+
        5)

the parser is able to see that the expression is not complete, and so
treats it as continued on the next line.
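
The three cases above can be sketched side by side. This is only an illustration, assuming a current Ruby interpreter; the variable names b and c are not from the original post:

```ruby
# Operator at the start of the second line: the newline ends the
# complete expression `4`, so the parentheses hold two statements.
a = (4
     + 5)

# Operator at the end of the first line: `4 +` is incomplete, so the
# parser keeps reading onto the next line.
b = (4 +
     5)

# The explicit form of the first case, with a semicolon.
c = (4; +5)

puts a  # 5
puts b  # 9
puts c  # 5
```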
Kev J. (Guest)
on 2006-02-27 09:55
(Received via mailing list)
I'm no expert, but I think it has to do with both first and third having
the + operator on the first line and the second one having the +
operator on the second line - note also that irb understands that the
third case is a continuation of the previous line (*), but the second
case is treated almost as 2 separate statements - no (*).

I'm sure someone else will have a better idea
Kev
Almann G. (Guest)
on 2006-02-27 10:19
(Received via mailing list)
Thanks, the semantic is clearer now even though I think it is very clumsy.

As an aside, it is interesting that Ruby allows for suites of
statements to be used in a grouping context (the parentheses)--most
other languages that are whitespace sensitive don't allow this since it
gets confusing with expressions (as my example shows).

-Almann
Alexandru E. Ungur (Guest)
on 2006-02-27 11:26
(Received via mailing list)
> => 9
>
> The first and last statements make sense to me, but why is the second one
> returning 5?
>
> I find semantics like this troubling, and no documentation sheds light as
> to what would cause this behavior.
This behaviour is documented, at least here:
http://www.rubycentral.com/book/language.html

and probably in other places too (but I wouldn't know about them, as I
have been learning Ruby for just a little more than a week).


Hope it helps,
Alex
Almann G. (Guest)
on 2006-02-27 17:25
(Received via mailing list)
This behavior actually isn't well documented: neither that documentation
nor the Ruby Manual makes clear what will happen when the grouping
operator is used in this manner.

In a nutshell, newline delimits statements except when it doesn't
(trail with a binary operator for instance)... not a really good
semantic but I'll add that to the list of gotchas I need to deal with
in Ruby.

The 1.4 English Ruby Manual says:
Each expression are delimited by semicolons(;) or newlines.

The current Japanese Ruby Manual essentially says the same thing (my
Japanese isn't as sharp as it used to be):
式と式の間はセミコロン(;)または改行で区切ります
(shiki to shiki no aida wa semikoron(;)  mata wa kaigyou de kugirimasu)

-Almann
unknown (Guest)
on 2006-02-27 18:42
(Received via mailing list)
Quoting Almann G. <removed_email_address@domain.invalid>:

>
> The first and last statements make sense to me, but why is the
> second one returning 5?

a = ( 4 + 5 )
a = ( 4 ; + 5 )
a = ( 4 + 5 )

A line break begins a new statement unless there is a dangling
binary operator or line continuation.

-mental
Avdi G. (Guest)
on 2006-02-27 19:16
(Received via mailing list)
It is generally good coding style in any programming language, when
continuing an expression across more than one line, to split the lines
after an operator or other punctuation symbol which indicates that the
expression is unfinished.  I.e.

    somefunction( ...very long argument list,
                         more arguments)

or

   (4 +
    5)

This gives the reader a visual hint that the expression is incomplete,
and continues onto the next line.  Ruby just uses the same heuristic
that a human reader would use.

~Avdi
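
A small sketch of both styles in Ruby. The helper sum_all is hypothetical and exists only for this illustration:

```ruby
# Hypothetical helper, defined only so the example is runnable.
def sum_all(*args)
  args.inject(0) { |acc, n| acc + n }
end

# A trailing comma tells the parser the argument list is unfinished.
total = sum_all(1, 2,
                3, 4)

# A trailing binary operator does the same for an expression.
nine = (4 +
        5)

puts total  # 10
puts nine   # 9
```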
Mark W. (Guest)
on 2006-02-27 21:01
(Received via mailing list)
"Hal F." <removed_email_address@domain.invalid> wrote in message
news:removed_email_address@domain.invalid...
>
> is the same as
>
>   a = (4;
>        +5)

But (4 is clearly an incomplete statement.
Jacob F. (Guest)
on 2006-02-27 21:04
(Received via mailing list)
On 2/27/06, Mark W. <removed_email_address@domain.invalid> wrote:
> >
> > is the same as
> >
> >   a = (4;
> >        +5)
>
> But (4 is clearly an incomplete statement.

Actually, notice the semicolon. What Hal is demonstrating is that a
parenthetical group can contain multiple expressions. The newline
terminates the expression '4', which is a complete expression, while
the statement (which happens to include that expression) is continued
on the next line.

Jacob F.
Anthony DeRobertis (Guest)
on 2006-02-27 21:22
(Received via mailing list)
Avdi G. wrote:

>    (4 +
>     5)
>
> This gives the reader a visual hint that the expression is incomplete,
> and
> continues onto the next line.  Ruby just uses the same heuristic that
> a human reader would use.

Well, when I work in languages other than Ruby (not in ruby, of course,
because it's a syntax error), I've always found:

foo = a + b + c ...
    + z

more readable myself, but I'm probably just a weirdo.
(Guest)
on 2006-02-27 21:50
(Received via mailing list)
You're not alone. I've adopted this practice in C programs ever since I
read about it in a book called _Human Factors and Typography for More
Readable Programs_, by Ronald M. Baecker and Aaron Marcus, ACM Press,
1990. The authors recommend breaking long lines at operators "of
relatively low precedence" and placing the operator at the beginning of
the second line because "it emphasizes at the beginning of the
continuation that the second line is a continuation."
unknown (Guest)
on 2006-02-27 21:56
(Received via mailing list)
Quoting Anthony DeRobertis <removed_email_address@domain.invalid>:

> Well, when I work in languages other than Ruby (not in ruby, of
> course, because it's a syntax error), I've always found:
>
> foo = a + b + c ...
>     + z
>
> more readable myself, but I'm probably just a weirdo.

It works fine in Ruby if you use a backslash to continue the
statement:

 foo = a + b + c \
     + z

-mental
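
A minimal sketch of the difference, assuming illustrative values for a, b, c and z that are not from the thread:

```ruby
a, b, c, z = 1, 2, 3, 4  # illustrative values only

# A backslash immediately before the newline joins the two lines
# into one statement.
foo = a + b + c \
    + z

# Without it, `+ z` is a separate statement whose value is discarded.
bar = a + b + c
    + z

puts foo  # 10
puts bar  # 6
```

Note that nothing, not even a comment or trailing space, may follow the backslash on its line.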
unknown (Guest)
on 2006-02-27 23:50
(Received via mailing list)
Ah. I thought a semicolon terminated a statement.
unknown (Guest)
on 2006-02-28 00:02
(Received via mailing list)
Quoting removed_email_address@domain.invalid:

> Ah. I thought a semicolon terminated a statement.

It does; it's just that the current Ruby grammar permits multiple
statements within the same set of parentheses (separated by
semicolons or newlines).

-mental
Timothy G. (Guest)
on 2006-02-28 00:14
(Received via mailing list)
It does. 4 is a valid statement in Ruby. Ruby doesn't have the same
sharp division as many other languages. 4 is a statement, 4 + 5 is a
statement, and +5 is a statement. The return value of a series of
statements is the value of the final statement. As a result, the value
of '4; +5' is 5.
Martin S. Weber (Guest)
on 2006-02-28 00:33
(Received via mailing list)
On Tue, Feb 28, 2006 at 04:21:02AM +0900, Anthony DeRobertis wrote:
> (...)
> Well, when I work in languages other than Ruby (not in ruby, of course,
> because it's a syntax error), I've always found:
>
> foo = a + b + c ...
>     + z

If this is M(atlab) then the three dots serve to signal
that the line is continued, i.e. serving the same purpose
as does \<nl> in ruby. I.e. you're explicitly stating already
that the expression continues on the next line (just as you would
by adding a \<nl> or <operator><nl>).

-m
unknown (Guest)
on 2006-02-28 00:45
(Received via mailing list)
Quoting Timothy G. <removed_email_address@domain.invalid>:

> It does. 4 is a valid statement in Ruby. Ruby doesn't have the
> same sharp division as many other languages.

Well... not really.  There's the same distinction in Ruby too (see
expr versus compstmt in parse.y).

It's more that Ruby's grammar permits statement lists in places where
many other languages only permit expressions (like in between
parentheses).

-mental
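
As a rough illustration of a statement list inside parentheses (the variable names here are made up for the example):

```ruby
# Parentheses may hold a list of statements; the last one supplies
# the value of the whole group.
area = (width = 4; height = 5; width * height)

# Newlines separate the statements just as semicolons do.
label = (prefix = "n="
         prefix + area.to_s)

puts area   # 20
puts label  # n=20
```

Variables assigned inside the parentheses, such as width, remain visible afterwards, since the parentheses do not open a new scope.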
Mark W. (Guest)
on 2006-02-28 05:36
(Received via mailing list)
<removed_email_address@domain.invalid> wrote in message
news:removed_email_address@domain.invalid...
Quoting removed_email_address@domain.invalid:

>> Ah. I thought a semicolon terminated a statement.

>It does; it's just that the current Ruby grammar permits multiple
>statements within the same set of parentheses (separated by
>semicolons or newlines).

I guess a = ( 4 ; + 5 ) is not so different from C's a = ( 4 , + 5 )

Live and learn! :)
Anthony DeRobertis (Guest)
on 2006-02-28 17:20
(Received via mailing list)
Martin S. Weber wrote:

> On Tue, Feb 28, 2006 at 04:21:02AM +0900, Anthony DeRobertis wrote:

>> foo = a + b + c ...
>>     + z
>
> If this is M(atlab) then the three dots serve to signal
> that the line is continued, i.e. serving the same purpose
> as does \<nl> in ruby.

No, that's pseudo-code and the three dots are an ellipsis indicating
elided material. Yeah, sure, it should have been U+2026 (…) -- but I'm
lazy.
Martin S. Weber (Guest)
on 2006-02-28 18:07
(Received via mailing list)
On Wed, Mar 01, 2006 at 12:18:20AM +0900, Anthony DeRobertis wrote:
>
> No, that's pseudo-code and the three dots are an ellipsis indicating
> elided material. Yeah, sure, it should have been U+2026 (…) -- but I'm
> lazy.

*snore* Sorry :)

-Martin