Ruby << is ambiguous

Hi, I’m working on syntax highlighting for ED4W and have a problem with <<
because it has three uses.

  1. << operator
  2. << HERE Doc
  3. class << Name - Singleton Class

The problem is determining whether << starts a Here Doc. It is easy to handle
class<<Name in this regard, but I can’t see how you can work out << in
the following:

Ared = "mary"
angy = "bill"
angy <<Ared # does this start a Here Doc or is it operator <<
print <<Ared # as above.

Thanks.


Neville F., http://www.getsoft.com http://www.surfulater.com

At 21:50 on Sunday, 28 January 2007, Neville F. wrote:

Ared = “mary”
angy = “bill”
angy <<Ared # does this start a Here Doc or is it operator <<
print <<Ared # as above.

Some editors (kate and vim, for example) assume that whitespace follows
the << in the operator case:

a=[]
s="abc"
#this gets highlighted as a HERE Doc
a<<s

#this gets highlighted as an operator
a<< s

Stefano
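
The whitespace rule described above could be sketched in Ruby roughly as follows. This is only an illustration of the heuristic, not the actual rule set of kate or vim; the method name classify_shift and the symbols it returns are made up for this example.

```ruby
# Hypothetical helper: classify the first "<<" on a line using only the
# whitespace heuristic. It knows nothing about local variables or methods,
# so it can disagree with the Ruby interpreter.
def classify_shift(line)
  index = line.index("<<")
  return :none if index.nil?

  following = line[index + 2]
  # Nothing, or whitespace, after "<<" => treat it as the operator.
  return :operator if following.nil? || following =~ /\s/

  # Anything else directly after "<<" => treat it as a here-doc start.
  :heredoc
end

classify_shift("a<<s")   # classified as a here-doc start
classify_shift("a<< s")  # classified as an operator
```

Note that the first call is classified as a here-doc even though Ruby itself would treat it as an append when a is a local variable, which is exactly the weakness of the heuristic.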

Stefano,
Thanks, but that isn’t how the Ruby interpreter works, so this is really
a hack that may or may not work with real code.

I can just do the same thing, but if possible I’d like a ‘correct’
solution.

I’ve been looking at all of the Ruby Editors on Windows and am surprised
at how poorly they handle Ruby syntax. I would never release a product
like these.

Neville F. wrote:

Ared = “mary”
angy = “bill”
angy <<Ared # does this start a Here Doc or is it operator <<
print <<Ared # as above.

Oops! I think Ruby itself has some problems with it:

irb(main):001:0> Ared = "mary"
=> "mary"
irb(main):002:0> angy = "bill"
=> "bill"
irb(main):003:0> angy <<Ared
irb(main):004:0" Ared
=> "mary"
irb(main):005:0> angy
=> "billmary"
irb(main):006:0> angy << Ared
=> "billmarymary"
irb(main):007:0> angy <<EOT
irb(main):008:0" xxxxx
irb(main):009:0" EOT
NameError: uninitialized constant EOT
from (irb):7
irb(main):010:0> angy
=> "billmarymary"
irb(main):011:0> angy << EOT
NameError: uninitialized constant EOT
from (irb):11

Wolfgang Nádasi-Donner

On Jan 28, 2007, at 3:06 PM, Neville F. wrote:

Thanks, but that isn’t how the Ruby interpreter works, so this is really
a hack that may or may not work with real code.

You’re definitely going to need heuristics like this when syntax
highlighting Ruby source code. Only ruby can read Ruby, to borrow
the Perl expression. ;)

James Edward G. II

Yukihiro M. wrote:

|irb(main):007:0> angy <<EOT
|irb(main):008:0" xxxxx
|irb(main):009:0" EOT
|NameError: uninitialized constant EOT
| from (irb):7

Ruby knows angy is a local variable, so the interpreter considers it
more likely to be a shift operator than a here-doc. If you want to
disambiguate, use parentheses.

I made an error there. “angy” followed by a String doesn’t make any
sense. It works very well if written in a sensible way.

irb(main):001:0> angy = "abc"
=> "abc"
irb(main):002:0> angy <<<<EOT
irb(main):003:0" ddd
irb(main):004:0" EOT
=> "abcddd\n"

But… there is one thing I really don’t understand in the first example. Ruby
recognizes “EOT” as an uninitialized constant because “<<” is interpreted as
a shift operator. But why is an open here-doc string recognized in line
008:0" in the example?

Wolfgang Nádasi-Donner

From: “Wolfgang Nádasi-Donner”

|irb(main):007:0> angy <<EOT
|irb(main):008:0" xxxxx
|irb(main):009:0" EOT
|NameError: uninitialized constant EOT
| from (irb):7

But… there is one thing I really don’t understand in the first example. Ruby recognizes “EOT” as an uninitialized constant,
because “<<” is interpreted as a shift operator. But why is an open here-doc string recognized in line 008:0" in the example?

Because the error happened on line 7? :)

Regards,

Bill

Hi,

In message “Re: Ruby << is ambiguous”
on Mon, 29 Jan 2007 06:10:11 +0900, Wolfgang Nádasi-Donner writes:

|Ooops! - I Think Ruby itself has some problems with it:

Perhaps Ruby is smarter than you expected.

|irb(main):007:0> angy <<EOT
|irb(main):008:0" xxxxx
|irb(main):009:0" EOT
|NameError: uninitialized constant EOT
| from (irb):7

Ruby knows angy is a local variable, so the interpreter considers it
more likely to be a shift operator than a here-doc. If you want to
disambiguate, use parentheses.

          matz.
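
A minimal sketch of the parentheses advice, assuming the goal is to pass a here-doc as the argument. The method shout is hypothetical and exists only for this example; String.new is used so the string stays mutable regardless of frozen-string settings.

```ruby
angy = String.new("bill")

# Parentheses around the argument make it explicit that a here-doc is
# being appended, even though angy is a local variable.
angy << (<<EOT)
mary
EOT
# angy is now "billmary\n"

# Hypothetical method, defined only to show the call form.
def shout(text)
  text.upcase
end

# With explicit call parentheses, <<EOT can only be a here-doc argument.
loud = shout(<<EOT)
hello
EOT
# loud is now "HELLO\n"
```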

Hi –

On Mon, 29 Jan 2007, Wolfgang Nádasi-Donner wrote:

But… there is one thing I really don’t understand in the first example.
Ruby recognizes “EOT” as an uninitialized constant, because “<<” is
interpreted as a shift operator. But why is an open here-doc string recognized
in line 008:0" in the example?

I’m not quite sure either, but it seems to interpret <<EOT as the
start of a heredoc, and then later realize that it couldn’t have been.
Compare what happens if there’s a space before EOT:

irb(main):012:0> a = 1
=> 1
irb(main):013:0> a << EOT
NameError: uninitialized constant EOT
from (irb):13

David

Thanks to everyone for their replies. It looks like the only practical
way for me to handle this is to assume << is an operator if a space follows
it, and a Here Doc otherwise.

Fortunately this is the only issue I’ve come across so far re. IDE
syntax highlighting and I’m a fair way down the track.


Neville F., http://www.getsoft.com http://www.surfulater.com

Neville F. writes:

Ared = “mary”
angy = “bill”
angy <<Ared # does this start a Here Doc or is it operator <<
print <<Ared # as above.

Thanks.

Your example is incomplete.

angy <<Ared # is an operator
print <<Ared # is a HERE doc…
some string
Ared # …because it has an end marker

That’s how the Emacs ruby-mode seems to do it.

Steve

Neville F. wrote:


Steve, in that case Emacs ruby-mode is incorrect.

angy <<Ared # is an operator
some string
Ared

gives:

C:\ruby\bin\ruby.exe D:\Ed32\BrowserTest\RubyHereDoc3.rb
D:/Ed32/BrowserTest/RubyHereDoc3.rb:3: undefined method `angy' for main:Object (NoMethodError)

To get this to run, a method angy must be defined:

def angy var
end

So the only way to determine whether << starts a Here Doc is to know whether
the preceding token is a method. And an editor can’t know that.

PS. Also see my comment here:

http://seclib.blogspot.com/2005/11/more-on-leftshift-and-heredoc.html


Neville F., http://www.getsoft.com http://www.surfulater.com

Hi,

In message “Re: Ruby << is ambiguous”
on Tue, 30 Jan 2007 08:30:17 +0900, Neville F. writes:

|Steve, in that case Emacs ruby-mode is incorrect.

Maybe Ruby is too smart for this issue, since it’s difficult for
editors to distinguish here-docs from shift operators without serious
parsing.

          matz.
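
To illustrate what “serious parsing” means here, the sketch below asks Ruby’s own lexer how it reads <<. It assumes the Ripper library, which ships with Ruby 1.9 and later and so postdates the Ruby version used in this thread; the helper name shift_events is made up for this example.

```ruby
require "ripper"

# Hypothetical helper: return [event, token] pairs for every token that is
# either the text "<<" or the start of a here-doc.
def shift_events(source)
  Ripper.lex(source)
        .map { |elem| [elem[1], elem[2]] }
        .select { |event, token| event == :on_heredoc_beg || token == "<<" }
end

# angy is assigned first, so the lexer knows it is a local variable.
with_local = "angy = 'bill'\nangy <<Ared\n"

# angy is never assigned, so it is read as a method call with an argument.
with_method = "angy <<Ared\nsome string\nAred\n"

p shift_events(with_local)
p shift_events(with_method)
```

The first call is expected to report an operator token and the second a here-doc start, even though the line containing << is identical in both sources. An editor that only looks at the current line cannot tell them apart.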

On Mon, Jan 29, 2007 at 09:55:15AM +0900, [email protected] wrote:

Compare what happens if there’s a space before EOT:

irb(main):012:0> a = 1
=> 1
irb(main):013:0> a << EOT
NameError: uninitialized constant EOT
from (irb):13

I suspect part of the problem is that irb lexes your input to determine
the continuation prompt (and indeed whether it should allow multiple lines of
input before evaluating). IOW, irb has the same problem our poor text
editors do :)

Matz, that’s precisely the conclusion I’ve come to. You shouldn’t make
it so hard for us folks that write editors. :)


Neville F., http://www.getsoft.com http://www.surfulater.com

There is a similar problem with “?” and “?:”.

irb(main):001:0> ?2
=> 50
irb(main):002:0> sep = "x" * (true?2:1)
SyntaxError: compile error
(irb):2: parse error, unexpected ':', expecting ')'
sep = "x" * (true?2:1)
^
from (irb):2
irb(main):003:0> sep = "x" * (true ?2:1)
=> "xx"

Wolfgang Nádasi-Donner

On Jan 28, 9:01 pm, Neville F. wrote:

Thanks to everyone for their replies. It looks like the only practical
way for me to handle this is to assume << is an operator if a space follows
it, and a Here Doc otherwise.

FWIW, here’s how I implemented it in the ruby mode for jEdit. It
recognizes the following as a here document:

  1. The << characters, optionally followed by a - character, followed
    by printable characters enclosed in single or double quotes. E.g.
    <<'hello'
    <<-'thingy67%'
    <<"foobar $"
    <<-"boofar @"

  2. The << characters, optionally followed by a - character, followed
    by letters and/or underscores. E.g.
    <<Hello_there
    <<-Howdy

Looking at this now, that second case should probably have been
letters and numbers and underscores, but not starting with a number,
i.e. a valid identifier.
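
The two rules above, with the second one tightened to a valid identifier as suggested, might be written as regular expressions along these lines. This is a sketch of the described rules, not the actual jEdit mode definition; the method name heredoc_start? is made up for this example.

```ruby
# Hypothetical helper mirroring the two rules described above.
def heredoc_start?(text)
  # Rule 1: "<<", optional "-", then a quoted terminator.
  quoted = /<<-?(?:'[^'\n]+'|"[^"\n]+")/
  # Rule 2: "<<", optional "-", then an identifier that does not start
  # with a digit.
  identifier = /<<-?[A-Za-z_][A-Za-z0-9_]*/

  text.match?(quoted) || text.match?(identifier)
end

heredoc_start?("<<'hello'")      # matches rule 1
heredoc_start?("<<-\"boofar @\"") # matches rule 1
heredoc_start?("<<-Howdy")       # matches rule 2
heredoc_start?("a << s")         # matches neither rule
```

Like the whitespace heuristic, these patterns still treat a<<s as a here-doc start, so they remain an approximation of what the interpreter does.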

From: Wolfgang Nádasi-Donner:

There is a similar problem with “?” and “?:”.

irb(main):001:0> ?2
=> 50

k

irb(main):002:0> sep = "x" * (true?2:1)
SyntaxError: compile error
(irb):2: parse error, unexpected ':', expecting ')'
sep = "x" * (true?2:1)
^
from (irb):2

Careful: “?” is a valid character in identifiers and is also used for the
inline if. A silly example:

irb(main):002:0> def true?
irb(main):003:1> true
irb(main):004:1> end
=> nil
irb(main):005:0> true?
=> true
irb(main):006:0> true? 2
ArgumentError: wrong number of arguments (1 for 0)
from (irb):6:in `true?'
from (irb):6
irb(main):007:0> true?2
ArgumentError: wrong number of arguments (1 for 0)
from (irb):7:in `true?'
from (irb):7
irb(main):008:0> true?2:1
SyntaxError: compile error
(irb):8: parse error, unexpected ':', expecting $
true?2:1
^
from (irb):8
irb(main):009:0> true?2
ArgumentError: wrong number of arguments (1 for 0)
from (irb):9:in `true?'
from (irb):9
irb(main):011:0> true? 2:1
SyntaxError: compile error
(irb):11: parse error, unexpected ':', expecting $
true? 2:1
^
from (irb):11
irb(main):012:0> true? ? 2:1
=> 2
irb(main):013:0> true ? 2:1
=> 2

irb(main):003:0> sep = "x" * (true ?2:1)
=> "xx"

k.

kind regards -botp

Peña wrote:

There is a similar problem with “?” and “?:”.

careful. “?” is a valid char for identifiers and is also used for inline if.
stupid example,

This means, it is even more complex to write a code highlighter without
analysing semantics.

Wolfgang Nádasi-Donner
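
The three roles of “?” discussed above can be put side by side in a short sketch. The method ready? is hypothetical and defined only for this example. One difference from the irb sessions in this thread: since Ruby 1.9 the character literal ?2 yields a one-character string rather than the character code 50.

```ruby
# Hypothetical predicate method; "?" is part of its name.
def ready?
  true
end

# "?" as the last character of a method name, followed by the ternary "?".
with_method = ready? ? 2 : 1

# "?" as the ternary operator after the keyword true.
with_keyword = true ? 2 : 1

# "?" introducing a character literal.
char_literal = ?2
```

Writing the ternary with spaces around both “?” and “:” avoids the parse errors shown in the transcripts and also gives a highlighter an easier pattern to recognise.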