Forum: Ruby-core Improving 'syntax error, unexpected $end, expecting kEND'?

Hugh Sasse (Guest)
on 2007-10-18 20:02
(Received via mailing list)
I've had a look at this, but can't see how to do it: When I get
  syntax error, unexpected $end, expecting kEND
I know from experience that $end means "end of file" and kEND means
the lexical token "end".  What I can't figure out from reading
parse.c is how the stack works and how the rules are stored. I think
knowing if it was expecting the 'end' from a 'class', 'def', 'begin',
'while', 'for', 'do', 'if', 'unless' or 'case' would help me narrow
down where I've failed to have a closing "end".  If there's line
number information that would be even better.  Indentation has not
solved this for me.
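
The kind of report asked for here can be approximated outside the parser. What follows is a minimal sketch, assuming a crude line-based heuristic rather than anything parse.c actually does. The names `OPENERS` and `unclosed_openers` are hypothetical, and the scan deliberately ignores strings, heredocs and one-line definitions such as `def foo; end`.

```ruby
# Hypothetical helper, not part of Ruby: keep a stack of block openers with
# their line numbers and report whatever is still open at end of file.
OPENERS = /\A\s*(class|module|def|begin|while|until|for|if|unless|case)\b|\bdo\s*(\|[^|]*\|)?\s*\z/

def unclosed_openers(source)
  stack = []
  source.each_line.with_index(1) do |line, lineno|
    code = line.sub(/#.*/, "") # crude comment strip; also hits '#' inside strings
    if (m = OPENERS.match(code))
      # m[1] is nil when the line ended in a block-opening `do`
      stack.push([m[1] || "do", lineno])
    elsif code =~ /\A\s*end\b/
      stack.pop
    end
  end
  stack
end

sample = <<~RUBY
  class Foo
  def foo
  while true
  end
  return 42
  end
  1+2
RUBY

p unclosed_openers(sample) # => [["class", 1]]
```

Because keywords are only recognised at the start of a line, modifier forms such as `x = 1 if y` are not counted as openers, which is the main reason the heuristic stays usable on ordinary code.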

So is this too difficult, which is why it hasn't happened already?

Meanwhile I'll just break up my program into smaller bits.

        Thank you,
        Hugh
Paul Brannan (cout)
on 2007-10-22 23:21
(Received via mailing list)
On Fri, Oct 19, 2007 at 03:01:55AM +0900, Hugh Sasse wrote:
> I've had a look at this, but can't see how to do it: When I get
>   syntax error, unexpected $end, expecting kEND
> I know from experience that $end means "end of file" and kEND means
> the lexical token "end".  What I can't figure out from reading

IMO this is a pretty cryptic message.  It gets easier to diagnose the
more you see the message, but I'd rather see a message like:

  Syntax error: unexpected end of file while looking for matching 'end'

> parse.c is how the stack works and how the rules are stored. I think
> knowing if it was expecting the 'end' from a 'class', 'def', 'begin'
> 'while', 'for', 'do', 'if', 'unless' or 'case' would help me narrow
> down where I've failed to have a closing "end".  If there's line
> number information that would be even better.  Indentation has not
> solved this for me.

Indentation may not solve the problem, but it does help.

When all else fails I comment out sections of code until the code
compiles; when that happens, I know the offending code is probably in
the code block I most recently commented out.

I don't think that specifying which element was missing the end
necessarily helps track down the problem.  Consider:

  class Foo
  def foo
  while true
  end
  return 42
  end
  1+2

With your proposed change I might expect to see something like:

  Syntax error: unexpected end of file; expecting matching 'end' for 'class Foo'

but blindly putting an 'end' after bar() might be wrong; the user might
have meant either:

  class Foo
    def foo
      while true
      end
      return 42
    end
    bar()
  end

or:

  class Foo
    def foo
      while true
      end
      return 42
    end
  end
  bar()

but it's impossible to know without probing the user's brain (though
analyzing indentation could possibly help).
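
As a rough illustration of how indentation analysis could help, here is a minimal sketch. It assumes space-indented code in which every `end` is aligned with its opener. The helper name `suspects_by_indentation` is hypothetical, and this is a heuristic, not something the Ruby parser does.

```ruby
# Hypothetical helper: when an `end` is indented less than the innermost open
# block, assume that block (and any deeper ones) never received its own `end`.
def suspects_by_indentation(source)
  stack = []    # entries are [keyword, line number, indent width]
  suspects = []
  source.each_line.with_index(1) do |line, lineno|
    indent = line[/\A */].size # leading spaces only; tabs are not handled
    case line
    when /\A\s*(class|module|def|begin|while|until|for|if|unless|case)\b/
      stack.push([$1, lineno, indent])
    when /\A\s*end\b/
      suspects << stack.pop while stack.last && stack.last[2] > indent
      stack.pop
    end
  end
  suspects + stack # blocks still open at end of file are suspects too
end

sample = <<~RUBY
  class Foo
    def foo
      while true
        return 42
    end
  end
RUBY

p suspects_by_indentation(sample) # => [["while", 3, 4]]
```

Without the indentation check, the same input would blame `class Foo`, because the two `end` lines would be consumed by `while` and `def`.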

Paul
Michal Suchanek (Guest)
on 2007-10-23 13:06
(Received via mailing list)
On 22/10/2007, Paul Brannan <pbrannan@atdesk.com> wrote:

>       return 42
>     end
>   end
>   bar()
>
> but it's impossible to know without probing the user's brain (though
> analyzing indentation could possibly help).
>

Yes, but the fact that you have an unclosed class Foo hints quite a bit
at where the error might be. If you have some testing code below the
class you can be quite sure that the missing end is inside the class
definition. If you have multiple classes you would get a nested class
warning (or not), which should help you tell which one is broken. Even
then, saying which class is not closed is better than not giving a
warning about nested classes - not everybody knows that.
Of course, the error might be in quite surprising places. But stating
what the top unclosed element is certainly makes it easier to
diagnose.

Thanks

Michal
Rick Denatale (rdenatale)
on 2007-10-23 14:42
(Received via mailing list)
On 10/22/07, Paul Brannan <pbrannan@atdesk.com> wrote:

>
>       end
>       return 42
>     end
>   end
>   bar()
>
> but it's impossible to know without probing the user's brain (though
> analyzing indentation could possibly help).

Way back in my college days (I'm amazed that my senility hasn't
advanced to the point where I can't remember that far back<g>) I took
a programming course which used a slightly simplified academic version
of PL/1 from Cornell called PL/C.

This was back in the day where you submitted batch jobs, and got a
printout to study and debug.

The PL/C compiler tried as hard as it could to correct syntax errors
to get the "most" out of each run.

It would put out messages like:

 20:   A line with a syntax error
          SYNTAX ERROR ON LINE 20 ....
         PL/C USES:
         pl/c's guess of what you meant.

More often than not, IIRC, this produced more amusing cascading errors
than a real solution.


--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/
Hugh Sasse (Guest)
on 2007-10-23 17:31
(Received via mailing list)
On Tue, 23 Oct 2007, Rick DeNatale wrote:

> On 10/22/07, Paul Brannan <pbrannan@atdesk.com> wrote:
>
> > I don't think that specifying which element was missing the end
> > necessarily helps track down the problem.  Consider:
> >
   [class Foo;def foo;while true;end;return 42;end;1+2]
> >
> > With your proposed change I might expect to see something like:
> >
> >   Syntax error: unexpected end of file; expecting matching 'end' for 'class Foo'
> >
> > but blindly putting an 'end' after bar() might be wrong; the user might
> > have meant either:

Agreed. Doing anything without consideration will cause problems.  But
at least it tells the user where the interpreter began to get in knots.
        [examples trimmed]
> > but it's impossible to know without probing the user's brain (though
> > analyzing indentation could possibly help).
>
> Way back in my college days ([...]) I took
> a programming course which used a slightly simplified academic version
> of PL/1 from Cornell called PL/C.
        [...]
> The PL/C compiler tried as hard as it could to correct syntax errors
> to get the "most" out of each run.
        [...]
> More often that not, IIRC this produced more amusing cascading errors
> than a real solution.

Agreed it is a less than perfect solution, but in terms of figuring
out what the interpreter is doing, it is more diagnostic info than
we get now.  We know that the open and closing statements are
balanced between the statement after that given and end of file.

        Hugh
David Flanagan (Guest)
on 2007-10-23 21:15
(Received via mailing list)
The patch below changes this message to:

syntax error, unexpected "end-of-file", expecting "end"

It doesn't tell you where the outermost open block begins, but at least
the message isn't so cryptic.  I don't know any way to get rid of the
quotes around "end-of-file".

  David

Index: parse.y
===================================================================
--- parse.y  (revision 13760)
+++ parse.y  (working copy)
@@ -596,6 +596,7 @@
  /*%
  %token <val>
  %*/
+        end_of_file 0 "end-of-file"
    keyword_class
    keyword_module
    keyword_def
@@ -603,7 +604,7 @@
    keyword_begin
    keyword_rescue
    keyword_ensure
-  keyword_end
+  keyword_end "end"
    keyword_if
    keyword_unless
    keyword_then
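
To make the effect of the patch concrete, here is a minimal Ruby sketch of the renaming it produces. This only post-processes a message string. The real change happens inside the bison-generated parser, and `TOKEN_ALIASES` and `friendlier` are hypothetical names used for illustration.

```ruby
# Hypothetical table mirroring the two aliases introduced by the patch above.
TOKEN_ALIASES = {
  "$end" => '"end-of-file"',
  "kEND" => '"end"',
}.freeze

# Replace the old internal token names with the aliases.
def friendlier(message)
  message.gsub(/\$end|kEND/) { |token| TOKEN_ALIASES.fetch(token) }
end

puts friendlier("syntax error, unexpected $end, expecting kEND")
# prints: syntax error, unexpected "end-of-file", expecting "end"
```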
Martin Duerst (Guest)
on 2007-10-24 08:58
(Received via mailing list)
At 04:15 07/10/24, David Flanagan wrote:
>The patch below changes this message to:
>
>syntax error, unexpected "end-of-file", expecting "end"
>
>It doesn't tell you where the outermost open block begins, but at least the message isn't
>so cryptic.  I don't know any way to get rid of the quotes around "end-of-file".

This looks extremely helpful, in particular for (relative) beginners.
Looking at the patch below, it seems to be easy to improve things for
a few other cases, too. I just tried to fill in the blanks; maybe
some of this stuff doesn't make sense, but I have included a patch
below.

As for personal experiences, when I get an error of this kind,
I often add an "end" at some arbitrary place and see what happens.
Then I move that "end" around a bit and see again what happens.
That usually leads to a solution pretty quickly. Seems that sometimes
randomized algorithms are useful even for programming :-).
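
The "move an end around" approach can be automated. Here is a minimal sketch, assuming a Ruby that ships the Ripper library (1.9 or later) and a file that is missing exactly one `end`. The helper name `end_positions_that_parse` is hypothetical.

```ruby
require "ripper"

# Hypothetical helper: try a lone `end` at every line boundary and return the
# offsets k (number of original lines kept before the inserted `end`) at which
# the whole source parses. Ripper.sexp returns nil when there is a syntax error.
def end_positions_that_parse(source)
  lines = source.lines
  (0..lines.size).select do |k|
    candidate = (lines[0, k] + ["end\n"] + lines[k..-1]).join
    !Ripper.sexp(candidate).nil?
  end
end

sample = <<~RUBY
  class Foo
    def foo
      1
    end
    bar
RUBY

p end_positions_that_parse(sample) # => [1, 2, 3, 4, 5]
```

Every position except the very top of the file parses here, which shows why the parser alone cannot say where the missing `end` belongs: the choice among the candidates is still up to the programmer.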

Regards,    Martin.


Index: parse.y
===================================================================
--- parse.y     (revision 13764)
+++ parse.y     (working copy)
@@ -596,54 +596,55 @@
 /*%
 %token <val>
 %*/
-       keyword_class
-       keyword_module
-       keyword_def
-       keyword_undef
-       keyword_begin
-       keyword_rescue
-       keyword_ensure
-       keyword_end
-       keyword_if
-       keyword_unless
-       keyword_then
-       keyword_elsif
-       keyword_else
-       keyword_case
-       keyword_when
-       keyword_while
-       keyword_until
-       keyword_for
-       keyword_break
-       keyword_next
-       keyword_redo
-       keyword_retry
-       keyword_in
-       keyword_do
-       keyword_do_cond
-       keyword_do_block
+       end_of_file 0   "end-of-file"
+       keyword_class   "class"
+       keyword_module  "module"
+       keyword_def     "def"
+       keyword_undef   "undef"
+       keyword_begin   "begin"
+       keyword_rescue  "rescue"
+       keyword_ensure  "ensure"
+       keyword_end     "end"
+       keyword_if      "if"
+       keyword_unless  "unless"
+       keyword_then    "then"
+       keyword_elsif   "elsif"
+       keyword_else    "else"
+       keyword_case    "case"
+       keyword_when    "when"
+       keyword_while   "while"
+       keyword_until   "until"
+       keyword_for     "for"
+       keyword_break   "break"
+       keyword_next    "next"
+       keyword_redo    "redo"
+       keyword_retry   "retry"
+       keyword_in      "in"
+       keyword_do      "do"
+       keyword_do_cond "do"
+       keyword_do_block        "do"
        keyword_do_LAMBDA
-       keyword_return
-       keyword_yield
-       keyword_super
-       keyword_self
-       keyword_nil
-       keyword_true
-       keyword_false
-       keyword_and
-       keyword_or
-       keyword_not
-       modifier_if
-       modifier_unless
-       modifier_while
-       modifier_until
-       modifier_rescue
-       keyword_alias
-       keyword_defined
-       keyword_BEGIN
-       keyword_END
-       keyword__LINE__
-       keyword__FILE__
+       keyword_return  "return"
+       keyword_yield   "yield"
+       keyword_super   "super"
+       keyword_self    "self"
+       keyword_nil     "nil"
+       keyword_true    "true"
+       keyword_false   "false"
+       keyword_and     "and"
+       keyword_or      "or"
+       keyword_not     "not"
+       modifier_if     "if"
+       modifier_unless "unless"
+       modifier_while  "while"
+       modifier_until  "until"
+       modifier_rescue "rescue"
+       keyword_alias   "alias"
+       keyword_defined "defined"
+       keyword_BEGIN   "BEGIN"
+       keyword_END     "END"
+       keyword__LINE__ "__LINE__"
+       keyword__FILE__ "__FILE__"

 %token <id>   tIDENTIFIER tFID tGVAR tIVAR tCONSTANT tCVAR tLABEL
 %token <node> tINTEGER tFLOAT tSTRING_CONTENT tCHAR



#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst@it.aoyama.ac.jp
David Flanagan (Guest)
on 2007-10-24 09:58
(Received via mailing list)
Thanks for filling these in Martin.  I worry that this is such a simple
thing that there must be some reason it wasn't done before....  Impact
on performance?  Breaking the Ripper code?

It occurs to me that maybe the end-of-file line should just be

   end_of_file 0

instead of

   end_of_file 0 "end-of-file"

that ought to get rid of the misleading quotes that make "end-of-file"
appear like a keyword.  I tried using "EOF" as the token name, but there
was already a macro with that name and that messed everything up.

  David
Martin Duerst (Guest)
on 2007-10-25 09:05
(Received via mailing list)
At 16:57 07/10/24, David Flanagan wrote:
>Thanks for filling these in Martin.

I'm sorry, but I should have compiled these earlier.
I actually get some warnings because some strings (e.g. "if", "do")
appear multiple times.

>I worry that this is such a simple thing that there must be some reason it wasn't done
>before....  Impact on performance?  Breaking the Ripper code?

One reason I suspect is yacc. As an example,
http://dinosaur.compilertools.net/yacc/index.html,
which gives explicit yacc syntax in appendix C, doesn't mention
this facility.

Regards,    Martin.

>       David
#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst@it.aoyama.ac.jp
Robin Stocker (Guest)
on 2007-10-26 00:03
(Received via mailing list)
Martin Duerst schrieb:
> At 16:57 07/10/24, David Flanagan wrote:
>
>> I worry that this is such a simple thing that there must be some reason it wasn't done
>> before....  Impact on performance?  Breaking the Ripper code?
>
> One reason I suspect is yacc. As an example,
> http://dinosaur.compilertools.net/yacc/index.html,
> which gives explicit yacc syntax in appendix C, doesn't mention
> this facility.

Hi all,

I made an attempt to fix this issue a year and a half ago:

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/...

Here's a summary:

- Only Bison understands the syntax with the descriptive name in quotes.
- So I made a patch to the Makefile which strips the "foo" from parse.y
if we aren't using Bison and then feeds that to the parser generator.
- The patch used GNU make extensions which wasn't acceptable, so I gave
it another try with standard shell features:

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/...

- There was no response, I got discouraged and gave up.

Now that there is interest in it again, maybe it can be resolved once
and for all. Would it help if I updated my old patch?

By the way, I can't understand why the core developers apparently aren't
interested in fixing this wart.

Regards,
  Robin Stocker
Nobuyoshi Nakada (nobu)
on 2007-10-26 03:30
(Received via mailing list)
Hi,

At Fri, 26 Oct 2007 07:01:53 +0900,
Robin Stocker wrote in [ruby-core:12946]:
> - The patch used GNU make extensions which wasn't acceptable, so I gave
> it another try with standard shell features:
>
> http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/...
>
> - There was no response, I got discouraged and gave up.

Sorry, I've missed it.

> Now that there is interest in it again, maybe it can be resolved once
> and for all. Would it help if I updated my old patch?

In 1.9, bison is now required for ripper, so it won't be necessary to
strip them.
David Flanagan (Guest)
on 2007-10-26 08:10
(Received via mailing list)
Nobuyoshi Nakada wrote:
>
> Sorry, I've missed it.
>
>> Now that there is interest in it again, maybe it can be resolved once
>> and for all. Would it help if I updated my old patch?
>
> In 1.9, bison is now required for ripper, so it won't be necessary to
> strip them.
>

Well, then.  Attached is an updated version of Robin's patch.  I made
some small changes to some of the token names Robin had picked.  In
particular, I avoided ambiguity when two tokens represented different
uses of the same keyword by adding spaces.  E.g. "if" is the if
statement and "if " is the if modifier. bison won't let me use the same
string for both tokens, but I suspect that simpler is better in error
messages.  Also, added a new end_of_file token hardcoded as token 0
which is what bison seems to use.  I didn't give this one a
double-quoted name because the double quotes get included in the error
messages (with my version of bison at least) and I felt that this:

    unexpected end_of_file

is clearer than:

    unexpected "end of file"

This seems to work okay.  make test works, except for tests in
bootstraptest/test_syntax.rb which explicitly test the content of error
messages.  If this patch is accepted, I'll patch the test cases to
match.

  David

Robin Stocker (Guest)
on 2007-10-26 14:37
(Received via mailing list)
David Flanagan schrieb:
>>> - There was no response, I got discouraged and gave up.
>>
>> Sorry, I've missed it.
>>
>>> Now that there is interest in it again, maybe it can be resolved once
>>> and for all. Would it help if I updated my old patch?
>>
>> In 1.9, bison is now required for ripper, so it won't be necessary to
>> strip them.

Thanks Nobu, that's good news.

> Well, then.  Attached is an updated version of Robin's patch.  I made
> some small changes to some of the token names Robin had picked.  In
> particular, I avoided ambiguity when two tokens represented different
> uses of the same keyword by adding spaces.  E.g. "if" is the if
> statement and "if " is the if modifier. bison won't let me use the same
> string for both tokens, but I suspect that simpler is better in error
> messages.

How do the error messages look for these cases? Are the spaces preserved
and do they show up in the messages?
David Flanagan (Guest)
on 2007-10-26 19:48
(Received via mailing list)
Robin Stocker wrote:
> How do the error messages look for these cases? Are the spaces preserved
> and do they show up in the messages?
>

Yes, the spaces are preserved.

With this patch, if I do ruby -e "class rescue", I get

   -e:1: syntax error, unexpected "rescue "

The space is weird, but I think (or at least I thought last night) that
it is better than this:

   -e:1: syntax error, unexpected "rescue modifier"

To me, the quotes imply that the word "modifier" appeared in the program
text.  (I really wish that bison allowed us to embed our own quotes in
the token names when we wanted to and didn't insert them for us--this is
a bigger problem for tokens that don't map to individual keywords or
operators.)

Maybe a way to avoid the spaces would be to choose some non-printing
ASCII character instead of space.  (like ^G, to ring the terminal bell
on errors :-)

Or, maybe we just use the same string "rescue" for both the statement
and modifier forms of the keyword and just live with the warning that
bison issues.  That is a little scary, though, since apparently the
strings in quotes can be used as alternatives to the token identifiers
in the grammar itself, so there really ought to be a one-to-one mapping.

Another possibility is to hack yyerror to clean up the error messages
for us, possibly stripping the quotation marks (so that we could have
"\"rescue\" modifier" with only the keyword in quotes) and removing any
hacky spaces or non-printing characters we stuck into the token strings
to fool bison.  I haven't done any real C string manipulation complete
with memory allocation and freeing in 10 years, however, so I'm a little
reluctant to attempt this myself...

  David

David Flanagan (Guest)
on 2007-10-26 20:24
(Received via mailing list)
David Flanagan wrote:

> Another possibility is to hack yyerror to clean up the error messages
> for us, possibly stripping the quotation marks (so that we could have
> "\"rescue\" modifier" with only the keyword in quotes and removing any
> hacky spaces or non-printing characters we stuck into the token strings
> to fool bison.  I haven't done any real C string manipulation complete
> with memory allocation and freeing in 10 years, however, so I'm a little
> reluctant to attempt this myself...

Okay, despite my protests of not knowing how to do this, I figured it
out.  The yyerror function already uses ALLOCA_N once to allocate a
string on the stack, so I don't feel bad about using it again.

Now, before printing the error message it strips all double quotes out
of it.  So we can get error messages like "unexpected 'rescue' modifier".
And I can get "end of file" in an error message without quotes.
Furthermore, we can solve the issue of having to have unique strings for
each token by just inserting additional escaped double-quote characters
where I was using spaces before.  They'll be stripped out.

It's kind of hacky, but it makes for nice error messages. Updated patch
is attached.
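
The attached patch is not reproduced in this thread, so the following is only a Ruby sketch of the idea, not the C code in yyerror. The alias strings and the names `ALIASES` and `render` are hypothetical. It assumes, as reported above for bison 2.0, that the generated message wraps each alias in double quotes.

```ruby
# Hypothetical aliases: an extra embedded double quote keeps each string unique
# for bison, and disappears again when the message is cleaned up.
ALIASES = {
  keyword_rescue:  "'rescue'",
  modifier_rescue: "'rescue' modifier",
  keyword_do:      "'do'",
  keyword_do_cond: "'do'\"",
}.freeze

def render(token)
  raw = %(syntax error, unexpected "#{ALIASES.fetch(token)}") # quotes added by older bison
  raw.delete('"')                                             # the cleanup step described above
end

puts render(:modifier_rescue) # prints: syntax error, unexpected 'rescue' modifier
puts render(:keyword_do_cond) # prints: syntax error, unexpected 'do'
```

The two `do` aliases differ as token strings but print identically, which is the property the escaped double quotes are meant to provide.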

  David
Nobuyoshi Nakada (nobu)
on 2007-10-26 20:48
(Received via mailing list)
Hi,

At Wed, 24 Oct 2007 04:15:05 +0900,
David Flanagan wrote in [ruby-core:12886]:
> syntax error, unexpected "end-of-file", expecting "end"
>
> It doesn't tell you where the outermost open block begins, but at least
> the message isn't so cryptic.  I don't know any way to get rid of the
> quotes around "end-of-file".

Which version of bison do you use?  Bison 2.3 seems to strip
the quotes.
David Flanagan (Guest)
on 2007-10-26 21:08
(Received via mailing list)
Nobuyoshi Nakada wrote:
> Which version of bison do you use?  Bison 2.3 seems to strip
> the quotes.
>

I've got bison 2.0 by default on my Fedora Core 4 system.  See my most
recent post on this thread for a yyerror hack that strips the quotes for
older systems like mine.

  David