Forum: Ruby ruby_parser 3.4.0 Released

Ryan Davis (Guest)
on 2014-02-05 04:20
(Received via mailing list)
ruby_parser version 3.4.0 has been released!

* home: <>
* bugs: <>
* rdoc: <>

ruby_parser (RP) is a ruby parser written in pure ruby (utilizing
racc--which does by default use a C extension). RP's output is
the same as ParseTree's output: s-expressions using ruby's arrays and
base types.

As an example:

    def conditional1 arg1
      return 1 if arg1 == 0
      return 0
    end

becomes:

    s(:defn, :conditional1, s(:args, :arg1),
      s(:if,
        s(:call, s(:lvar, :arg1), :==, s(:lit, 0)),
        s(:return, s(:lit, 1)),
        nil),
      s(:return, s(:lit, 0)))
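
To make the "arrays and base types" point concrete, here is a minimal sketch that does not use ruby_parser at all. `MiniSexp`, the local `s` helper, and `node_types` are hypothetical stand-ins written for this illustration; the real `Sexp` class lives in the sexp_processor gem and offers far more.

```ruby
# Hypothetical stand-in for Sexp: an Array subclass whose first
# element names the node type.
class MiniSexp < Array
  def node_type
    first
  end

  # Print in the s(...) notation used in the example above.
  def inspect
    "s(#{map(&:inspect).join(', ')})"
  end
end

# Shorthand constructor mirroring the s(...) notation.
def s(*args)
  MiniSexp.new(args)
end

# The comparison sub-expression from the example above.
tree = s(:call, s(:lvar, :arg1), :==, s(:lit, 0))

# Collect every node type in the tree, depth first.
def node_types(sexp)
  sexp.each_with_object([sexp.node_type]) do |child, acc|
    acc.concat(node_types(child)) if child.is_a?(MiniSexp)
  end
end

puts tree.inspect
p node_types(tree) # => [:call, :lvar, :lit]
```

Because each node is just an array, ordinary `Array` methods such as `each`, `map`, and indexing work on it directly, which is what makes this output format easy to post-process.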

Tested against 801,039 files from the latest of all rubygems (as of

* 1.8 parser is at 99.9739% accuracy, 3.651 sigma
* 1.9 parser is at 99.9940% accuracy, 4.013 sigma
* 2.0 parser is at 99.9939% accuracy, 4.008 sigma
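
The announcement does not say how the sigma figures are derived. One reading that reproduces the numbers above is the two-sided normal coverage, i.e. accuracy = erf(sigma / sqrt(2)). The sketch below checks that assumption; `accuracy_for_sigma` is a name made up for this illustration.

```ruby
# Assumption: "N sigma" means the fraction of a normal distribution
# lying within +/- N standard deviations of the mean.
def accuracy_for_sigma(sigma)
  Math.erf(sigma / Math.sqrt(2))
end

# Sigma values as listed in the announcement.
{ "1.8" => 3.651, "1.9" => 4.013, "2.0" => 4.008 }.each do |parser, sigma|
  printf("%s parser: %.3f sigma ~ %.4f%%\n",
         parser, sigma, 100 * accuracy_for_sigma(sigma))
end
```

If the assumption holds, the printed percentages should agree with the listed accuracies to within rounding.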


### 3.4.0 / 2014-02-04

* 1 major enhancement:

  * Replaced hand-written/optimized f'd-up lexer with an oedipus_lex
    generated lexer. This makes it roughly 40-50% faster.

* 30 minor enhancements:

  * 2.0: Added support for a.b c() do d end.e do |f| g end
  * 2.0: Added support for a.b c() do d end.e f do |g| h end
  * Added -s flag to ruby_parse_extract_error to output timings.
  * Added RubyLexer #command_state and #last_state to deal with
    oedipus_lex differences.
  * Added String#lineno and #lineno= because I'm a bad bad person.
  * Added a bunch of RubyLexer scanning methods: beginning_of_line?,
    check, scan, etc.
  * Added a bunch of process_* methods extracted from old yylex.
    process_amper, etc.
  * Added lib/.document to save my laptop's battery from pain and
  * Adjust lineno when we lex a bunch of blank lines.
  * Attach lineno to tIDENTIFIER values (strings, ugh)
  * Cleaned up and re-ordered node_assign to be faster (ordered by
    actual occurrence).
  * Extend RubyParserStuff#gettable to set the lineno if it comes in
    with the id.
  * Extended RubyParserStuff#new_case to take line number.
  * Finally dropped RPStringScanner's BS #current_line.
  * Finally dropped RPStringScanner's BS line number calculation
  * Implemented Sexp#add_all since we now have a test case for it.
  * Removed :call case of node_assign. I don't think it is possible.
  * Removed RubyLexer #extra_lines_added. No longer used. Complex
    heredoc lineno's possibly screwed up.
  * Removed RubyLexer#parse_number. Handled by oedipus_lex.
  * Removed RubyLexer#yacc_value now that next_token returns pairs.
  * Removed RubyLexer's @src. Now taken care of by oedipus_lex.
  * Removed RubyParser#advance. RubyParser#next_token takes care of
    everything now.
  * Removed RubyParserExtras#arg_add. (presidentbeef! YAY!)
  * Removed lib/gauntlet_rubyparser.rb. I just don't use it anymore. Too
  * RubyLexer#is_label_possible? doesn't need an arg
  * RubyLexer#process_token is now a normal oedipal lexer method.
  * RubyParser#next_token now expects RubyLexer#next_token to return a
    pair (type, val).
  * TRYING a new scheme to figure out encodings... but I'm about to
    throw in the towel. I hate this stuff so much.
  * Turned off oedipus_lex's automatic line counting. (pushing to
    oedipus_lex soon).
  * Updated to oedipus_lex 2.1+.

* 7 bug fixes:

  * 1.8: Properly parse `a (:b, :c, :d => :e)`. (presidentbeef)
  * Fixed lexing symbol!= vs symbol!. Please use your spacebar. Think of
    the children.
  * Fixed line for dstr spanning multiple lines via backslash.
  * Fixed line numbers for odd cases with trailing whitespace.
  * Fixed line numbers on ambiguous calls w/ gvar/ivar args.
  * Max out unicode hex values to 2-4 or 2-6 chars or pack will overflow
    and puke.
  * Removed ESC_RE from RubyLexer. Must have slipped through.
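
Several of the lexer changes above revolve around one contract: the lexer's next_token hands the parser a (type, val) pair, and line counting is done by hand rather than by the generated lexer. The following is only a rough sketch of that shape using StringScanner from Ruby's standard library. `TinyLexer`, its rule table, the `:tOP` token type, the `tokens` helper, and the choice to return `nil` at end of input are all assumptions made for illustration; none of it is code from RubyLexer or oedipus_lex.

```ruby
require "strscan"

# Hypothetical toy lexer: each rule maps a regexp to a token type,
# and next_token returns a [type, value] pair.
class TinyLexer
  RULES = [
    [/\d+/,         :tINTEGER],
    [/[a-z_]\w*/,   :tIDENTIFIER],
    [/==|[-+*\/=]/, :tOP], # :tOP is made up for this sketch
  ].freeze

  attr_reader :lineno

  def initialize(src)
    @ss = StringScanner.new(src)
    @lineno = 1
  end

  def next_token
    # Skip whitespace, counting newlines manually.
    if (ws = @ss.scan(/\s+/))
      @lineno += ws.count("\n")
    end
    return nil if @ss.eos? # assumption: nil signals end of input

    RULES.each do |re, type|
      text = @ss.scan(re)
      return [type, text] if text
    end
    raise "unexpected input at line #{@lineno}: #{@ss.rest[0, 10].inspect}"
  end
end

# Drain the lexer the way a parser loop might.
def tokens(src)
  lexer = TinyLexer.new(src)
  out = []
  while (pair = lexer.next_token)
    out << pair
  end
  out
end

p tokens("a = 1\nb == a")
```

Returning the pair in one call means the consumer needs no side channel (such as a separate value accessor on the lexer) to find out what was just scanned.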