Forum: Ruby RedParse 0.8.0 released

Caleb Clausen (Guest)
on 2008-10-23 03:41
(Received via mailing list)
= RedParse


RedParse is a ruby parser written in pure ruby. Instead of YACC or
ANTLR, its parse tool is a home-brewed "compiler-interpreter". (The
tool is LALR(1)-equivalent and the 'parse language' is pretty nice,
even in its current crude form.)

My intent is to have a completely correct parser for ruby, in 100%
ruby. It's not all there yet, but I'm getting pretty close. Currently,
RedParse can parse slightly in excess of 99% of ruby files found in
the wild. For known problems, see below.

* RedParse requires RubyLexer, my hand-coded lexer for ruby. It also
  uses Reg (a pattern-matcher). RubyLexer depends on Sequence
  (external iterators). Reg depends on Sequence's predecessor, Cursor,
  although Cursor isn't used at all in RedParse. The (long-delayed) next
  version of Reg will use Sequence. To summarize:
  *  RedParse 0.9.0 requires RubyLexer>=0.7.1 and Reg>=0.4.7
  *  RubyLexer 0.7.1 requires Sequence>=0.2.0
  *  Reg 0.4.7 requires Cursor (not really needed here)
* All are available as gems. (Or tarballs on rubyforge, if you must.)

* gem install redparse #(as root if necessary)
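
The version constraints summarized above can be checked mechanically with RubyGems' own Gem::Requirement and Gem::Version classes. This is only a minimal sketch: the "installed" versions in the hash are made-up values for illustration, and the helper name unmet is hypothetical, not part of RedParse or RubyGems.

```ruby
require 'rubygems'

# Constraints copied from the dependency summary above.
REQUIREMENTS = {
  'rubylexer' => '>= 0.7.1',
  'reg'       => '>= 0.4.7',
  'sequence'  => '>= 0.2.0'
}

# Hypothetical installed versions, for illustration only.
installed = {
  'rubylexer' => '0.7.1',
  'reg'       => '0.4.6',
  'sequence'  => '0.2.0'
}

# Returns the names of gems that are missing or too old for their constraint.
def unmet(requirements, installed)
  requirements.keys.reject do |name|
    version = installed[name]
    version && Gem::Requirement.new(requirements[name]).satisfied_by?(Gem::Version.new(version))
  end
end

puts unmet(REQUIREMENTS, installed).inspect
```

With the sample values, only the Reg entry falls below its minimum, so it is the one reported.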


RedParse is available under the Library General Public License (LGPL).
Please see COPYING.LGPL for details.

== Benefits:

* Pure ruby, through and through. No part is written in C, YACC,
  ANTLR, lisp, assembly, intercal, befunge or any other language
  except ruby.
* Pretty AST trees (at least, I think so). (To program for, not
  necessarily to look at.)
* AST trees closely mirror the actual structure of source code.
* ParseTree format output too, if you want that.
* Did I mention that there's no YACC at all? YACC grammars are
  notoriously difficult to modify, (I've never successfully done it)
  but I've found it easy, at times even pleasant to modify the parse
  rules of this grammar as necessary.
* Relatively small parser: 70 rules in 240 lines
  (vs (by my count) 320 rules in 2200 lines for MRI 1.8.7. This is
  by no means a fair comparison, though, since RubyLexer does a lot
  more than MRI's lexer, and MRI's 2200 lines include its
  actions (which occupy somewhere under 3100 lines in RedParse).
  Also, what is a rule? I counted most things which required a
  separate action in MRI's parser; I'm not sure if that's fair.
  But in the end, I think RedParse is still much easier to
  understand than MRI's parse.y.)
* "loosey-goosey" parser happily parses many expressions which normal
  ruby considers errors.

== Drawbacks:

* Pathetically, ridiculously slow (to be addressed soon).
* Error handling is very minimal right now.
* No warnings at all.
* Some expressions aren't parsed correctly. See below.
* Line numbers in ParseTrees not supported yet.
* AST tree format is not finalized yet.
* Unit test takes a fairly long time.
* Lots of warnings printed during unit test.
* Debugging parse rules is not straightforward.
* No support for ruby 1.9.
* No support for any charset but ascii (until rubylexer gets it).
* "loosey-goosey" parser happily parses many expressions which normal
  ruby considers errors.


  #simple example of usage:

  require 'redparse'

  parser=RedParse.new("some ruby code here")
  tree=parser.parse

  tree.walk{|parent,i,subi,node|
    case node
    when CallNode: #... do something with method calls
    when AssignNode: #... maybe alter assignments somehow
    #.... and so on
    end
  }

  #presumably tree was altered somehow in the walk-"loop" above
  #when done mucking with the tree, you can turn it into one
  #of two other formats: ParseTree s-exps or (experimental)
  #ruby source code.

  tree.to_parsetree #=> turns a tree into a ParseTree-style s-exp.

  tree.unparse({})  #=> turns a tree back into ruby source code.

  #to understand the tree format, you must understand the node classes,
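
To make the case/when dispatch in the usage example concrete without depending on RedParse itself, here is a minimal sketch built on stand-in classes. Node, CallNode, AssignNode and the one-argument walk below are toy definitions written for this illustration only; RedParse's real node classes and its walk signature may well differ.

```ruby
# Toy stand-ins, NOT RedParse's classes: a node holds a name and child nodes.
class Node
  attr_reader :name, :children

  def initialize(name, *children)
    @name = name
    @children = children
  end

  # Pre-order traversal: yield this node, then every descendant.
  def walk(&block)
    block.call(self)
    children.each { |child| child.walk(&block) }
  end
end

class CallNode < Node; end
class AssignNode < Node; end

# Roughly the shape of:  x = foo(bar)
tree = AssignNode.new('x', CallNode.new('foo', CallNode.new('bar')))

calls = []
assigns = []
tree.walk do |node|
  case node
  when CallNode then calls << node.name      # collect method calls
  when AssignNode then assigns << node.name  # collect assignment targets
  end
end

puts "calls: #{calls.join(', ')}"
puts "assignments: #{assigns.join(', ')}"
```

The point of dispatching on the node's class is that each kind of syntax gets its own branch, so a tool only has to handle the node types it cares about and can ignore the rest.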

== Known failing expressions
*  The following expressions are currently known to parse incorrectly:
* 1, V
* "#{}"""
* $11111111111111111111111111111111111111111111111111111111111111111111
* begin;mode;rescue;o_chmod rescue nil;end
* case F;when G; else;case; when j; end;end
* def i;"..#{@@c = 1}";end
* e { |c|; print "%02X" % c }
* {|f|  ;  }
* %W(white\  \  \ \  \ space).should == ["white ", " ", "  ", " space"]
* %w[- \\ e]
* %w[- \\ ]
* module A; b; rescue C=>d; e; else g; ensure f; end
* class A; b; rescue C=>d; e; else g; ensure f; end
* class<<A; b; rescue C=>d; e; else g; ensure f; end

== Homie doan' play dat
* These expressions don't parse the same as in MRI, I believe because of
  bug(s) in MRI.
* p = p m %(1)
* p=556;p (e) /a