Ruby language parser in ruby

I’m looking for a ruby language parser written in ruby, that I can hack
to play about with generating other ruby-like languages.

I’ve found RubyParser. Are there any other options I should be looking
at?

  • I see ruby 1.9 has ripper, but it’s written as a C extension. I want
    something in pure ruby.

  • ParseTree is also in C I believe.

  • Possibly could look at the internals of rubinius?

  • Anything else?

Thanks,

Brian.

On Nov 29, 2009, at 03:23 , Brian C. wrote:

I’m looking for a ruby language parser written in ruby, that I can hack
to play about with generating other ruby-like languages.

I’ve found RubyParser. Are there any other options I should be looking
at?

  • I see ruby 1.9 has ripper, but it’s written as a C extension. I want
    something in pure ruby.

well to be fair. ripper is yacc + C. ruby_parser is racc + ruby. Until
someone writes a recursive descent parser, you’ll never have PURE ruby
(and it wouldn’t be quite as maintainable). I’m actually working on
that, but I don’t yet know if I’ll succeed.
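For readers who have not met the term, here is a minimal sketch of a hand-written recursive descent parser in plain Ruby. It handles a toy arithmetic grammar, not Ruby itself, and is not the parser mentioned above. The class name `TinyParser` and the grammar are made up for illustration; `StringScanner` is from the standard library.

```ruby
require 'strscan'

# Toy grammar (hypothetical, nothing like Ruby's real grammar):
#   expr   := term   (('+' | '-') term)*
#   term   := factor (('*' | '/') factor)*
#   factor := NUMBER | '(' expr ')'
class TinyParser
  def initialize(source)
    @scanner = StringScanner.new(source)
  end

  # Returns a nested array such as [:+, 1, [:*, 2, 3]].
  def parse
    tree = expr
    skip_space
    raise ArgumentError, "unexpected input at #{@scanner.pos}" unless @scanner.eos?
    tree
  end

  private

  def skip_space
    @scanner.skip(/\s+/)
  end

  # Each grammar rule becomes one method; repetition becomes a loop.
  def expr
    left = term
    while (op = operator(/[+-]/))
      left = [op.to_sym, left, term]
    end
    left
  end

  def term
    left = factor
    while (op = operator(/[*\/]/))
      left = [op.to_sym, left, factor]
    end
    left
  end

  def factor
    skip_space
    if (digits = @scanner.scan(/\d+/))
      digits.to_i
    elsif @scanner.scan(/\(/)
      inner = expr
      skip_space
      @scanner.scan(/\)/) or raise ArgumentError, "expected ')' at #{@scanner.pos}"
      inner
    else
      raise ArgumentError, "expected number or '(' at #{@scanner.pos}"
    end
  end

  # Consumes and returns an operator matching pattern, or nil.
  def operator(pattern)
    skip_space
    @scanner.scan(pattern)
  end
end

p TinyParser.new("1 + 2 * 3").parse    # => [:+, 1, [:*, 2, 3]]
p TinyParser.new("(1 + 2) * 3").parse  # => [:*, [:+, 1, 2], 3]
```

There is no generator step: the grammar lives directly in the methods, which is what makes this style "pure" in the target language.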

  • ParseTree is also in C I believe.

and doesn’t parse. It just uses ruby to parse and grabs the internal
ast.

  • Possibly could look at the internals of rubinius?

Was using ruby_parser, but now is also C… you know… ruby in ruby. :confused:

I think your best bet at this time is ruby_parser.

On Sun, Nov 29, 2009 at 7:17 AM, Ryan D. wrote:

something in pure ruby.

well to be fair. ripper is yacc + C. ruby_parser is racc + ruby. Until someone writes a recursive descent parser, you’ll never have PURE ruby (and it wouldn’t be quite as maintainable). I’m actually working on that, but I don’t yet know if I’ll succeed.

That would be cool if it’s possible. Old Smalltalkers love to see
recursive descent parsers written in the target language!


Rick DeNatale

Blog: http://talklikeaduck.denhaven2.com/
Twitter: http://twitter.com/RickDeNatale
WWR: http://www.workingwithrails.com/person/9021-rick-denatale
LinkedIn: http://www.linkedin.com/in/rickdenatale

Ryan D. wrote:

well to be fair. ripper is yacc + C. ruby_parser is racc + ruby. Until
someone writes a recursive descent parser, you’ll never have PURE ruby

That’s fine; if it’s using a standard parser generator I don’t need to
hack that, just the language itself.

I think your best bet at this time is ruby_parser.

Sounds good to me, thank you.

Marnen Laibow-Koser wrote:

Rubinius, perhaps? Or make your own with Treetop?

I’m pretty familiar with CFGs and tools like yacc, or at least, I was
some years ago. What bothers me about ruby syntax is that there are a
number of aspects that I’m not sure how to map to a CFG, for example

puts (a).abs

being different from

puts(a).abs

… or,

a = b
+ c

being treated differently to

a = b +
c

So basically I was looking to steal ideas how to deal with these sorts
of cases.
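One way to see how an existing parser treats these cases is to ask Ripper (the C-backed parser shipped with Ruby 1.9 and later) for its s-expression. This is only an inspection sketch, not a pure-Ruby solution. The helper names `first_statement_type` and `statement_count` are made up for illustration.

```ruby
require 'ripper'

# Type of the first top-level statement in Ripper's s-expression,
# or nil if the source does not parse.
def first_statement_type(source)
  sexp = Ripper.sexp(source)
  sexp && sexp[1][0][0]
end

# Number of top-level statements the parser sees, or nil on a parse error.
def statement_count(source)
  sexp = Ripper.sexp(source)
  sexp && sexp[1].length
end

# Space before the paren: (a).abs is the argument of puts.
p first_statement_type("puts (a).abs")  # => :command
# No space: .abs is called on the result of puts(a).
p first_statement_type("puts(a).abs")   # => :call

# Operator at the start of the next line: two statements (a = b, then +c).
p statement_count("a = b\n+ c")         # => 2
# Trailing operator: the expression continues onto the next line.
p statement_count("a = b +\nc")         # => 1
```

Both distinctions are made in the lexer, which tracks state such as "a space was just seen" or "an operand is still expected". That is why they are awkward to express in a plain CFG.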

Indeed, an existing grammar for ruby would be a good starting point, and
googling for this I see there’s a project “rubygrammar”, with Charles
Nutter as one of the admins. Anyone know what state this is in? I see 61
commits in the repo, nothing in the last three years, and the grammar
looks too simple to be true :slight_smile:

Brian C. wrote:

I’m looking for a ruby language parser written in ruby, that I can hack
to play about with generating other ruby-like languages.

I’ve found RubyParser. Are there any other options I should be looking
at?

Rubinius, perhaps? Or make your own with Treetop?

Best,

Marnen Laibow-Koser
http://www.marnen.org

Maglev’s got one:

Jason

On Nov 29, 2009, at 10:47 , Jason R. wrote:

Maglev’s got one:

maglev/src/kernel/parser at master · MagLev/maglev · GitHub

also ruby_parser… WAY WAY WAY hacked up, but ruby_parser.

I’d like to incorporate some of their ideas back into ruby_parser, but
it is sooo different, that merging is nearly impossible.

On Nov 29, 2009, at 10:29 , Marc H. wrote:

Was using ruby_parser, but now is also C… you know… ruby in ruby. :confused:

Awww…

I still thought rubinius was going with the pure ruby approach. :frowning:

yeah. well. They left that idea FAR behind a long time ago. Sad really,
and I don’t think there are any plans to push towards a more pure ruby
approach.

Was using ruby_parser, but now is also C… you know… ruby in ruby. :confused:

Awww…

I still thought rubinius was going with the pure ruby approach. :frowning:

Brian C. wrote:

I’m looking for a ruby language parser written in ruby, that I can hack
to play about with generating other ruby-like languages.

  • Anything else?

redparse

-r

On Sunday 29 November 2009 03:25:17 pm Ryan D. wrote:

On Nov 29, 2009, at 10:29 , Marc H. wrote:

Was using ruby_parser, but now is also C… you know… ruby in ruby. :confused:

Awww…

I still thought rubinius was going with the pure ruby approach. :frowning:

yeah. well. They left that idea FAR behind a long time ago.

I’m not sure they wanted to do that for everything, from the parsing to
the VM, though that would’ve been cool.

Probably the thing I love most about Rubinius, as it was when I last
looked, is that it fulfills some of the original promise of Ruby. When I
first started looking at Ruby, people said “Most of Ruby is written in
Ruby.” I took this to mean a very lightweight language with a huge
standard library. Well, much of the standard library, and most of the
core libraries, are written in C, for performance reasons, significantly
limiting what I can do in Ruby.

For example, back when I was writing autoloader, I can’t remember why,
but I really wanted Kernel#autoload to actually use Kernel#require when
the module is accessed. Ideally, I wanted something like this:

autoload :FooBar do |name|
  require 'foo_bar'
end

Unfortunately, Kernel#autoload doesn’t do that. Last I checked, it
directly calls the C code that Kernel#require maps onto – at the time,
this effectively bypassed Rubygems as a whole.

It was to the point where to create an alternate way to autoload stuff,
I was going to have to hack the Ruby source, in C.

…really?

Contrast this to Rubinius. The source was clean and readable, and I was
easily able to figure out how to accomplish what I wanted, in pure Ruby,
without modifying any of the source.
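As a rough illustration of the block-style autoload wished for above, here is one way to approximate it in plain Ruby with the `const_missing` hook. This is a sketch under assumptions, not the autoloader code discussed here and not how Kernel#autoload or Rubinius work. The `LazyConstants` module and its methods are hypothetical names. To stay self-contained, the block defines the constant inline where real code would call `require`.

```ruby
# Hypothetical registry mapping constant names to loader blocks.
module LazyConstants
  @loaders = {}

  class << self
    def register(name, &loader)
      @loaders[name.to_sym] = loader
    end

    # Runs and forgets the loader for name; returns false if there is none.
    def resolve(name)
      loader = @loaders.delete(name.to_sym)
      return false unless loader
      loader.call(name)
      true
    end
  end
end

class Object
  # Called by Ruby when a constant lookup fails in a class or at top level.
  def self.const_missing(name)
    if LazyConstants.resolve(name) && Object.const_defined?(name)
      Object.const_get(name)
    else
      super # no loader registered: fall back to the usual NameError
    end
  end
end

LazyConstants.register(:FooBar) do |name|
  # Real code would do `require 'foo_bar'` here, going through whatever
  # Kernel#require has been overridden to do (for example by Rubygems).
  Object.const_set(name, Module.new)
end

p FooBar.is_a?(Module)  # => true, defined on first access
```

Unlike the built-in autoload, this only fires after normal lookup has failed, `const_defined?` reports false until the first access, and lookups that start inside a plain `module` body bypass the hook on `Object`.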

By the way: It doesn’t have to be slower. Remember Google’s Javascript
engine, v8? It implements the Javascript standard library in Javascript,
yet v8 still wins in the benchmarks against other, more conservative
Javascript implementations.

On Nov 30, 2009, at 01:17 , David M. wrote:

I’m not sure they wanted to do that for everything, from the parsing to the
VM, though that would’ve been cool.

That was my understanding when I started working professionally on it.

Probably the thing I love most about Rubinius, as it was when I last looked,
is that it fulfills some of the original promise of Ruby. When I first started looking at Ruby, people said “Most of Ruby is written in Ruby.”

I’ve never ever heard such a thing uttered about ruby. It is beyond
obvious when you look at any version of the tarball.

Last time I looked at redparse I couldn’t even get the tests to run.
Only through lots of hacking did I get it executing and that was just to
see how fast it was. Caleb found a horrible ruby package that both
redparse and ruby_parser choke on (think: it’ll finish right around the
time of heat death of the universe… and will probably contribute
greatly to it). Otherwise, it didn’t seem to be faster or better in any
area (esp given how much work it was to get it to work at all).

Things may have changed since then.

LOL.

I know he’s working on making it 1.9 compatible, but I think it’s stable
for 1.8. Its niche is that it’s written in “human generated” ruby.
-r

On Nov 30, 2009, at 10:17 , Roger P. wrote:

Brian C. wrote:

I’m looking for a ruby language parser written in ruby, that I can hack
to play about with generating other ruby-like languages.

  • Anything else?

redparse

Last time I looked at redparse I couldn’t even get the tests to run.
Only through lots of hacking did I get it executing and that was just to
see how fast it was. Caleb found a horrible ruby package that both
redparse and ruby_parser choke on (think: it’ll finish right around the
time of heat death of the universe… and will probably contribute
greatly to it). Otherwise, it didn’t seem to be faster or better in any
area (esp given how much work it was to get it to work at all).

Things may have changed since then.

Mason K. wrote:

Since I am taking some graduate level course in artificial intelligence
at UCF, under Dr. Fernando Gomez, this is also a topic of interest to me.

Although many people find programming in Ruby more natural than
programming in other languages, I’ve never heard it lumped in with
‘natural languages’ before…

Since I am taking some graduate level course in artificial intelligence
at UCF, under Dr. Fernando Gomez, this is also a topic of interest to me.
The only languages I’ve seen that are currently being used are Lisp,
Java, and Python. If you Google Natural Language Tool Kit, NLTK, you will
find the Brill tagger and parsers written in Python. Fortunately, if you
know Ruby, Python is just a step away, not a big transition. No sense in
reinventing the wheel again.

If you go with Python, I recommend the book “Natural Language Processing
with Python” by Bird, Klein, and Loper. The book tells you how to use the
Natural Language Tool Kit, which you download from
http://www.nltk.org/download. http://www.nltk.org/getting-started tells
you how to get started, guides for the code are found at
http://nltk.googlecode.com/svn/trunk/doc/howto/index.html, and
http://nltk.googlecode.com/svn/trunk/doc/howto/tag.html shows you how to
use the taggers. You will need to download numpy.py and import it also,
although it is only suggested in the descriptions for the tagger.

And as a double plus you can read the twitters from Dr. Hugo Lui.
@dochugo

No Sam

Brian C. wrote:

I’m looking for a ruby language parser written in ruby, that I can hack
to play about with generating other ruby-like languages.

I’ve found RubyParser. Are there any other options I should be looking
at?

  • I see ruby 1.9 has ripper, but it’s written as a C extension. I want
    something in pure ruby.

  • ParseTree is also in C I believe.

  • Possibly could look at the internals of rubinius?

  • Anything else?

Thanks,

Brian.

irb does quite a good job. :wink:

daz

daz wrote:

Brian C. wrote:

I’m looking for a ruby language parser written in ruby

irb does quite a good job. :wink:

Interesting. I’d forgotten that irb would have to parse multi-line input
before handing it off to eval. Thanks!
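The idea of buffering lines until they form something evaluable can be sketched as follows. This is not irb's actual code. It leans on `Ripper.sexp` returning nil for input that does not parse, which is only a crude stand-in for "unfinished": a genuine syntax error would also keep the buffer growing. The helper names `parses?` and `each_complete_chunk` are made up for illustration.

```ruby
require 'ripper'

# True when Ripper can build a tree for the source; unfinished input
# (an open def, a dangling operator) yields nil from Ripper.sexp.
def parses?(source)
  !Ripper.sexp(source).nil?
end

# Accumulates lines and yields each chunk as soon as it parses,
# roughly the point at which a REPL would hand the chunk to eval.
def each_complete_chunk(lines)
  return enum_for(:each_complete_chunk, lines) unless block_given?

  buffer = +""
  lines.each do |line|
    buffer << line << "\n"
    next unless parses?(buffer)

    yield buffer
    buffer = +""
  end
end

input = ["def twice(x)", "  x * 2", "end", "twice(21)"]
each_complete_chunk(input) { |chunk| puts chunk.inspect }
# "def twice(x)\n  x * 2\nend\n"
# "twice(21)\n"
```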

A minor problem with the redparse gem is that it gives most of the files
root-only permissions. I just did ‘sudo gem install redparse’

$ pwd
/usr/lib/ruby/gems/1.8/gems/redparse-0.8.3
$ ls -l lib
total 96
drwxr-xr-x 2 root root 4096 2009-12-04 14:07 redparse
-rwx------ 1 root root 89730 2009-12-04 14:07 redparse.rb
$ ls -l lib/redparse
total 188
-rwx------ 1 root root 2542 2009-12-04 14:07 babynodes.rb
-rwx------ 1 root root 6348 2009-12-04 14:07 babyparser.rb
-rwx------ 1 root root 10545 2009-12-04 14:07 decisiontree.rb
-rw-r--r-- 1 root root 13385 2009-12-04 14:07 generate.rb
-rwxr-xr-x 1 root root 134917 2009-12-04 14:07 node.rb
-rwxr--r-- 1 root root 2041 2009-12-04 14:07 problemfiles.rb
-rwx------ 1 root root 2664 2009-12-04 14:07 reg_more_sugar.rb
-rwxr--r-- 1 root root 40 2009-12-04 14:07 version.rb

Easily fixable of course:

sudo bash
find . -type f | xargs chmod +r
find . -type d | xargs chmod +rx

Most of those .rb files don’t need +x either, since they don’t have a
shebang line.

Looking at this code - I don’t think I would dare hack it. I think
ruby_parser is more what I was looking for.