Looking for an appropriate parser generator

Hi, I’m new to this mailing list, so hello to all.

I want to write a SIP protocol parser in Ruby (the SIP protocol is very
similar to HTTP), so I’m looking for the most appropriate parser
generator for Ruby. It’s also the first time I’ve tried something like
this, so I first need to do a lot of reading (yacc, lex, LALR, LL, BNF…).

I’d just like to know which parser is a good option for my purpose,
since I’ve found “too many” of them:

  • rex
  • ruby-lex
  • racc
  • ruby-yacc
  • coco-rb
  • TreeTop
  • Ragel

For now I’m not looking for the fastest or most efficient parser, maybe
just the easiest one. I’ll read about Yacc and Lex over the next few
days, since I assume they are the basis of all of them, and I know that
Yacc is typically used in conjunction with Lex, so:
If I choose Racc, will I need Rex?
If I choose Ruby-Yacc, will I need Ruby-Lex?
Note that I want to get Ruby code, not C, C++ or others.
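
To make the Lex/Yacc split concrete: the lexer half only turns raw text into tokens, and the parser half consumes them. As far as I know, Racc generates only the parser and expects tokens as [type, value] pairs, ending with a false-typed pair, so a lexer has to come from somewhere, either a tool or hand-written code. Below is a minimal hand-written sketch using the standard library's StringScanner. The method name tokenize and the token names :TOKEN, :NUMBER and :COLON are made up for illustration, and the token rules are a simplification, not the SIP grammar.

```ruby
require "strscan"

# Split a SIP-style header line into [type, value] token pairs.
# Token names and rules are assumptions made for this sketch.
def tokenize(line)
  scanner = StringScanner.new(line)
  tokens = []
  until scanner.eos?
    if scanner.skip(/[ \t]+/)
      next # whitespace only separates tokens
    elsif (text = scanner.scan(/\d+/))
      tokens << [:NUMBER, text.to_i]
    elsif (text = scanner.scan(/[A-Za-z][A-Za-z0-9\-.]*/))
      tokens << [:TOKEN, text]
    elsif (text = scanner.scan(/:/))
      tokens << [:COLON, text]
    else
      raise ArgumentError, "unexpected character at offset #{scanner.pos}"
    end
  end
  tokens << [false, false] # end-of-input marker in the style Racc expects
end

p tokenize("CSeq: 314159 INVITE")
```

A generated parser would then pull these pairs one at a time instead of looking at the raw string.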

My purpose is just to receive SIP requests (similar to HTTP requests)
and parse them (maybe into Ruby objects) so I can work with them (to
implement a SIP stack).

There are lots of options and I’m getting a little “lost” with so much
documentation to read. Any guidance, please?

Thanks a lot and best regards.

On 15 Mar 2008, at 23:21, Iñaki Baz C. wrote:

I’d just like to know which parser is a good option for my purpose

My purpose is just to receive SIP requests (similar to HTTP requests)
and parse them (maybe into Ruby objects) so I can work with them (to
implement a SIP stack).

There are lots of options and I’m getting a little “lost” with so much
documentation to read. Any guidance, please?

The SIP protocol is in principle simple enough that you could probably
get by rolling your own custom parser by hand. My colleague
implemented a console-driven SIP stack this way and whilst the code is
inefficient compared to a table-driven parser it’s eminently more
readable.
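
As a rough illustration of the hand-rolled approach, here is a minimal sketch in plain Ruby. It is not my colleague's code. The names SipRequest and parse_sip_request are hypothetical, and the sketch assumes a single request in one string with CRLF line endings, no header line folding and no compact header names.

```ruby
# Hypothetical container for a parsed request; the field names are illustrative.
SipRequest = Struct.new(:sip_method, :uri, :version, :headers, :body)

# Parse a raw SIP request string into a SipRequest.
# Deliberately simplified: header values are kept as unparsed strings.
def parse_sip_request(raw)
  head, body = raw.split("\r\n\r\n", 2)
  raise ArgumentError, "empty message" if head.nil? || head.empty?

  lines = head.split("\r\n")
  request_line = lines.shift

  # Request-Line = Method SP Request-URI SP SIP-Version
  match = /\A([A-Z]+) (\S+) (SIP\/\d+\.\d+)\z/.match(request_line)
  raise ArgumentError, "malformed request line: #{request_line.inspect}" unless match

  # A header may appear several times, so collect values in arrays.
  headers = Hash.new { |hash, key| hash[key] = [] }
  lines.each do |line|
    name, value = line.split(":", 2)
    raise ArgumentError, "malformed header: #{line.inspect}" if value.nil?
    headers[name.strip.downcase] << value.strip
  end

  SipRequest.new(match[1], match[2], match[3], headers, body.to_s)
end

raw = "INVITE sip:bob@example.com SIP/2.0\r\n" \
      "Via: SIP/2.0/UDP pc33.example.com;branch=z9hG4bK776asdhds\r\n" \
      "To: Bob <sip:bob@example.com>\r\n" \
      "CSeq: 314159 INVITE\r\n" \
      "Content-Length: 0\r\n" \
      "\r\n"

request = parse_sip_request(raw)
puts request.sip_method
puts request.headers["cseq"].first
```

The readability comes from the code following the message layout directly: request line first, then one header per line, then the body.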

However, if you do want to go down the parser-generator route for
performance reasons, then look at the Ragel parser in Mongrel, as that’s
very efficient and highly conformant with the HTTP RFC specs (given
some of the boundary cases in SIP, conformance is probably high on your
list of priorities). As far as I’m aware Ragel produces parsers in C,
but the Mongrel code will show you how to turn that into a Ruby native
extension.

Ellie

Eleanor McHugh
Games With Brains

raise ArgumentError unless @reality.responds_to? :reason

On Sunday, 16 March 2008, Eleanor McHugh wrote:

The SIP protocol is in principle simple enough

Oops, it’s certainly not simple at all: a lot of ambiguities, each
header has its own format, lots of extensions in lots of RFCs… phew…
simple?
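
To show what “each header has its own format” means in practice, one possible approach is a lookup table of per-header value parsers with a fallback to the raw string. This is only a sketch: the names HEADER_PARSERS and parse_header_value are hypothetical, and the three value formats below are simplified assumptions, not the full RFC grammar.

```ruby
# Hypothetical table mapping lower-cased header names to value parsers.
# Headers that are not listed are returned as plain strings.
HEADER_PARSERS = {
  "cseq" => ->(value) {
    number, sip_method = value.split(" ", 2)
    { number: Integer(number, 10), method: sip_method }
  },
  "content-length" => ->(value) { Integer(value, 10) },
  "max-forwards"   => ->(value) { Integer(value, 10) }
}.freeze

# Look up the parser for a header name; fall back to the stripped raw value.
def parse_header_value(name, value)
  parser = HEADER_PARSERS[name.downcase]
  parser ? parser.call(value.strip) : value.strip
end

p parse_header_value("CSeq", "314159 INVITE")
p parse_header_value("Content-Length", " 0 ")
p parse_header_value("Subject", " lunch ")
```

Every extension RFC that defines a new header would add another entry to such a table, which is where the amount of work comes from.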

that you could probably
get by rolling your own custom parser by hand.

That’s what I’m doing for now, but maybe it’s better to use an existing
one. No idea.

but the Mongrel code will show you how to turn that into a Ruby native
extension.

AFAIK Ragel has been able to generate Ruby code since version 6.0:
devchix.com at Directnic

Thanks a lot, I’ll try Ragel.

On 16 Mar 2008, at 02:42, Iñaki Baz C. wrote:

On Sunday, 16 March 2008, Eleanor McHugh wrote:

The SIP protocol is in principle simple enough

Oops, it’s certainly not simple at all: a lot of ambiguities, each
header has its own format, lots of extensions in lots of RFCs… phew…
simple?

I’ve probably spent too much time hanging around SIP obsessives ;)

AFAIK Ragel has been able to generate Ruby code since version 6.0:
devchix.com at Directnic

Thanks a lot, I’ll try Ragel.

I’ll have to take a look at that myself.

Ellie

On Sat, Mar 15, 2008 at 6:21 PM, Iñaki Baz C. wrote:

For now I’m not looking for the fastest or most efficient parser, maybe
just the easiest one.

If you want easy, Treetop wins in that regard by a long shot versus
racc, ruby-yacc, or ragel.

Daniel Brumbaugh K.

If you want easy, Treetop wins in that regard by a long shot versus
racc, ruby-yacc, or ragel.

In Ruby Quiz 155 (Ruby Quiz - Parsing JSON (#155)), Grammar performed
pretty well. I haven’t tried it out myself yet, but the code looks
quite readable (see “some” submissions by E Mahurin).

Regards,
Thomas.