Forum: Ruby Writing parsers?

Paatsch, Bernd (Guest)
on 2006-02-14 00:59
(Received via mailing list)
Hello,

I have been assigned the great task of writing a parser, and the files
to parse look far from trivial. Does anybody know of a website that
explains the steps involved in writing a parser and the pitfalls to
avoid? Any suggestions or help are appreciated.

Thanks,
Bernd
David Vallner (Guest)
on 2006-02-14 01:47
(Received via mailing list)
On Tuesday 14 February 2006 00:59, Paatsch, Bernd wrote:
> Hello,
>
> I got this great task assigned to write a parser and looking at the files
> to parse is not very trivial. Does anybody know where to find a website
> that would explain steps and pitfalls to avoid writing a parser?
> Any suggestion/help is appreciated.
>
> Thanks,
> Bernd

http://epaperpress.com/lexandyacc/ seems to be a useful resource,
provided you already know some theory behind formal grammars and such.
The two tools (lex and yacc) and their derivatives are pretty much the
open source standard for writing parsers. I believe there are Ruby
bindings / variants of both.

ANTLR is also somewhat used, but you're probably looking at Java there.

David Vallner
Doug H (Guest)
on 2006-02-14 02:03
(Received via mailing list)
http://www.google.com/search?hl=en&q=parser

:) No, seriously, check out ANTLR, unless you are supposed to write the
parser from scratch.
If you want to do it in Ruby, there are options like:
http://split-s.blogspot.com/2005/12/antlr-for-ruby.html
http://www.zenspider.com/ZSS/Products/CocoR/
Timothy Goddard (Guest)
on 2006-02-14 15:13
(Received via mailing list)
I just whipped this up in a bit of free time. It may be a decent
starting point for a pure Ruby parser. Note that there is no lookahead:
a rule is reduced as soon as its tokens appear on top of the stack.

class ParseError < StandardError; end

class Parser

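  # Class variables are shared by Parser and all of its subclasses, so
  # every subclass adds its tokens and rules to the same tables.
  @@reductions = {}
  @@reduction_procs = {}
  @@tokens = {}
  @@token_values = {}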

  # Parse either a string or an IO object (read all at once) using the
  # rules defined for this parser.
  def parse(input)
    stack = []
    value_stack = []
    text = input.is_a?(IO) ? input.read : input.dup
    loop do
      token, value = retrieve_token(text)
      stack << token
      value_stack << value
      reduce_stack(stack, value_stack)
      if text.length == 0
        if stack.length == 1
          return stack[0], value_stack[0]
        else
          raise ParseError, 'Stack failed to reduce'
        end
      end
    end
  end
  protected

  # Retrieve a single token from the input text and return an array of
  # it and its value.
  def retrieve_token(text)
    @@tokens.each do |regexp, token|
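      if (md = text.match(regexp))
        # The pattern is anchored with \A, so this strips only the
        # matched prefix from the remaining input.
        text.sub!(regexp, '')
        value_proc = @@token_values[token]
        return [token, value_proc ? value_proc.call(md.to_s) : nil]
      end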
    end
    raise ParseError, "Invalid token in input near #{text}"
  end

  # Compare the stack to reduction rules to reduce any matches found
  def reduce_stack(stack, value_stack)
    loop do
      matched = false
      @@reductions.each do |tokens, result|
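        # Skip rules longer than the stack; a negative start index would
        # otherwise count from the end of the array.
        next if tokens.length > stack.length
        start_pos = stack.length - tokens.length
        if tokens == stack[start_pos, tokens.length]
          reduction = @@reduction_procs[tokens]
          values = value_stack[start_pos, tokens.length]
          value = reduction ? reduction.call(values) : nil
          stack[start_pos, tokens.length] = [result]
          # Wrap the value in an array so that an array-valued result is
          # stored as one element rather than spliced into the stack.
          value_stack[start_pos, tokens.length] = [value]
          matched = true
          break
        end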
      end
      return unless matched
    end
  end

  def self.token(regexp, token, &block)
    @@tokens[Regexp.new('\A' + regexp.to_s)] = token
    @@token_values[token] = block
  end

  def self.rule(*tokens, &block)
    final = tokens.pop
    tokens += final.keys
    result = final.values.first
    @@reductions[tokens] = result
    @@reduction_procs[tokens] = block
  end
end

class TestParser < Parser
  token(/foo/i, :foo) do |s|
    s.upcase
  end
  token(/bar/i, :bar) do |s|
    s.downcase
  end
  token(/mega/i, :mega) do |_s|
    3
  end
  rule :foo, :bar => :foobar do |foo, bar|
    foo + bar
  end
  rule :mega, :foobar => :megafoobar do |mega, foobar|
    foobar * mega
  end
end
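
To make the "no lookahead" caveat concrete, here is a minimal standalone
sketch of the same shift-and-reduce loop that records every step it takes.
TRACE_TOKENS, TRACE_RULES and trace are names assumed for this illustration
only, not part of the class above, and token values are left out for
brevity.

```ruby
# Illustrative only: TRACE_TOKENS, TRACE_RULES and trace are hypothetical
# names, not part of the Parser class posted above.
TRACE_TOKENS = { /\Afoo/i => :foo, /\Abar/i => :bar, /\Amega/i => :mega }.freeze
TRACE_RULES  = { [:foo, :bar] => :foobar, [:mega, :foobar] => :megafoobar }.freeze

# Returns the list of [action, stack] steps taken while consuming input.
def trace(input, tokens = TRACE_TOKENS, rules = TRACE_RULES)
  text = input.dup
  stack = []
  steps = []
  until text.empty?
    # Shift: take the first token pattern that matches the start of the text.
    regexp, token = tokens.find { |re, _| text.match?(re) }
    raise ArgumentError, "Invalid token near #{text}" unless token
    text.sub!(regexp, '')
    stack << token
    steps << [:shift, stack.dup]
    # Reduce: with no lookahead, a rule fires as soon as its tokens are
    # on top of the stack, before the next token has been seen.
    loop do
      rhs, lhs = rules.find { |pattern, _| stack.last(pattern.length) == pattern }
      break unless rhs
      stack.pop(rhs.length)
      stack << lhs
      steps << [:reduce, stack.dup]
    end
  end
  steps
end

trace('megafoobar').each { |action, stack| puts "#{action}: #{stack.inspect}" }
```

With the rules from TestParser the trace ends in a single :megafoobar
symbol. With a rule set such as { [:foo] => :word, [:foo, :bar] => :pair },
the input 'foobar' never reduces to :pair: :foo becomes :word before :bar
has been read, and the final stack is [:word, :bar].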
Robert Klemme (Guest)
on 2006-02-14 15:13
(Received via mailing list)
Paatsch, Bernd wrote:
> Hello,
>
> I got this great task assigned to write a parser and looking at the
> files to parse is not very trivial. Does anybody know where to find a
> website that would explain steps and pitfalls to avoid writing a
> parser?
> Any suggestion/help is appreciated.

http://raa.ruby-lang.org/project/racc/
http://raa.ruby-lang.org/project/ruby-yacc/

    robert
unknown (Guest)
on 2006-02-15 08:52
(Received via mailing list)
In article <1139916679.044875.75620@g47g2000cwa.googlegroups.com>,
Timothy Goddard <interfecus@gmail.com> wrote:
> [parser code snipped]

This is a bit like Grammar:
http://grammar.rubyforge.org/0.5/

Phil
Timothy Goddard (Guest)
on 2006-02-15 11:20
(Received via mailing list)
Grammar looks much more similar to Spirit, a C++ parser framework that
looks really simple to use. It uses a very simple domain-specific
language for writing grammars in C++ code, and it's part of the Boost
libraries. It would be my first choice for a medium-speed parser that
could be used quite easily from Ruby with just a few joining bits of C.
Parsers in the style of YACC or Bison are much faster again, but the
added complexity of defining the grammar probably makes using them a
premature optimisation for most tasks.
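
For a feel of what such an embedded grammar DSL looks like, here is a
minimal Ruby sketch of the combinator idea. Rule, lit, >> and | are names
made up for this illustration; this is not the API of Grammar or Spirit,
only the general style of composing a grammar from ordinary expressions.

```ruby
# Hypothetical mini combinator library, for illustration only.
class Rule
  # The matcher takes (text, pos) and returns the position after a
  # successful match, or nil on failure.
  def initialize(&matcher)
    @matcher = matcher
  end

  def call(text, pos)
    @matcher.call(text, pos)
  end

  # Sequence: this rule followed by other.
  def >>(other)
    Rule.new do |text, pos|
      mid = call(text, pos)
      mid && other.call(text, mid)
    end
  end

  # Ordered choice: try this rule, fall back to other.
  def |(other)
    Rule.new { |text, pos| call(text, pos) || other.call(text, pos) }
  end

  # True when the rule consumes the entire input.
  def match?(text)
    call(text, 0) == text.length
  end
end

# A rule that matches one literal string.
def lit(string)
  Rule.new do |text, pos|
    text[pos, string.length] == string ? pos + string.length : nil
  end
end

# Parentheses are needed because >> binds more tightly than | in Ruby.
GREETING = (lit('foo') | lit('mega')) >> lit('bar')

puts GREETING.match?('foobar')
puts GREETING.match?('megabar')
puts GREETING.match?('barfoo')
```

The grammar is just a Ruby expression, so no separate generator step is
needed; the price is that every rule is tried at run time, which is why
this style tends to be slower than table-driven YACC-style parsers.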