Forum: Ruby Grammars (mini-scripting languages)

Michael Judge (bluetrust)
on 2006-02-02 01:57
I'm working on implementing mini-scripting languages for two different
projects, so I'm building a framework that could handle the task
generically.

Does this seem like a good way to approach it?
    1. Store each command's matching regular expression and Ruby code
       within the database. (Sample fixture below.)
    2. For each line in the script:
         Test the line against each command's regular expression.
         If it matches, execute the command's Ruby code using
         instance_eval.

My thoughts are:
    1. Storing executable code in the database is a security problem.
    2. instance_eval is slow.
    3. The alternative (a big if/elsif tree) would span many pages and
       be unwieldy.

Have a better suggestion?


Sample code:
	def compile
		syntax.each do |line|
			command = commands.find { |c| c.match? line }
			raise "Command not found that can process '#{line}'" if command.nil?
			instance_eval command.ruby
		end
	end


Sample commands fixture:

label:
  id: 1
  name: label
  regexp: ^Q (.*)$
  ruby: puts "#{$1}\n"

single-punch:
  id: 2
  name: single-punch
  regexp: ^X-(\d+) (.*)$
  ruby: puts "  o #{$2}\n"

multiple-punch:
  id: 3
  name: multiple-punch
  regexp: ^M-(\d+) (.*)$
  ruby: puts "  [ ] #{$2}\n"

blank-line:
  id: 4
  name: blank-line
  regexp: ^\s*$
  ruby: "# Do nothing"
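
A minimal, self-contained sketch of the loop described above, with the commands held in memory rather than in the database. The names `Command`, `COMMANDS`, `Script`, and `run` are made up for illustration. One assumption differs from the sample: `$1` and `$2` are local to the method in which the match happened, so a stored snippet would not see them if the match runs inside another method. Here the MatchData is kept in an instance variable `@m` that the evaluated code reads instead.

```ruby
# Hypothetical stand-in for a database row: a name, a regexp, and Ruby source.
Command = Struct.new(:name, :regexp, :ruby)

COMMANDS = [
  Command.new("label",          /^Q (.*)$/,       'out << @m[1]'),
  Command.new("single-punch",   /^X-(\d+) (.*)$/, 'out << "  o #{@m[2]}"'),
  Command.new("multiple-punch", /^M-(\d+) (.*)$/, 'out << "  [ ] #{@m[2]}"'),
  Command.new("blank-line",     /^\s*$/,          'nil # do nothing')
]

class Script
  attr_reader :out

  def initialize(source)
    @lines = source.lines.map(&:chomp)
    @out = []
  end

  def run
    @lines.each do |line|
      match = nil
      # find stops at the first command whose regexp matches the line.
      command = COMMANDS.find { |c| match = c.regexp.match(line) }
      raise "Command not found that can process '#{line}'" if command.nil?
      # Keep the MatchData where the evaluated snippet can reach it.
      @m = match
      instance_eval(command.ruby)
    end
    out
  end
end

result = Script.new("Q Favorite color?\nX-1 Red\nM-2 Blue\n\n").run
puts result
```

The security concern from the list above still applies: whatever sits in the `ruby` column runs with full privileges.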
Ross Bamford (Guest)
on 2006-02-02 03:16
(Received via mailing list)
On Thu, 2006-02-02 at 09:57 +0900, Michael Judge wrote:

>     1. Storing executable code in the database is a security problem
>     2. instance_eval is slow
>     3. The alternative (a big if/elsif tree) would span many pages and
> be unwieldy.
>
> Have a better suggestion?

Not necessarily better, but how about something like:

	class Commands
	  def Q(args)
	    puts args
	  end

	  def X(args)
	    if args =~ /^(\d+) (.*)$/
	      puts "  o #{$2}"
	    end
	  end

	  def M(args)
	    if args =~ /^(\d+) (.*)$/
	      puts "  [ ] #{$2}"
	    end
	  end

	  def dispatch(line)
	    if line =~ /([QXM])-?(.*)/
	      send($1.intern, $2)
	    else
	      raise "Invalid input: #{line}"
	    end
	  end
	end

	s = <<EOS
	Q Just a label
	X-23 Single punched
	M-11 Multi punched
	J-12 Bad input
	EOS

	cmds = Commands.new
	s.each_line { |c| cmds.dispatch(c) }

(Obviously I guessed a bit with the input format).
Output:

	 Just a label
	  o Single punched
	  [ ] Multi punched
	-:22:in `dispatch': Invalid input: J-12 Bad input (RuntimeError)
	        from -:35
	        from -:35
Michael Judge (bluetrust)
on 2006-02-02 05:17
Ross Bamford wrote:
> On Thu, 2006-02-02 at 09:57 +0900, Michael Judge wrote:
>
>>     1. Storing executable code in the database is a security problem
>>     2. instance_eval is slow
>>     3. The alternative (a big if/elsif tree) would span many pages and
>> be unwieldy.
>>
>> Have a better suggestion?
>
> Not necessarily better, but how about something like:
>
> 	class Commands
>         [snip]
> 	  def dispatch(line)
> 	    if line =~ /([QXM])-?(.*)/
> 	      send($1.intern, $2)
> 	    else
> 	      raise "Invalid input: #{line}"
> 	    end
> 	  end
> 	end

That's neat, Ross.  I wasn't familiar with the send command.  Looks like
the consequence is that the grammar has to fit an easy regular
expression or you'd be duplicating it in the match and again in the
method definition... Well, that's not necessarily a bad thing.
Consistency is good too.
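
For anyone else who has not met `send`: it calls a method whose name is only known at run time, given as a string or symbol. A small sketch (the `Greeter` class is made up for illustration):

```ruby
class Greeter
  def hello(name)
    "Hello, #{name}"
  end
end

g = Greeter.new
method_name = "hello"                   # could come from a script line

direct  = g.hello("Ross")
dynamic = g.send(method_name, "Ross")   # same call, name chosen at run time

# respond_to? lets you check before dispatching instead of rescuing.
known   = g.respond_to?(:hello)
unknown = g.respond_to?(:goodbye)
```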

I need to think about this.

What do programmers normally do when they have a case statement that's
30 or more items long?  Previously I've just left it as a case statement
and spent the life of the project ticked at it.
Ross Bamford (Guest)
on 2006-02-02 09:46
(Received via mailing list)
On Thu, 2006-02-02 at 13:17 +0900, Michael Judge wrote:
> > Not necessarily better, but how about something like:
> > 	end
>
> That's neat, Ross.  I wasn't familiar with the send command.  Looks like
> the consequence is that the grammar has to fit an easy regular
> expression or you'd be duplicating it in the match and again in the
> method definition... Well, that's not necessarily a bad thing.
> Consistency is good too.
>

Agreed, but it did bug me a bit, too :) Depending on the actual input
format you could optimise that away though I think, e.g.

	class Commands
	  def M(args)
	    # Notice that the regexp here is now responsible
	    # for handling the '-' after the initial letter.
	    if args =~ /^-(\d+) (.*)$/
	      puts "  [ ] #{$2}"
	    end
	  end

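	  def dispatch(line)
	    # slice! modifies its receiver, so work on a copy and keep
	    # the original line intact for the error message.
	    args = line.dup
	    begin
	      send(args.slice!(0,1), args)
	    rescue NoMethodError
	      raise "Invalid input: #{line}"
	    end
	  end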
	end

That way you're doing only one match per dispatch, and validating
implicitly (Ruby will raise NoMethodError if the command is bad). Since
we're forcing only a single letter, it shouldn't be possible for people
to input e.g. 'exit-666 1' or something to breach security.
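
One caveat worth checking here: `send` also reaches the private methods every object inherits from Kernel, and at least one of those has a single-letter name (`p`). A sketch of a stricter dispatcher that only accepts names from an explicit list. The `SafeCommands` class and its `ALLOWED` constant are assumptions for illustration, and the commands return strings rather than printing so they are easy to check.

```ruby
class SafeCommands
  ALLOWED = %w[Q X M].freeze   # hypothetical list of permitted command letters

  def dispatch(line)
    name = line[0, 1]
    args = line[1..-1].to_s
    # Refuse anything not on the list before it ever reaches send.
    raise ArgumentError, "Invalid input: #{line}" unless ALLOWED.include?(name)
    send(name, args)
  end

  private

  def Q(args)
    args.strip
  end

  def X(args)
    "  o #{$2}" if args =~ /^-(\d+) (.*)$/
  end

  def M(args)
    "  [ ] #{$2}" if args =~ /^-(\d+) (.*)$/
  end
end
```

With this in place a line such as 'p-1 x' is rejected up front instead of being handed to Kernel's `p`.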

A win with this approach I think is that it keeps everything where it
should be, i.e. the commands themselves are responsible for processing
their arguments, however they see fit. Also, you can easily add new
commands at runtime, simply by defining a new method. There's no
'command registry' anywhere.
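
A rough sketch of what adding commands at runtime could look like. The class name `PluggableCommands` and the `N` and `H` commands are invented for the example, and the commands return strings so the result is easy to inspect.

```ruby
class PluggableCommands
  def dispatch(line)
    rest = line.dup
    send(rest.slice!(0, 1), rest)
  end
end

cmds = PluggableCommands.new   # created before the new commands exist

# Later, perhaps from a plugin file: reopen the class and add a command.
class PluggableCommands
  def N(args)
    "  note: #{args.strip}"
  end
end

# Or add one programmatically with define_method.
PluggableCommands.class_eval do
  define_method(:H) { |args| "== #{args.strip} ==" }
end

note    = cmds.dispatch("N remember this")
heading = cmds.dispatch("H Section one")
```

The existing `cmds` object picks up both commands because method lookup happens at call time.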

One other change I'd make to my previous post would be to make the
command methods private.
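
A short sketch of that change, assuming the same single-letter dispatch (the class name `PrivateCommands` is made up). `send` ignores visibility, so dispatch keeps working while direct calls from outside are refused.

```ruby
class PrivateCommands
  def dispatch(line)
    rest = line.dup
    send(rest.slice!(0, 1), rest)   # send can call private methods
  end

  private

  def Q(args)
    args.strip
  end
end

cmds = PrivateCommands.new
via_dispatch = cmds.dispatch("Q Just a label")

# Calling the command directly from outside now raises NoMethodError.
direct_call_blocked =
  begin
    cmds.Q(" Just a label")
    false
  rescue NoMethodError
    true
  end
```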

> What do programmers normally do when they have a case statement that's
> 30 or more items long?  Previously I've just left it as a case statement
> and spent the life of the project ticked at it.
>

Ordinarily I think I'd consider it a code smell (or maybe a "design
smell"?). Maybe I'd fix it, maybe not, but like you I'd at least _want_
to :).
Alec Ross (Guest)
on 2006-02-02 11:24
(Received via mailing list)
In message <2548bfcd61b269091ed82e31994ade42@ruby-forum.com>, Michael
Judge <mjudge@surveycomplete.com> writes
>What do programmers normally do when they have a case statement that's
>30 or more items long?  Previously I've just left it as a case statement
>and spent the life of the project ticked at it.
>

My inclination is generally to drive the execution by table lookup.
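
As a sketch of what table lookup might look like for the grammar in this thread (the `TABLE` constant and `process` method are names invented for the example, and the handlers return strings instead of printing):

```ruby
# Each entry pairs a pattern with the code that handles a matching line.
TABLE = [
  [/^Q (.*)$/,       ->(m) { m[1] }],
  [/^X-(\d+) (.*)$/, ->(m) { "  o #{m[2]}" }],
  [/^M-(\d+) (.*)$/, ->(m) { "  [ ] #{m[2]}" }],
  [/^\s*$/,          ->(_m) { nil }]
].freeze

def process(line)
  TABLE.each do |pattern, handler|
    match = pattern.match(line)
    return handler.call(match) if match
  end
  raise ArgumentError, "Invalid input: #{line}"
end
```

Adding a command means adding a row; the lookup code itself does not grow the way a case statement would.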
Harley Pebley (Guest)
on 2006-02-06 18:28
(Received via mailing list)
>What do programmers normally do when they have a case statement that's
>30 or more items long?  Previously I've just left it as a case statement
>and spent the life of the project ticked at it.

I don't let them in the code in the first place. A situation that would
need a case/switch that's more than about 5 +/- 2 items long gets
redesigned during initial implementation.

When I take over a code-base that has something like that in it, it gets
redesigned the first time I have to touch that case statement.

Don't live with broken windows.

Regards,
Harley Pebley