Formal and stochastic grammars in Ruby

Hello,

Given a string such as
“gcgggcgggccgcggaaattacta”, it may happen that the substring “gcgg” is
very important and is repeated in strings of this nature.

I would like to replace all occurrences of “gcgg” with a new symbol, call
it B, so that the initial text reduces to BBgccBaaattacta.
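As a baseline, the substitution step on its own can be sketched with nothing but String#gsub. The helper names compress and expand_symbol below are hypothetical, chosen only for illustration; gsub replaces non-overlapping matches from left to right.

```ruby
# The motif to abbreviate and the symbol standing in for it.
MOTIF = "gcgg"

# Replace every non-overlapping occurrence of the motif with the symbol.
def compress(sequence, motif = MOTIF, symbol = "B")
  sequence.gsub(motif, symbol)
end

# Reverse the substitution, i.e. apply the rule B -> gcgg.
def expand_symbol(text, motif = MOTIF, symbol = "B")
  text.gsub(symbol, motif)
end

original = "gcgggcgggccgcggaaattacta"
reduced  = compress(original)
puts reduced                 # prints BBgccBaaattacta
puts expand_symbol(reduced)  # prints the original string again
```

This only works because the symbol B never occurs in the lowercase input; a real alphabet would need a symbol guaranteed to be unused.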

While I know it is possible to accomplish that using regular expressions
and substitution, my wider aim is to develop a grammar for such a
string, using a finite alphabet, for example {A,G,C,T} in this case, where
A, G, C and T are terminal symbols and B is a nonterminal symbol. Ideally I
would like to have a set of production rules for generating such
strings.
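To make the idea of production rules concrete, here is a minimal sketch of a toy stochastic grammar in plain Ruby. Everything in it is an assumption made for illustration: the start symbol S, the helper nonterminal N, the probabilities, and the method names choose, expand and generate. It is not taken from any existing library.

```ruby
# Hypothetical toy grammar. S is the start symbol, B stands for the motif
# "gcgg", and N emits a single nucleotide. Each nonterminal maps to a list
# of [probability, right-hand side] pairs whose probabilities sum to 1.
RULES = {
  "S" => [[0.5, %w[N S]], [0.3, %w[B S]], [0.2, []]],
  "B" => [[1.0, %w[g c g g]]],
  "N" => [[0.25, %w[a]], [0.25, %w[c]], [0.25, %w[g]], [0.25, %w[t]]]
}.freeze

# Pick one right-hand side for a nonterminal according to its probabilities.
def choose(symbol, rng)
  r = rng.rand
  RULES[symbol].each do |prob, rhs|
    return rhs if r < prob
    r -= prob
  end
  RULES[symbol].last[1] # fallback in case of floating-point rounding
end

# Expand a symbol recursively until only terminal symbols remain.
def expand(symbol, rng)
  return symbol unless RULES.key?(symbol)
  choose(symbol, rng).map { |s| expand(s, rng) }.join
end

# Generate one string from the start symbol; a seed makes it reproducible.
def generate(seed = nil)
  rng = seed ? Random.new(seed) : Random.new
  expand("S", rng)
end

puts generate(42)
```

The rule S -> (empty) with probability 0.2 is what makes generation stop, so strings have about five symbols on average before B is expanded. Learning such rules and probabilities from data is a separate and harder problem that this sketch does not address.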

How easy or relevant is it to use Ruby for deriving stochastic grammars
and for computational linguistics? Are there existing libraries for doing
such things in Ruby?

I am new to the world of grammars, so please bear with me.

Thank you.

George

Well, I don’t know a whole lot about stochastic grammars in Ruby, but
assuming your choice of terminal symbols was not wholly coincidental,
you may find BioRuby [1] helpful.

If nucleotides really were just an arbitrary example and you’re actually
looking for something broader, then someone else here may have a
suggestion.

[1] BioRuby: http://bioruby.org/

Hi, thanks for the reply. BioRuby unfortunately does not implement
stochastic grammars, or any grammars, yet. While I used the example of
nucleotide sequences, I am actually looking for something broader,
though the nucleotide sequence problem would be a good starting point.