Hello, this is not exactly a ruby problem, i hope people will not consider this spamming. does anyone know of any reference or an algorithm to build a regular expression (a finite automaton, in general) that would match a given set of strings? i of course mean a regexp that would match just the given strings, and not any other. so solutions like /^.*$/ are not acceptable ;-) thanks konstantin

on 2005-12-12 21:34

on 2005-12-12 21:37

ako... schrieb: > >thanks >konstantin > > > > > nice challenge, reminds me a bit of the 4th rubyquiz :>

on 2005-12-12 21:52

> ako... schrieb: > >does anyone know of any reference or an algorithm to build a regular > >expression (a finite automaton, in general) that would match a given > >set of strings? i of course mean a regexp that would match just the > >given strings, and not any other. so solutions like /^.*$/ are not > >acceptable ;-) This is essentially an information compression problem. There's always a trivial solution: s = %w{find only these strings} r = Regexp.new(s.map {|w| '\A' + w + '\Z'}.join('|')) The _real_ problem is finding the shortest possible regexp. ;-)

on 2005-12-12 22:04

i suspect that from your trivial solution we can arrive at the final solution by just performing a DFA minimization. the same thing that lex and yacc do when they generate code. so i guess i have answered my own question. thanks for helping me. your trivial solution just made my brain work. konstantin

on 2005-12-13 14:19

ako... wrote: > i suspect that from your trivial solution we can arrive at the final > solution by just performing a DFA minimization. the same thing that > lex and yacc do when they generate code. so i guess i have answered > my own question. thanks for helping me. your trivial solution just > made my brain work. Another option is to build up a tree and create the RX from that. I once wrote that: 14:12:23 [c]: ( echo foo; echo bar; echo band ) | create-rx.rb (?:ba(?:nd|r)|foo) Kind regards robert