Forum: Ruby matches -> regexp ?

Announcement (2017-05-07): www.ruby-forum.com is now read-only since I unfortunately do not have the time to support and maintain the forum any more. Please see rubyonrails.org/community and ruby-lang.org/en/community for other Rails- und Ruby-related community platforms.
A70b7da5a3a712e800100e61ef8d8917?d=identicon&s=25 ako... (Guest)
on 2005-12-12 21:34
(Received via mailing list)
Hello,

this is not exactly a ruby problem, i hope people will not consider
this spamming.

does anyone know of any reference or an algorithm to build a regular
expression (a finite automaton, in general) that would match a given
set of strings? i of course mean a regexp that would match just the
given strings, and not any other. so solutions like /^.*$/ are not
acceptable ;-)

thanks
konstantin
8aab717743d4f58a4453adb8b6855e1e?d=identicon&s=25 Robert Retzbach (Guest)
on 2005-12-12 21:37
(Received via mailing list)
ako... schrieb:

>
>thanks
>konstantin
>
>
>
>
>
nice challenge, reminds me a bit of the 4th rubyquiz :>
036a1b88dafaab8ffd73a8b0a74b5b38?d=identicon&s=25 Edward Faulkner (Guest)
on 2005-12-12 21:52
(Received via mailing list)
> ako... schrieb:
> >does anyone know of any reference or an algorithm to build a regular
> >expression (a finite automaton, in general) that would match a given
> >set of strings? i of course mean a regexp that would match just the
> >given strings, and not any other. so solutions like /^.*$/ are not
> >acceptable ;-)

This is essentially an information compression problem.  There's
always a trivial solution:

s = %w{find only these strings}
r = Regexp.new(s.map {|w| '\A' + w + '\Z'}.join('|'))

The _real_ problem is finding the shortest possible regexp. ;-)
A70b7da5a3a712e800100e61ef8d8917?d=identicon&s=25 ako... (Guest)
on 2005-12-12 21:58
(Received via mailing list)
well, right, that is what i mean.
A70b7da5a3a712e800100e61ef8d8917?d=identicon&s=25 ako... (Guest)
on 2005-12-12 22:04
(Received via mailing list)
i suspect that from your trivial solution we can arrive at the final
solution by just performing a DFA minimization. the same thing that lex
and yacc do when they generate code. so i guess i have answered my own
question. thanks for helping me. your trivial solution just made my
brain work.

konstantin
5befe95e6648daec3dd5728cd36602d0?d=identicon&s=25 Robert Klemme (Guest)
on 2005-12-13 14:19
(Received via mailing list)
ako... wrote:
> i suspect that from your trivial solution we can arrive at the final
> solution by just performing a DFA minimization. the same thing that
> lex and yacc do when they generate code. so i guess i have answered
> my own question. thanks for helping me. your trivial solution just
> made my brain work.

Another option is to build up a tree and create the RX from that.  I
once
wrote that:

14:12:23 [c]: ( echo foo; echo bar; echo band ) | create-rx.rb
(?:ba(?:nd|r)|foo)

Kind regards

    robert
This topic is locked and can not be replied to.