Forum: Ruby Boyer-Moore string search algorithm in ruby

Announcement (2017-05-07): www.ruby-forum.com is now read-only since I unfortunately do not have the time to support and maintain the forum any more. Please see rubyonrails.org/community and ruby-lang.org/en/community for other Rails- und Ruby-related community platforms.
E3640176765dae2465d33ef6eb114691?d=identicon&s=25 bcparanj@gmail.com (Guest)
on 2007-07-22 05:25
(Received via mailing list)
I came across this post:
http://www.arnebrasseur.net/2007/02/26/boyer-moore...

Does anyone know if this is already part of Ruby open source library?
I did not find anything on Google. TIA.
F1d6cc2b735bfd82c8773172da2aeab9?d=identicon&s=25 Nobuyoshi Nakada (Guest)
on 2007-07-22 11:22
(Received via mailing list)
Hi,

At Sun, 22 Jul 2007 12:25:01 +0900,
bcparanj@gmail.com wrote in [ruby-talk:261195]:
> Does anyone know if this is already part of Ruby open source library?
> I did not find anything on Google. TIA.

String#index uses Karp-Rabin algorithm, not Boyer-Moore.
0ec4920185b657a03edf01fff96b4e9b?d=identicon&s=25 Yukihiro Matsumoto (Guest)
on 2007-07-22 14:30
(Received via mailing list)
Hi,

In message "Re: Boyer-Moore string search algorithm in ruby"
    on Sun, 22 Jul 2007 18:21:18 +0900, "Nobuyoshi Nakada"
<nobu@ruby-lang.org> writes:

|bcparanj@gmail.com wrote in [ruby-talk:261195]:
|> Does anyone know if this is already part of Ruby open source library?
|> I did not find anything on Google. TIA.
|
|String#index uses Karp-Rabin algorithm, not Boyer-Moore.

1.8 regex engine uses modified Boyer-Moore search algorithm.

              matz.
B8cfd5ec0f88bf5b5f2eedda7d1a0746?d=identicon&s=25 Peter Seebach (Guest)
on 2007-07-22 20:06
(Received via mailing list)
In message
<cebd6fd10707220221g67c7c969hd7c022fa08cd091e@mail.gmail.com>,
"Nobuyoshi Nakada" wr
ites:
>At Sun, 22 Jul 2007 12:25:01 +0900,
>bcparanj@gmail.com wrote in [ruby-talk:261195]:
>> Does anyone know if this is already part of Ruby open source library?
>> I did not find anything on Google. TIA.

>String#index uses Karp-Rabin algorithm, not Boyer-Moore.

That strikes me as secondary; presumably, the goal is to implement it in
Ruby
code, not to determine what algorithm the language internals use, no?

-s
1c0cd550766a3ee3e4a9c495926e4603?d=identicon&s=25 John Joyce (Guest)
on 2007-07-23 05:04
(Received via mailing list)
On Jul 22, 2007, at 1:04 PM, Peter Seebach wrote:

>> String#index uses Karp-Rabin algorithm, not Boyer-Moore.
>
> That strikes me as secondary; presumably, the goal is to implement
> it in Ruby
> code, not to determine what algorithm the language internals use, no?
>
> -s
>
Perhaps if you look into the code for one of those functions, you
might find that it is done in Ruby, you might not.
B8cfd5ec0f88bf5b5f2eedda7d1a0746?d=identicon&s=25 Peter Seebach (Guest)
on 2007-07-23 05:08
(Received via mailing list)
In message <8E18FA11-568B-4331-9BC0-2FB4B6BFC687@gmail.com>, John Joyce
writes:
>Perhaps if you look into the code for one of those functions, you
>might find that it is done in Ruby, you might not.

True.

Normally, when someone asks for an algorithm implemented in a language,
a library-style implementation will be a poor fit, as most library
implementations are tuned for speed rather than readability -- perhaps
less true in Ruby, I admit.

-s
E14633fe004ce897045cfbdea278c1a7?d=identicon&s=25 Nate Murray (jashmenn)
on 2010-10-27 16:16
Hey all, I realize this thread is a few years old, but I just wrote a
Ruby library for doing Boyer-Moore string searching which you can find
here:
http://www.xcombinator.com/2010/10/27/boyer-moore-...
.

It's basically a port of the current wikipedia c-code, but it also
supports searching token arrays and the tokens can be regular
expressions. For instance:

Usage: BoyerMoore.search(haystack, needle)   # returns index of needle
or nil

Examples:

    BoyerMoore.search("ANPANMAN", "ANP")   # => 0
    BoyerMoore.search("ANPANMAN", "ANPXX") # => nil
    BoyerMoore.search(["<b>", "hi", "</b>"], ["hi"])         # => 1
    BoyerMoore.search(["bam", "foo", "bar"], ["foo", "bar"]) # => 1
    BoyerMoore.search(["bam", "bar", "baz"], ["foo"])        # => nil
    BoyerMoore.search(["Sing", "99", "Luftballon"], [/\d+/]) == 1
    BoyerMoore.search(["Nate Murray", "5 Pine Street", "Los Angeles",
"CA", "90210"], [/^\w{2}$/, /^\d{5}$/]) == 3

The regexp matching is fairly slow, so be wary of using it.

Hope this helps someone!

Nate
This topic is locked and can not be replied to.