Boyer-Moore string search algorithm in ruby


#1

I came across this post:
http://www.arnebrasseur.net/2007/02/26/boyer-moore-string-search-algorithm-in-ruby/en/

Does anyone know if this is already part of Ruby open source library?
I did not find anything on Google. TIA.


#2

Hi,

At Sun, 22 Jul 2007 12:25:01 +0900,
removed_email_address@domain.invalid wrote in [ruby-talk:261195]:

Does anyone know if this is already part of Ruby open source library?
I did not find anything on Google. TIA.

String#index uses Karp-Rabin algorithm, not Boyer-Moore.


#3

Hi,

In message “Re: Boyer-Moore string search algorithm in ruby”
on Sun, 22 Jul 2007 18:21:18 +0900, “Nobuyoshi N.”
removed_email_address@domain.invalid writes:

|removed_email_address@domain.invalid wrote in [ruby-talk:261195]:
|> Does anyone know if this is already part of Ruby open source library?
|> I did not find anything on Google. TIA.
|
|String#index uses Karp-Rabin algorithm, not Boyer-Moore.

1.8 regex engine uses modified Boyer-Moore search algorithm.

          matz.

#4

In message
removed_email_address@domain.invalid,
“Nobuyoshi N.” wr
ites:

At Sun, 22 Jul 2007 12:25:01 +0900,
removed_email_address@domain.invalid wrote in [ruby-talk:261195]:

Does anyone know if this is already part of Ruby open source library?
I did not find anything on Google. TIA.

String#index uses Karp-Rabin algorithm, not Boyer-Moore.

That strikes me as secondary; presumably, the goal is to implement it in
Ruby
code, not to determine what algorithm the language internals use, no?

-s


#5

On Jul 22, 2007, at 1:04 PM, Peter S. wrote:

String#index uses Karp-Rabin algorithm, not Boyer-Moore.

That strikes me as secondary; presumably, the goal is to implement
it in Ruby
code, not to determine what algorithm the language internals use, no?

-s

Perhaps if you look into the code for one of those functions, you
might find that it is done in Ruby, you might not.


#6

In message removed_email_address@domain.invalid, John J.
writes:

Perhaps if you look into the code for one of those functions, you
might find that it is done in Ruby, you might not.

True.

Normally, when someone asks for an algorithm implemented in a language,
a library-style implementation will be a poor fit, as most library
implementations are tuned for speed rather than readability – perhaps
less true in Ruby, I admit.

-s


#7

Hey all, I realize this thread is a few years old, but I just wrote a
Ruby library for doing Boyer-Moore string searching which you can find
here:
http://www.xcombinator.com/2010/10/27/boyer-moore-string-search-algorithm-in-ruby/
.

It’s basically a port of the current wikipedia c-code, but it also
supports searching token arrays and the tokens can be regular
expressions. For instance:

Usage: BoyerMoore.search(haystack, needle) # returns index of needle
or nil

Examples:

BoyerMoore.search("ANPANMAN", "ANP")   # => 0
BoyerMoore.search("ANPANMAN", "ANPXX") # => nil
BoyerMoore.search(["<b>", "hi", "</b>"], ["hi"])         # => 1
BoyerMoore.search(["bam", "foo", "bar"], ["foo", "bar"]) # => 1
BoyerMoore.search(["bam", "bar", "baz"], ["foo"])        # => nil
BoyerMoore.search(["Sing", "99", "Luftballon"], [/\d+/]) == 1
BoyerMoore.search(["Nate M.", "5 Pine Street", "Los A.",

“CA”, “90210”], [/^\w{2}$/, /^\d{5}$/]) == 3

The regexp matching is fairly slow, so be wary of using it.

Hope this helps someone!

Nate