Forum: Ruby Parsing text with regular expression

Sebastian probst Eide (sebastianpe)
on 2007-04-29 22:42
Hi
I am writing a class that parses text. It checks each word and counts
how many times it occurs in the text. It also checks for 'special'
words, meaning words that are capitalized, all upper case, or in mixed
case, and adds a flag to those words. Words that are not special must
fulfill a certain length requirement. The information is
stored in a hash like this:

{'word' => {:count => 1, :special => false}, 'other_word' => {:count => 3, :special => true}}
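
A minimal sketch of how a hash of this shape could be built. The method name count_words, the min_length parameter, and the rule that any upper-case letter makes a word 'special' are assumptions for illustration, not the actual class described above.

```ruby
# Hypothetical helper: counts words and flags "special" ones.
# Assumption: a word is special if it contains at least one upper-case letter.
def count_words(text, min_length = 3)
  counts = {}
  text.scan(/\w{2,}/) do |word|
    special = (word =~ /[A-Z]/) ? true : false
    # Non-special words must meet the (assumed) minimum length.
    next if !special && word.length < min_length
    entry = (counts[word] ||= { :count => 0, :special => special })
    entry[:count] += 1
  end
  counts
end

result = count_words("The cat saw NASA and it ran, the cat")
```

Here 'it' is dropped because it is shorter than the assumed minimum length, while 'The' and 'NASA' are kept and flagged as special.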

Everything is working fine so far. The thing I am struggling to
implement though is the following:
I want to be able to check the context the 'special' words are in, to see
whether a capitalized special word is perhaps capitalized only because it is
the first word in a new sentence, or something like that.

I thought I could check by looking for something like this:

text =~ /[[:punct:]]\s?WORD_I_AM_LOOKING_FOR/
and if I got something else than 0 as a result it would mean that the
word is in the beginning of a sentence. But how do I insert a variable
into the regular expression? Or is there a different much cleverer way
to do this sort of check?

Currently I am scanning for each word like this:

_inn.scan(/\w{2,}[-\w]?/i) do |word|
  ...
end

and then doing the checking of the words inside that iterator.
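
One possible way to do the context check inside such an iterator, sketched under assumptions: String#scan sets the last-match data for each word when called with a block, so the text before the word is available via Regexp.last_match.pre_match. The method name sentence_initial_words and the choice of '.', '!' and '?' as sentence terminators are hypothetical.

```ruby
# Hypothetical helper: returns the words that start a sentence.
# Assumption: a sentence starts at the beginning of the text or after
# '.', '!' or '?' followed by whitespace.
def sentence_initial_words(text)
  found = []
  text.scan(/\w{2,}/) do |word|
    # Everything in the text that precedes the current match.
    before = Regexp.last_match.pre_match
    found << word if before =~ /(?:\A|[.!?]\s+)\z/
  end
  found
end

starts = sentence_initial_words("Hello world. This is Ruby! Yes")
# starts is ["Hello", "This", "Yes"]
```

Note that pre_match builds a new string for every word, so for very long texts it may be worth tracking offsets instead.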

Hope you have understood my problem and that you can point me in the
right direction.

best regards
Sebastian
Timothy Hunter (Guest)
on 2007-04-29 22:52
(Received via mailing list)
Sebastian probst Eide wrote:
> I thought I could check by looking for something like this:
>
> text =~ /[[:punct:]]\s?WORD_I_AM_LOOKING_FOR/
> and if I got something else than 0 as a result it would mean that the
> word is in the beginning of a sentence. But how do I insert a variable
> into the regular expression?
Use #{}, like this

word = "hello"

text =~ /[[:punct:]]\s?#{word}/

"word" can be any regular expression.
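
A short sketch of the interpolation in use. Two details are worth keeping in mind: because the interpolated string is treated as regular expression source, Regexp.escape is one way to make a word match literally, and =~ returns the index of the match (which can be 0) or nil when nothing matches, so the result should be compared against nil rather than 0. The sample text and variable names are made up for illustration.

```ruby
text = "It rained. Then the sun came out."
word = "Then"

# Regexp.escape makes the word match literally, even if it contains
# characters such as "." or "+" that have a special meaning in a regexp.
pattern = /[[:punct:]]\s?#{Regexp.escape(word)}/

# Index of the punctuation character that precedes the word, or nil.
position = text =~ pattern

# "sun" is not preceded by punctuation, so this is nil.
no_match = text =~ /[[:punct:]]\s?#{Regexp.escape("sun")}/

puts "'#{word}' follows punctuation at index #{position}" unless position.nil?
```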
Sebastian probst Eide (sebastianpe)
on 2007-04-29 22:54
Timothy Hunter wrote:
> Sebastian probst Eide wrote:
>> I thought I could check by looking for something like this:
>>
>> text =~ /[[:punct:]]\s?WORD_I_AM_LOOKING_FOR/
>> and if I got something else than 0 as a result it would mean that the
>> word is in the beginning of a sentence. But how do I insert a variable
>> into the regular expression?
> Use #{}, like this
>
> word = "hello"
>
> text =~ /[[:punct:]]\s?#{word}/
>
> "word" can be any regular expression.

Huh... that was the first thing I tried... must have done something else
wrong too in the same expression because it didn't work... I'll try
again.
Thanks Timothy

Sebastian