Forum: Ferret
Ferret with Rdig - Indexing between tags

Sébastien Mizrahi (slum)
on 2008-08-01 09:38
Hi,

I'm using Ferret and RDig, and I'm trying to index only the HTML that sits
between two marker comments, so far without success.
I just want to index content like this:
<!-- startToIndex -->
Here's my HTML code which I want to index
<!-- endToIndex -->

My current configuration is the following:

 cfg.content_extraction = OpenStruct.new(

    # HPRICOT configuration
    # hpricot is the html parsing lib used by RDig. See
    # http://code.whytheluckystiff.net/hpricot for usage information.
    # Any code blocks given for content selection will receive an Hpricot instance
    # containing the full page content when called.
    :hpricot      => OpenStruct.new(
      # css selector for the element containing the page title
      :title_tag_selector => 'title',
      # might also be a proc returning either an element or a string:
      # :title_tag_selector => lambda { |hpricot_doc| ... }
      :content_tag_selector => 'body'
      # might also be a proc returning either an element or a string:
      # :content_tag_selector => lambda { |hpricot_doc| ... }
    )
  )
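
Since the comments in the configuration say that :content_tag_selector may also be a proc returning a string, one direction that might work is to hand it a lambda that cuts out the text between the two marker comments. The following is only an untested sketch: the helper name extract_marked_content is made up for illustration, and it assumes that the document object RDig passes in renders its full markup through to_s.

```ruby
# Hypothetical helper (not part of RDig or Hpricot): returns the content of
# every region wrapped in the startToIndex/endToIndex marker comments,
# joined by single spaces. Returns an empty string when no region is found.
def extract_marked_content(html)
  # Non-greedy match across newlines; whitespace inside the comments is optional.
  marked_region = /<!--\s*startToIndex\s*-->(.*?)<!--\s*endToIndex\s*-->/m
  html.scan(marked_region).flatten.map(&:strip).join(' ')
end

# Assumption: the object passed to the proc responds to #to_s with the
# full page markup, so the markers are still present in the string.
content_selector = lambda { |hpricot_doc| extract_marked_content(hpricot_doc.to_s) }
```

The lambda would then take the place of 'body' as the value of :content_tag_selector. Note that the returned string still contains any HTML tags found between the markers, so further tag stripping may be needed depending on how RDig treats string results.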

Any help would be appreciated :)

Best regards,