Forum: Ruby Hpricot ri and rdoc documentation

bbiker (Guest)
on 2006-12-25 00:00
(Received via mailing list)
I have installed the Hpricot-0.4 (mswin32) gem, but no ri or rdoc
documentation was installed along with the library.

I have also tried the following gem command, with the same results:
gem install hpricot --ri --rdoc

These are the responses I get from ri and fri:

c:\ruby>ri Hpricot
--------------------------------------------------------- Class: Hpricot
     (no description...)
------------------------------------------------------------------------


c:\ruby>fri Hpricot
--------------------------------------------------------- Class: Hpricot
     (no description...)
------------------------------------------------------------------------

Note: if mechanize (0.6.3) is not installed, fri responds with an error
message essentially stating that it cannot find a file supposedly located
in a folder within the mechanize-0.6.3 folder:

c:\ruby>fri Hpricot
(druby://127.0.0.1:1181) c:/ruby/lib/ruby/gems/1.8/gems/fastri-0.2.1.1/lib/fastri/ri_index.rb:354:in `initialize': No such file or directory - c:/ruby/lib/ruby/gems/1.8/doc/mechanize-0.6.3/ri/Hpricot/cdesc-Hpricot.yaml (Errno::ENOENT)

I really need all the help that I can get to use Hpricot effectively. I
have looked at _why's website, but the specific information I need
is not there, and 3/4 of the terms are unfamiliar to me. Yes, I am
a newbie to HTML and a recent convert to Ruby.

My question is: where and how do I obtain the ri and rdoc documentation
for Hpricot?
And how do I install the documentation so that fri can find it?

Thank you for your assistance
Eric Hodel (Guest)
on 2006-12-25 00:38
(Received via mailing list)
On Dec 24, 2006, at 15:00, bbiker wrote:

> I have gem installed Hpricot-0.4 (mswin32) but no ri documentation  or
> rdoc documentation is installed along the library.

That's because there is none:

$ gem spec hpricot | grep rdoc
has_rdoc: false
rdoc_options: []
extra_rdoc_files:

--
Eric Hodel - drbrain@segment7.net - http://blog.segment7.net

I LIT YOUR GEM ON FIRE!
bbiker (Guest)
on 2006-12-25 02:28
(Received via mailing list)
Eric Hodel wrote:
> That's because there is none:
>
> $ gem spec hpricot | grep rdoc
> has_rdoc: false
> rdoc_options: []
> extra_rdoc_files:
>
Apparently there was (is) rdoc documentation: googling "hpricot + rdoc"
yielded the following link:
www.gemjack.com/gems/hpricot-0.4-mswin32/index.html

I guess I will try to find a gem repository that has the hpricot gem
with documentation.

I would really like to obtain the ri documentation since it is usually
more verbose.

Thanks for your response, Eric
Mat Schaffer (Guest)
on 2006-12-25 04:32
(Received via mailing list)
On 12/24/06, bbiker <renard@nc.rr.com> wrote:
> yielded the following link:
> www.gemjack.com/gems/hpricot-0.4-mswin32/index.html
>
> I guess I will try to find a gem depository that has the hpricot gem
> with documentation.
>
> I would really like to obtain the ri documentation since it is usually
> more verbose.
>
> Thanks for your response, Eric
>

I've often wondered the same thing.  The documentation for hpricot
seems a little sparse and basically limited to the wiki-like docs that
_why set up.  Good hunting!  Let us know if you find out anything
useful.  Or maybe _why can provide some insight.
-Mat
Chris Carter (cdcarter)
on 2006-12-25 04:43
(Received via mailing list)
bbiker, there are no rdocs/ri, just the wiki; the linked rdocs were
generated by gemjack.
Eric Hodel (Guest)
on 2006-12-25 04:51
(Received via mailing list)
On Dec 24, 2006, at 17:28, bbiker wrote:
> yielded the following link:
> www.gemjack.com/gems/hpricot-0.4-mswin32/index.html

has_rdoc is set to false by the author when they've got nothing
helpful in their project.  Forcing rdoc to generate won't get you
much useful information.

> I guess I will try to find a gem depository that has the hpricot gem
> with documentation.

There is no such thing.  RDoc is generated after installation and is
not shipped with gems.

> I would really like to obtain the ri documentation since it is usually
> more verbose.

The best way of doing this is to download the hpricot sources, add
the RDoc to them, then submit patches back.

Anything else will leave you with little other than method names and
parameter names.
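
To make "add the RDoc" concrete: RDoc and ri take their descriptions from the comment block directly above a class, module, or method. The following is only a sketch of that comment shape. The CardText module and its methods are made up for illustration and are not part of Hpricot.

```ruby
# Hypothetical module, used only to show the shape of an RDoc comment.
module CardText
  # Returns +line+ with +marker+ put in front of it.
  #
  # This comment block is what ri would display for the method.
  # Indented lines such as the one below are rendered verbatim:
  #
  #   CardText.mark('8 7 6 2')
  def self.mark(line, marker = 'x ')
    marker + line
  end

  def self.plain(line)
    line
  end
end

puts CardText.mark('8 7 6 2')
```

The second method has no comment above it, so generated documentation for it would be limited to its name and parameter, which is the situation described above.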

--
Eric Hodel - drbrain@segment7.net - http://blog.segment7.net

I LIT YOUR GEM ON FIRE!
Gregory Seidman (Guest)
on 2006-12-25 05:01
(Received via mailing list)
On Mon, Dec 25, 2006 at 12:42:23PM +0900, Chris Carter wrote:
} bbiker, there are no rdocs/ri, just the wiki, the linked rdocs were
} made by gemjack.
[...]

I missed the original post of this thread, but I recommend that the
original poster just ask how to do what s/he is trying to do with Hpricot.
I've been using it with good success for a while, and I've mostly learned
it from playing around in irb. I am willing to help if there is a specific
question.

Ideally someone (maybe even me) will get around to writing comprehensive
documentation for Hpricot. Until then, well, there are a few of us who have
used it enough to answer questions helpfully, including _why himself. He's
been known to take offhand suggestions to heart and implement them.

} Chris Carter
--Greg
bbiker (Guest)
on 2006-12-25 08:25
(Received via mailing list)
Gregory Seidman wrote:

>
Thank you for your offer of assistance

In simple terms, I need to extract text from a <p> ... </p> element.
Yes, I know that I can use "traverse_text". That works fine except when
there is no text at an expected location.

Here is an example of what I mean. I am afraid that the <p> ... </p> is
rather long and that it will get wrapped in the wrong places.

<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:Helvetica">W</span><span
style="font-size:9.0pt;font-family:Helvetica">EST</span><span
style="font-size:11.0pt;font-family:Helvetica"><br />
    <img src="Bermuda%20Bowl%20Final_files/s.gif" border="0"
id="_x0000_i1505" height="11" width="13" />8 7 6 2 <br />
    <img src="Bermuda%20Bowl%20Final_files/h.gif" border="0"
id="_x0000_i1506" height="11" width="13" /><br />
    <img src="Bermuda%20Bowl%20Final_files/d.gif" border="0"
id="_x0000_i1507" height="11" width="13" />K 7 6 4 <br />
    <img src="Bermuda%20Bowl%20Final_files/c.gif" border="0"
id="_x0000_i1508" height="11" width="13" />Q 7 6 3 2
<o:p></o:p></span></p>

Notice that after each <img ... /> element there is some text, except
after the element with the h.gif file. Now, it is perfectly all right to
have no text. The problem is that I need to know which img element is not
followed by text, since it could be in any of the four locations. In fact,
there could be more than one occurrence of no text.

What I would like to do is insert the basename of the src filename,
followed by a space, after each img element.

Thus the text strings would read "s 8 7 6 2", "h ", "d K 7 6 4", "c Q 7 6 3 2"
as opposed to "8 7 6 2", "K 7 6 4", "Q 7 6 3 2".

An alternate way would be to simply insert arbitrary text ('x ') after
each img element. There is no need to extract the basename, since the
s h d c sequence is constant.

This would let traverse_text yield "x 8 7 6 2", "x ", "x K 7 6 4", "x Q 7 6 3 2".
A single character (x) would also work.
Actually, I prefer the alternate way ... Occam's Razor.

I know that I can do both using regexps, but I am trying to learn and
understand hpricot.
I understand that hpricot can modify (edit), remove, and add elements.
It is definitely stronger and faster. It also offers a more generic way
of processing html files.

So in simple terms, the problem is how to insert text after an element.
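
For comparison, the regexp route mentioned above can be sketched in plain Ruby with no gems. This is a sketch under assumptions: the mark_images helper name and the trimmed sample markup (attributes shortened) are made up for illustration, and a regexp like this is brittle on real-world HTML, which is a good reason to prefer a parser-based solution.

```ruby
# Hypothetical helper: put a marker string right after every <img ...> tag
# so that an image with no following text still yields an entry.
def mark_images(html, marker = 'x ')
  html.gsub(/<img\b[^>]*>/i) { |tag| tag + marker }
end

# Trimmed-down version of the markup from the post (attributes shortened).
sample = <<~HTML
  <p><img src="s.gif" />8 7 6 2 <br />
  <img src="h.gif" /><br />
  <img src="d.gif" />K 7 6 4 <br />
  <img src="c.gif" />Q 7 6 3 2</p>
HTML

marked = mark_images(sample)

# Split on line breaks, then drop the remaining tags: one entry per image.
holdings = marked.split(%r{<br\s*/?>}i).map do |chunk|
  chunk.gsub(/<[^>]+>/, '').strip
end

p holdings
# => ["x 8 7 6 2", "x", "x K 7 6 4", "x Q 7 6 3 2"]
```

The empty position now shows up as a bare "x", so the four entries stay aligned with the s, h, d, c order.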
Gregory Seidman (Guest)
on 2006-12-25 15:23
(Received via mailing list)
On Mon, Dec 25, 2006 at 04:24:05PM +0900, bbiker wrote:
[...]
} Thank you for your offer of assistance

Sure.

} In simple term I need to extract text  from a <p ..... /p> element ...
} yes I know that I can use "traverse_text" That works fine except when
} there is no text at an expected location.
[...]
} An alternate way would be to simply insert arbritary text ('x ") after
} the img element... no need to extract the basename. .. the s h d c
} sequence is constant.
}
} This would let traverse_text yield "x 8 7 6 2", "x ", "x K 7 6 4", "x Q
} 7 6 3 2"
} A single character (x) would also work.
} Actually, I prefer the alternate way...Occam's Razor

You got it.

} I know that I can do both using regexp. but I am trying to learn and
} understand hpricot.
} I understand that hpricot can modify (edit), remove, and add elements.
} It is definitely stronger and faster. It also offers a more generic way
} of processing html files.
}
} So in simple terms, the problem is how to insert text after an element.

Here's some code that should help you on your way. You'll notice that I'm
avoiding adding methods to Hpricot classes and instead pulling out the
extra methods into modules and calling extend on the appropriate objects.
Assuming you've loaded a document into the variable doc,
doc.extend(Hpricot::InsertTextExtension).insert_text_after('x ', 'img')
will make the changes you want:

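require 'rubygems'
require 'hpricot'

# Sibling-navigation helpers, mixed into individual elements with extend.
module Hpricot::HandyElementTraversal
  # The node right after this one under the same parent, or nil.
  def next_sibling
    i = index
    i && pchildren[i+1]
  end

  # With an argument: position of elem among this node's children.
  # Without: position of this node among its parent's children.
  def index(elem = nil)
    elem ? children.index(elem) : (pchildren && pchildren.index(self))
  end

  # The parent's children (this node and its siblings), or nil at the root.
  def pchildren
    parent && parent.children
  end
end

# Mixed into a document (or element) to add insert_text_after.
module Hpricot::InsertTextExtension
  # Puts text right after every element matching path.
  def insert_text_after(text, path)
    (self/path).each do |elem|
      elem.extend(Hpricot::HandyElementTraversal)
      sib = elem.next_sibling
      if Hpricot::Text === sib
        # A text node already follows: prepend to it.
        sib.content.insert(0, text)
      else
        # Otherwise insert a new text node after the element.
        elem.pchildren.insert(elem.index + 1, Hpricot::Text.new(text))
      end
    end
    self
  end
end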
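
The per-object extend technique used above can also be tried without Hpricot. What follows is a minimal sketch on a made-up tree: Node, TextNode, and SiblingHelpers are hypothetical stand-ins, not Hpricot classes. It mirrors the same two cases (prepend to a following text node, or insert a new one) and shows that extend adds the helpers to one object only.

```ruby
# Hypothetical stand-ins for a parsed document; these are not Hpricot classes.
class Node
  attr_reader :name, :children
  attr_accessor :parent

  def initialize(name, children = [])
    @name = name
    @children = children
    children.each { |child| child.parent = self }
  end
end

class TextNode
  attr_accessor :content, :parent

  def initialize(content)
    @content = content
  end
end

# Helpers kept in a module so they can be mixed into single objects.
module SiblingHelpers
  # Position of this node among its parent's children, or nil at the root.
  def index
    parent && parent.children.index(self)
  end

  # The node that follows this one, or nil if it is the last child.
  def next_sibling
    i = index
    i && parent.children[i + 1]
  end
end

img_s = Node.new('img')
img_h = Node.new('img')
para  = Node.new('p', [img_s, TextNode.new('8 7 6 2'), img_h])

[img_s, img_h].each do |img|
  img.extend(SiblingHelpers)          # only this object gains the helpers
  sib = img.next_sibling
  if sib.is_a?(TextNode)
    sib.content = 'x ' + sib.content  # prepend to the existing text
  else
    # No text follows: insert a new text node right after the image.
    node = TextNode.new('x ')
    node.parent = para
    para.children.insert(img.index + 1, node)
  end
end

texts = para.children.grep(TextNode).map(&:content)
p texts                                        # => ["x 8 7 6 2", "x "]
p Node.new('br').respond_to?(:next_sibling)    # => false
```

The last line is the point of the design choice: a fresh Node does not respond to next_sibling, so nothing was added to the class itself.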

--Greg
bbiker (Guest)
on 2006-12-25 18:46
(Received via mailing list)
Gregory Seidman wrote:
> } An alternate way would be to simply insert arbritary text ('x ")  after
> } I know that I can do both using regexp. but I am trying to learn and
> Assuming you've loaded a document into the variable doc,
> doc.extend(Hpricot::InsertTextExtension).insert_text_after('x ', 'img')
> will make the changes you want:
>
== CODE removed to save bandwidth

Thank you very much for your assistance. I have just read the code and
it seems to do exactly what I want, though I have not tested it yet. I
must admit that I do not understand the details, but I will study the
code to understand exactly how it works. I already know what it does.

I will let you know how it actually worked out. Let me say that I would
have been unable to come up with your solution.

So Greg once again, THANK YOU VERY MUCH  ... Happy Holidays
bbiker (Guest)
on 2006-12-26 09:01
(Received via mailing list)
bbiker wrote:
>
> Thank you very much for your assistance. I have just read the code and
> it seems to do exactly what I want .. I have not yet tested it out. I
> must admit that I do not understand the details but I will study the
> code to understand exactly the "how" it does .. I know "what" it does.
>
> I will let you know how it actually worked out. Let me say that I would
> have been unable to come up with your solution.
>
> So Greg once again, THANK YOU VERY MUCH  ... Happy Holidays

Greg, this is to let you know that the code worked right out of the
box. Just cut and pasted the code. Invoked the method and it did
exactly what I wanted.

Thank You and Have A Happy New Year
_why (Guest)
on 2006-12-26 18:21
(Received via mailing list)
On Mon, Dec 25, 2006 at 12:50:12PM +0900, Eric Hodel wrote:
> >rdoc"
> >yielded the following link:
> >www.gemjack.com/gems/hpricot-0.4-mswin32/index.html
>
> has_rdoc is set to false by the author when they've got nothing
> helpful in their project.  Forcing rdoc to generate won't get you
> much useful information.

It will with Hpricot.  I have added quite a bit of RDoc within the last
month, but haven't set that flag yet.  It will be set with the next
release.
(To anyone at all: Help test!)

_why
bbiker (Guest)
on 2007-01-31 05:00
(Received via mailing list)
On Dec 26 2006, 12:21 pm, _why <w...@ruby-lang.org> wrote:
> > >Apparently there was(is) rdoc documentation ... googling "hpricot+
> (To anyone at all: Help test!)
>
> _why

I have downloaded hpricot 0.4.99 and get the following warning when
running my script.

c:/ruby/lib/ruby/gems/1.8/gems/hpricot-0.4.99-mswin32/lib/hpricot/parse.rb:44: warning: instance variable @buffer_size not initialized

My script completed its task without any problem.

It seems that the RDoc flag has not yet been set ... a newbie like me sure
needs the documentation.

bbiker
renard@nc.rr.com
William Smith (Guest)
on 2007-01-31 06:16
(Received via mailing list)
I am glad they made this change. In an earlier version (i.e. a few days
ago), if the page you were scraping was too big you were SOL: hpricot would
crash and that was the end of it. Now you can either define a buffer size
or, if one isn't defined and the input overflows the buffer size, it starts
a new buffer. The source for Hpricot is pretty easy to browse. I'd highly
recommend it.
_why (Guest)
on 2007-01-31 06:18
(Received via mailing list)
On Wed, Jan 31, 2007 at 01:00:08PM +0900, bbiker wrote:
> I have downloaded hpricot 0.4.99 and get the following warning when
> running my script.
>
> c:/ruby/lib/ruby/gems/1.8/gems/hpricot-0.4.99-mswin32/lib/hpricot/
> parse.rb:44: warning: instance variable @buffer_size not initialized

Thank you.

> It seems that the RDoc flag has not yet been ... a newby like me sure
> needs the documentation.

Are you positively sure?

_why