Hpricot ri and rdoc documentation

I have gem-installed Hpricot-0.4 (mswin32), but no ri or rdoc
documentation was installed along with the library.

I have also tried the following gem command: gem install hpricot --ri
--rdoc, with the same results.

These are the responses I obtain with ri and fri

c:\ruby>ri Hpricot
--------------------------------------------------------- Class:
Hpricot
(no description…)

c:\ruby>fri Hpricot
--------------------------------------------------------- Class:
Hpricot
(no description…)

Note: If mechanize (0.6.3) is not installed, then fri responds with an
error message essentially stating that it cannot find a file supposedly
located in a folder within the mechanize-0.6.3 folder.

c:\ruby>fri Hpricot
(druby://127.0.0.1:1181)
c:/ruby/lib/ruby/gems/1.8/gems/fastri-0.2.1.1/lib/fastri/ri_index.rb:354:in
`initialize': No such file or directory -
c:/ruby/lib/ruby/gems/1.8/doc/mechanize-0.6.3/ri/Hpricot/cdesc-Hpricot.yaml
(Errno::ENOENT)

I really need all the help I can get to use Hpricot effectively. I
have looked at _why's website, but the specific information I need
is not there, and 3/4 of the terms are strangers to me. Yes, I am
a newbie to HTML and a recent convert to Ruby.

My question is: where and how do I obtain the ri and rdoc documentation
for Hpricot?
And how do I install the documentation so that fri can find it?

Thank you for your assistance

On Dec 24, 2006, at 15:00, bbiker wrote:

I have gem installed Hpricot-0.4 (mswin32) but no ri documentation or
rdoc documentation is installed along the library.

That’s because there is none:

$ gem spec hpricot | grep rdoc
has_rdoc: false
rdoc_options: []
extra_rdoc_files:


Eric H. - [email protected] - http://blog.segment7.net

I LIT YOUR GEM ON FIRE!

Eric H. wrote:

That’s because there is none:

$ gem spec hpricot | grep rdoc
has_rdoc: false
rdoc_options: []
extra_rdoc_files:

Apparently there was (is) rdoc documentation. Googling “hpricot +
rdoc” yielded the following link:
www.gemjack.com/gems/hpricot-0.4-mswin32/index.html

I guess I will try to find a gem depository that has the hpricot gem
with documentation.

I would really like to obtain the ri documentation since it is usually
more verbose.

Thanks for your response, Eric

bbiker, there are no rdocs/ri, just the wiki, the linked rdocs were
made by gemjack.

On 12/24/06, bbiker [email protected] wrote:

yielded the following link:
www.gemjack.com/gems/hpricot-0.4-mswin32/index.html

I guess I will try to find a gem depository that has the hpricot gem
with documentation.

I would really like to obtain the ri documentation since it is usually
more verbose.

Thanks for your response, Eric

I’ve often wondered the same thing. The documentation for hpricot
seems a little sparse and basically limited to the wiki-like docs that
_why set up. Good hunting! Let us know if you find out anything
useful. Or maybe _why can provide some insight.
-Mat

On Dec 24, 2006, at 17:28, bbiker wrote:

yielded the following link:
www.gemjack.com/gems/hpricot-0.4-mswin32/index.html

has_rdoc is set to false by the author when they’ve got nothing
helpful in their project. Forcing rdoc to generate won’t get you
much useful information.

I guess I will try to find a gem depository that has the hpricot gem
with documentation.

There is no such thing. RDoc is generated after installation and is
not shipped with gems.

I would really like to obtain the ri documentation since it is usually
more verbose.

The best way of doing this is to download the hpricot sources, add
RDoc to them, then submit patches back.

Anything else will leave you with little other than method names and
parameter names.


Eric H. - [email protected] - http://blog.segment7.net


On Mon, Dec 25, 2006 at 12:42:23PM +0900, Chris C. wrote:
} bbiker, there are no rdocs/ri, just the wiki, the linked rdocs were
} made by gemjack.
[…]

I missed the original post of this thread, but I recommend that the
original poster just ask how to do what s/he is trying to do with
Hpricot. I’ve been using it with good success for a while, and I’ve
mostly learned it from playing around in irb. I am willing to help if
there is a specific question.

Ideally someone (maybe even me) will get around to writing comprehensive
documentation for Hpricot. Until then, well, there are a few of us who
have used it enough to answer questions helpfully, including _why
himself. He’s been known to take offhand suggestions to heart and
implement them.

} Chris C.
–Greg

Gregory S. wrote:

Thank you for your offer of assistance

In simple terms, I need to extract text from a <p … /p> element.
Yes, I know that I can use “traverse_text”. That works fine except when
there is no text at an expected location.

Here is an example of what I mean. I am afraid that the <p … /p> is
rather long and that it will get wrapped up in the wrong places.

WEST
8 7 6 2

K 7 6 4
Q 7 6 3 2

Notice that after each <img … /> element there is some text, except
after the element with the h.gif file. Now, it is perfectly all right
to have no text. The problem is that I need to know which img element
is not followed by text, since it could be in any of the four
locations. In fact, there could be more than one occurrence of no text.

What I would like to do is insert the basename of the src filename,
followed by a space, after the img element.

Thus the text strings would read “s 8 7 6 2”, “h ”, “d K 7 6 4”,
“c Q 7 6 3 2” as opposed to “8 7 6 2”, “K 7 6 4”, “Q 7 6 3 2”.

An alternate way would be to simply insert arbitrary text ('x ') after
the img element. There is no need to extract the basename, since the
s h d c sequence is constant.

This would let traverse_text yield “x 8 7 6 2”, “x ”, “x K 7 6 4”,
“x Q 7 6 3 2”.
A single character (x) would also work.
Actually, I prefer the alternate way… Occam’s Razor
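
To make the point concrete: once every holding carries a marker, the
four strings line up with the suits by position. A minimal sketch in
plain Ruby, assuming the strings have already been collected (for
example with traverse_text) and that the order is always s, h, d, c as
stated above. The method name holdings_by_suit is made up for this
illustration and is not part of Hpricot.

```ruby
# Hypothetical helper: pair each marked string with its suit by position.
# Assumes exactly one string per suit, each starting with the marker.
def holdings_by_suit(texts, marker = "x", suits = %w[s h d c])
  unless texts.size == suits.size
    raise ArgumentError, "expected #{suits.size} strings, got #{texts.size}"
  end
  cards = texts.map do |t|
    # Drop the leading marker and whitespace, then split into cards.
    t.sub(/\A#{Regexp.escape(marker)}\s*/, "").split
  end
  suits.zip(cards).to_h
end

texts = ["x 8 7 6 2", "x ", "x K 7 6 4", "x Q 7 6 3 2"]
hand = holdings_by_suit(texts)
# The void suit shows up as an empty array instead of silently
# shifting the remaining suits by one position.
```

Without the marker, the same hand would yield only three strings, and
there would be no way to tell which suit was the empty one.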

I know that I can do both using regexp. but I am trying to learn and
understand hpricot.
I understand that hpricot can modify (edit), remove, and add elements.
It is definitely stronger and faster. It also offers a more generic way
of processing html files.

So in simple terms, the problem is how to insert text after an element.

Gregory S. wrote:

} An alternate way would be to simply insert arbritary text ('x ") after
} I know that I can do both using regexp. but I am trying to learn and
Assuming you’ve loaded a document into the variable doc,
doc.extend(Hpricot::InsertTextExtension).insert_text_after('x ', 'img')
will make the changes you want:

== CODE removed to save bandwidth

Thank you very much for your assistance. I have just read the code and
it seems to do exactly what I want … I have not yet tested it out. I
must admit that I do not understand the details but I will study the
code to understand exactly the “how” it does … I know “what” it does.

I will let you know how it actually worked out. Let me say that I would
have been unable to come up with your solution.

So Greg once again, THANK YOU VERY MUCH … Happy Holidays

On Mon, Dec 25, 2006 at 04:24:05PM +0900, bbiker wrote:
[…]
} Thank you for your offer of assistance

Sure.

} In simple term I need to extract text from a <p … /p> element …
} yes I know that I can use “traverse_text” That works fine except when
} there is no text at an expected location.
[…]
} An alternate way would be to simply insert arbitrary text ('x ') after
} the img element… no need to extract the basename. … the s h d c
} sequence is constant.
}
} This would let traverse_text yield “x 8 7 6 2”, “x ”, “x K 7 6 4”,
} “x Q 7 6 3 2”
} A single character (x) would also work.
} Actually, I prefer the alternate way…Occam’s Razor

You got it.

} I know that I can do both using regexp. but I am trying to learn and
} understand hpricot.
} I understand that hpricot can modify (edit), remove, and add elements.
} It is definitely stronger and faster. It also offers a more generic
} way of processing html files.
}
} So in simple terms, the problem is how to insert text after an
} element.

Here’s some code that should help you on your way. You’ll notice that
I’m avoiding adding methods to Hpricot classes and instead pulling out
the extra methods into modules and calling extend on the appropriate
objects. Assuming you’ve loaded a document into the variable doc,
doc.extend(Hpricot::InsertTextExtension).insert_text_after('x ', 'img')
will make the changes you want:

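require 'rubygems'
require 'hpricot'

# Sibling-navigation helpers, mixed into individual elements via extend.
module Hpricot::HandyElementTraversal
  # The node right after this one under the same parent, or nil.
  def next_sibling
    i = index
    i && pchildren[i + 1]
  end

  # With an argument: position of elem among this node's children.
  # Without: position of this node among its parent's children.
  def index(elem = nil)
    elem ? children.index(elem) : (pchildren && pchildren.index(self))
  end

  # The parent's children array, or nil if there is no parent.
  def pchildren
    parent && parent.children
  end
end

# Mixed into a document (or element) via extend.
module Hpricot::InsertTextExtension
  # Inserts text right after every element matching path; returns self.
  def insert_text_after(text, path)
    (self/path).each do |elem|
      elem.extend(Hpricot::HandyElementTraversal)
      sib = elem.next_sibling
      if Hpricot::Text === sib
        # A text node already follows: prepend to it.
        sib.content.insert(0, text)
      else
        # Otherwise splice a new text node in after the element.
        elem.pchildren.insert(elem.index + 1, Hpricot::Text.new(text))
      end
    end
    self
  end
end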
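
For anyone following along without Hpricot installed, the same idea can
be sketched on a plain Ruby array standing in for a parent's children.
This is only an illustration of the logic, not Hpricot code: the Img
struct, the method name insert_text_after_imgs, and the sample hand
(including the s.gif, d.gif and c.gif filenames) are assumptions, and
plain strings stand in for Hpricot::Text nodes.

```ruby
# Hypothetical stand-in for an <img> element; strings play the text nodes.
Img = Struct.new(:src)

# Put `text` after every Img in `children`, changing the array in place.
def insert_text_after_imgs(children, text)
  i = 0
  while i < children.size
    if children[i].is_a?(Img)
      sib = children[i + 1]
      if sib.is_a?(String)
        children[i + 1] = text + sib      # text already follows: prepend
      else
        children.insert(i + 1, text.dup)  # nothing follows: add a text node
      end
      i += 1 # step over the text node just handled
    end
    i += 1
  end
  children
end

# A made-up hand in which the h.gif image has no text after it.
children = [Img.new("s.gif"), "8 7 6 2", Img.new("h.gif"),
            Img.new("d.gif"), "K 7 6 4", Img.new("c.gif"), "Q 7 6 3 2"]
insert_text_after_imgs(children, "x ")
texts = children.grep(String)
```

After the call, every image is followed by exactly one string, so the
strings can be matched to the images by position.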

–Greg

On Mon, Dec 25, 2006 at 12:50:12PM +0900, Eric H. wrote:

rdoc"
yielded the following link:
www.gemjack.com/gems/hpricot-0.4-mswin32/index.html

has_rdoc is set to false by the author when they’ve got nothing
helpful in their project. Forcing rdoc to generate won’t get you
much useful information.

It will with Hpricot. I have added quite a bit of RDoc within the last
month, but haven’t set that flag yet. It will be set with the next
release.
(To anyone at all: Help test!)

_why

bbiker wrote:

Thank you very much for your assistance. I have just read the code and
it seems to do exactly what I want … I have not yet tested it out. I
must admit that I do not understand the details but I will study the
code to understand exactly the “how” it does … I know “what” it does.

I will let you know how it actually worked out. Let me say that I would
have been unable to come up with your solution.

So Greg once again, THANK YOU VERY MUCH … Happy Holidays

Greg, this is to let you know that the code worked right out of the
box. Just cut and pasted the code. Invoked the method and it did
exactly what I wanted.

Thank You and Have A Happy New Year

On Dec 26 2006, 12:21 pm, _why [email protected] wrote:

Apparently there was(is) rdoc documentation … googling "hpricot+
(To anyone at all: Help test!)

_why

I have downloaded hpricot 0.4.99 and get the following warning when
running my script.

c:/ruby/lib/ruby/gems/1.8/gems/hpricot-0.4.99-mswin32/lib/hpricot/
parse.rb:44: warning: instance variable @buffer_size not initialized

My script completed its task without any problem.

It seems that the RDoc flag has not yet been … a newby like me sure
needs the documentation.

bbiker
[email protected]

I am glad they made this change. In an earlier version (i.e. a few days
ago), if the page you were scraping was too big you were SOL: hpricot
would crash and that was the end of it. Now you can either define a
buffer size or, if one isn’t defined and the input overflows the
buffer, it starts a new buffer. The source for hpricot is pretty easy
to browse. I’d highly recommend it.

On Wed, Jan 31, 2007 at 01:00:08PM +0900, bbiker wrote:

I have downloaded hpricot 0.4.99 and get the following warning when
running my script.

c:/ruby/lib/ruby/gems/1.8/gems/hpricot-0.4.99-mswin32/lib/hpricot/
parse.rb:44: warning: instance variable @buffer_size not initialized

Thank you.

It seems that the RDoc flag has not yet been … a newby like me sure
needs the documentation.

Are you positively sure?

_why