Intermittent problem with Hpricot

I’ve run into a weird problem with the Hpricot gem. I don’t know if
this is a problem with gems in general on my system, or if it’s specific
to Hpricot, so I’m posting here.

Basically I have code as follows:
require 'rubygems'
require 'net/http'
require 'uri'
require_gem 'hpricot'

class Page
  def some_function(html_text)
    doc = Hpricot(html_text)
    doc.search("/html/body//img").each do |img|
      logger.debug("== Found an image: #{img} ==")
      src = img[/src="/]
      logger.debug("src = #{src}")
    end
  end
end

When the function is called, sometimes I get a message saying that
the method Hpricot does not exist for the object of type Page. This
continues for the entire programming session.
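
For readers wondering why the error names Page at all: in Ruby, a capitalised name followed by parentheses is an ordinary method call on self, not a constant lookup. The sketch below illustrates that with made-up names (Markup stands in for a library that offers both a Markup.parse method and a Markup() shortcut); it is not Hpricot's actual source, only an assumption about the general pattern.

```ruby
# Hypothetical stand-ins: Page mirrors the class in the post; Markup plays
# the role of a library with both Markup.parse and a Markup() shortcut.
class Page
  def shortcut(text)
    Markup(text) # capitalised name + parentheses: a method call on self
  end
end

# Calls the shortcut and reports a missing method instead of crashing.
def try_shortcut(page, text)
  page.shortcut(text)
rescue NoMethodError => e
  "#{e.class}: #{e.name} is not defined for #{page.class}"
end

page = Page.new
before = try_shortcut(page, "<p>hi</p>")

# Simulate the library file finally being loaded.
module Markup
  def self.parse(text)
    "parsed(#{text})"
  end
end

module Kernel
  # The shortcut simply forwards to the module-level method.
  def Markup(text)
    Markup.parse(text)
  end
end

after = try_shortcut(page, "<p>hi</p>")

puts before
puts after
```

Until the library's file has been loaded, the shortcut is simply an undefined method on whatever object happens to be self, which matches the message described above.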

But sometimes, when I sit down to work on the project, the code works!
It continues working until I restart the Rails server (this is a Rails
project running on Mac OS X).

Does anyone have any idea what might be causing this problem, and how to
correct it?

Alli

Interestingly, if I replace
doc = Hpricot(html_text)
with
doc = Hpricot.parse(html_text)

everything works just fine, all of the time. But I still don’t
understand why Rails doesn’t find the module-level function all of the
time. Anyone have any ideas?

Alli

This might or might not be related, but I think there was a recent
change to the semantics of require_gem (it came up in a post or thread
a week or two ago?), or Hpricot might have been changed so that it no
longer loads any files when the gem is loaded.

Either way, I think you should just use require, not require_gem, and
directly require the files inside the Hpricot gem. RubyGems is intended
to be as transparent to use as possible, and personally I prefer code
that is agnostic to how a library happens to be packaged on a given
system.

Try using require 'hpricot' instead of require_gem 'hpricot' and see if
that helps.

Also, Rails has a hard dependency on RubyGems, so you don’t need to
require 'rubygems' in Rails code.

David V.

Yup, that seems to fix the problem, thanks. It’s still weird that the
problem was intermittent, though!

I think I may leave the code in the Hpricot.parse form, just to be sure,
until I have some time to more fully investigate the problem.

You can change the code in one of two ways to get it to work.

This way is useful if you want, for example, to use a particular
version of the library:

require 'rubygems'
require 'net/http'
require 'uri'
require_gem 'hpricot', '=0.4'
require 'hpricot'

Or this way, to use the latest version:

require 'net/http'
require 'uri'
require 'hpricot'

Luis

Ahhh, OK, so I have to require 'hpricot' regardless of whether I have
already done a require_gem 'hpricot'. That seems a little odd! I’ll
have to add it to my list of “not principle of least surprise” examples
:) Thank you.

On 12/3/06, David V. wrote:

Not really, the two methods aren’t supposed to do the same thing at all.

“Principle of least surprise” just means that things are consistent,
so that once you’re familiar with Ruby API X, you don’t have to
completely rethink your assumptions to work with Ruby API Y. The
language is supposed to conform to your intuition, but only after
you’ve developed that intuition – which you do by learning the
language. It’s much more about internal consistency and sensible
design than creating an effortless learning curve.

Not really, the two methods aren’t supposed to do the same thing at all.

require_gem means: load a gem with the name “hpricot”, and then load the
library (usually .rb) files the gem’s author presumes you’ll want.
However, that second step is optional: the gem’s author can decide not
to load any files by default, in which case require_gem only serves to
select the correct version. If anything, what seems a little weird to
me is that the decision of which code from a gem gets loaded is up to
its author and not its users, and I’m not aware of a “prefer this
version of a gem” method that’s strictly orthogonal to “load this
library”. But I digress.

RubyGems modifies the semantics of require so that it also searches
installed gems for the file you’re trying to load. require 'hpricot'
doesn’t try to load a gem named hpricot; it tries to load the file
'hpricot.rb', whether that file is part of a gem installed on the
machine or sits in one of the “standard” library locations. Because gem
names are usually identical to the names of the files you want to load
from them, this difference is hard to spot.

It shows up with the FOX bindings, for example: the gem name is
'fxruby', but the file to load is 'fox16.so', and the gem doesn’t load
it automatically. Here is an irb session demonstrating the behaviour:

irb(main):001:0> $VERBOSE = nil
=> nil
irb(main):002:0> require 'rubygems'
=> true
irb(main):003:0> require 'fxruby'
LoadError: no such file to load -- fxruby
        from C:/Ruby/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:27:in `gem_original_require'
        from C:/Ruby/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:27:in `require'
        from (irb):3
        from C:/Ruby/lib/ruby/1.8/time.rb:65
irb(main):004:0> require_gem 'fxruby'
=> true
irb(main):005:0> Fox
NameError: uninitialized constant Fox
        from (irb):5
        from C:/Ruby/lib/ruby/1.8/time.rb:65
irb(main):006:0> require 'fox16'
=> true
irb(main):007:0> Fox
=> Fox

David V.
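
The same package-name versus file-name distinction can be reproduced without installing anything. The following is only a rough sketch: the directory name toolkit_pkg, the file widgets16.rb and the helper try_require are all invented for illustration, and putting a directory on $LOAD_PATH by hand is assumed to be a fair stand-in for what activating a gem does.

```ruby
require 'tmpdir'
require 'fileutils'

# Returns :loaded or :not_found instead of raising, so both attempts
# below can be compared side by side.
def try_require(name)
  require name
  :loaded
rescue LoadError
  :not_found
end

# Hypothetical layout: a package directory called "toolkit_pkg" whose only
# library file is lib/widgets16.rb (mirroring fxruby shipping fox16).
root = Dir.mktmpdir('toolkit_pkg')
lib  = File.join(root, 'lib')
FileUtils.mkdir_p(lib)
File.write(File.join(lib, 'widgets16.rb'), "module Widgets; end\n")

# Make the package's lib directory visible to require.
$LOAD_PATH.unshift(lib)

by_package_name = try_require('toolkit_pkg') # no file with this name exists
by_file_name    = try_require('widgets16')   # matches lib/widgets16.rb

puts "require 'toolkit_pkg' -> #{by_package_name}"
puts "require 'widgets16'   -> #{by_file_name}"

FileUtils.remove_entry(root) # clean up the temporary directory
```

require only ever matches file names on the load path, so the package's own name is irrelevant unless a file happens to share it.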

_why wrote:

You are talking about “autorequire”, David. Which is long gone.[1]

[1] http://redhanded.hobix.com/inspect/autorequireIsBasicallyGoneEveryone.html

Ah, that’s the change to RubyGems I meant. Woohoo, I wasn’t
hallucinating :) Thanks!

David V.

On Mon, Dec 04, 2006 at 04:06:58AM +0900, David V. wrote:

library (usually .rb) files the gem’s author presumes you’ll want.
You are talking about “autorequire”, David. Which is long gone.[1]

In other words, require_gem should no longer be used to load any Ruby
code. It should only be used to tell RubyGems which version you’ll be
needing. You know?

_why

[1] http://redhanded.hobix.com/inspect/autorequireIsBasicallyGoneEveryone.html
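
For anyone reading this thread later: in subsequent RubyGems releases the "tell RubyGems which version" role is played by Kernel#gem, with require_gem deprecated in its favour. A minimal sketch of the two-step pattern, assuming that API; the helper name activate_and_require is made up for this example.

```ruby
# Activates a gem at a version requirement, then loads a file from it.
# The helper name is hypothetical; Kernel#gem and require are the real calls.
def activate_and_require(gem_name, requirement, file)
  gem gem_name, requirement # step 1: choose the version, load no code
  require file              # step 2: actually load the library file
  :loaded
rescue Gem::LoadError
  :gem_not_installed        # gem (or that version of it) is not available
rescue LoadError
  :file_not_found           # gem activated, but it ships no such file
end

puts activate_and_require('hpricot', '= 0.4', 'hpricot')
```

Keeping the two steps separate means the version constraint can be dropped later without touching the require line.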