A few confusing Hpricot outputs. Anyone had similar experience?


#1

I wanted to work on something like the following example string:

require 'hpricot'
string = ‘posted on
April
2009

h = Hpricot(string)
t = "2009-04-06"

Here it goes: confusion No.1

h.at('a[@title*="2009-04-06"]')
##=> returns the 2nd anchor element, as expected.
h.at('a[@title*=Time.now.strftime("%Y-%m-%d")]')
##=> 1st anchor element. Why is that??
h.at("a[@title*=#{t}]")
##=> 2nd anchor. works fine
h.at('a[@title*="#{t}"]')
##=> nil. Because of the single quote?

And here comes another confusion:

year = "2009"
h.at("a[@title*=#{t}][text()='2009']")
##=> 2nd anchor, as expected.
h.at("a[@title*=#{t}][text()*=#{year}]")
##=> nil. Why is that? Hpricot can’t handle #{} more than once?

Hope you can fill me in on this one. Thanks!!

##Jay


#2

On Mon, Apr 6, 2009 at 4:11 AM, Wang J. removed_email_address@domain.invalid wrote:

h.at('a[@title*="2009-04-06"]')
##=> returns the 2nd anchor element, as expected.
h.at('a[@title*=Time.now.strftime("%Y-%m-%d")]')
##=> 1st anchor element. Why is that??

I’m not sure why it is returning {emptyelem }, but I can tell you
why it’s not returning the element you expect: you didn’t use
string interpolation, so the call to Time.now.strftime(…) was never
evaluated and inserted into the string. This selects the expected
element:

h.at("a[@title*=#{Time.now.strftime('%Y-%m-%d')}]")

h.at("a[@title*=#{t}]")
##=> 2nd anchor. works fine
h.at('a[@title*="#{t}"]')
##=> nil. Because of the single quote?

Exactly, that’s just Ruby single- versus double-quote string behavior.
With the same setup as you used:

irb(main):037:0> "#{t}"
=> "2009-04-06"
irb(main):038:0> '#{t}'
=> "\#{t}"

And here comes another confusion:

year = "2009"
h.at("a[@title*=#{t}][text()='2009']")
##=> 2nd anchor, as expected.
h.at("a[@title*=#{t}][text()*=#{year}]")
##=> nil. Why is that? Hpricot can’t handle #{} more than once?

Do you mean for these to pass different strings to h.at()? Look at the
strings you are using.

irb(main):048:0> puts [ "a[@title*=#{t}][text()='2009']",
irb(main):049:1* "a[@title*=#{t}][text()*=#{year}]" ]
a[@title*=2009-04-06][text()='2009']
a[@title*=2009-04-06][text()*=2009]

So, you are just getting unreliable results when you aren’t using
quotes around the values you are searching for. This version works,
where the second one above did not:

h.at("a[@title*='#{t}'][text()*='#{year}']")

Note that I’ve put quotes on both values, though at least in this
example the title appears to work without them.
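
One way to keep the quoting consistent is to build the selector string with small helpers before handing it to h.at(). This is only a sketch: attr_contains and text_contains are hypothetical names, not part of Hpricot, and they do nothing but string building. They also assume the value itself contains no single quote.

```ruby
# Hypothetical helpers (not Hpricot API): they only build selector strings.

# Attribute "contains" filter, with the value wrapped in single quotes.
def attr_contains(name, value)
  "[@#{name}*='#{value}']"
end

# text() "contains" filter, quoted the same way.
def text_contains(value)
  "[text()*='#{value}']"
end

t = "2009-04-06"
year = "2009"

selector = "a" + attr_contains("title", t) + text_contains(year)
puts selector
# prints: a[@title*='2009-04-06'][text()*='2009']
```

The resulting string is the same one passed to h.at() in the working version above.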


#3

Great notes. Thanks a lot!

So the take-home message is: always use " on the very outside, and
(literally) '#{expression}' inside, to ensure consistency.

It’s kinda counter-intuitive at first look, as normally #{} won’t work
when placed between single quotes. But it works here, because the single
quotes are just characters inside the double-quoted string. :)
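
A quick way to convince yourself, without Hpricot at all, is to print both variants. This is a plain-Ruby sketch; the variable names double_outside and single_outside are made up for illustration.

```ruby
year = "2009"

# Outer delimiter is a double quote: interpolation happens, and the
# inner single quotes are ordinary characters in the result.
double_outside = "[text()*='#{year}']"

# Outer delimiter is a single quote: no interpolation, so the
# placeholder is passed along literally.
single_outside = '[text()*="#{year}"]'

puts double_outside  # prints: [text()*='2009']
puts single_outside  # prints: [text()*="#{year}"]
```

Only the outermost delimiter decides whether #{} is evaluated, which is why the second form hands Hpricot a selector that matches nothing.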

2009/4/6 Christopher D. removed_email_address@domain.invalid