Hi all,
I use Hpricot to extract information from web pages, but I can't find
any examples of extracting an attribute from a tag. Any help will be
appreciated.
Here is one example:
How can I extract the data_id attribute and return its value,
“247096”?
https://github.com/hpricot/hpricot/blob/master/README.md
doc.at("body")['onload']
The above code will find the body tag and give you back the onload
attribute. This is the most common reason to use the element directly:
when reading and writing HTML attributes.
More importantly, hpricot is deprecated:
Hpricot is over.
After years of lack of a proper maintainer for one of why’s jewels, it
has been decided to finally close the book on hpricot. Most users have
migrated to alternatives and there is simply no time or energy to
continue with the current codebase.
Try nokogiri: http://www.nokogiri.org/tutorials/installing_nokogiri.html
Thanks.
I will try it later.
Hi Dansei,
I am trying to fetch information from a website that requires my
username and password. What is the syntax for passing them when using
Nokogiri?
Here is the script I use to get access to the website:
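require 'open-uri'
require 'nokogiri'
webpage = 'https://xexample.com'
# Pass the credentials to the same open call whose result is parsed
page = open(webpage, :http_basic_authentication => ['my_user_name', 'my_password'])
doc = Nokogiri::HTML(page)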
I get this error:
C:/Ruby21/lib/ruby/2.1.0/net/http.rb:923:in `connect': SSL_connect
returned=1 errno=0 state=SSLv3 read server certificate B: certificate
verify failed (OpenSSL::SSL::SSLError)
How can I fix it?
Thanks.
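One possible direction, sketched under assumptions: Nokogiri itself does not handle credentials, so they go to open-uri, and a "certificate verify failed" error often means Ruby cannot find a CA bundle to check the server certificate against. open-uri accepts an :ssl_ca_cert option for that. The helper names below (open_uri_options, fetch_page) and the bundle path in the usage comment are hypothetical, and whether a missing CA bundle is the actual cause here is a guess.

```ruby
require 'open-uri'

# Build the open-uri options: basic-auth credentials plus, optionally,
# the path to a CA bundle used to verify the server certificate.
def open_uri_options(user, password, ca_file = nil)
  options = { http_basic_authentication: [user, password] }
  options[:ssl_ca_cert] = ca_file if ca_file
  options
end

# Fetch the page body as a string. On Ruby older than 2.5, use
# open(url, options) instead of URI.open(url, options).
def fetch_page(url, user, password, ca_file = nil)
  URI.open(url, open_uri_options(user, password, ca_file)).read
end

# Usage (hypothetical bundle path):
#   html = fetch_page('https://xexample.com', 'my_user_name', 'my_password',
#                     'C:/certs/cacert.pem')
#   doc  = Nokogiri::HTML(html)
```

Setting :ssl_verify_mode to OpenSSL::SSL::VERIFY_NONE would also silence the error, but it disables certificate checking entirely, so pointing Ruby at a valid CA bundle is usually the better choice.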