How to extract some info from a tag in HTML

Sadaf_N · January 8, 2016, 5:00am

Hi all,

I use hpricot to extract some info from some webpages.

But I can't find any examples of extracting attributes from a tag. Any
help would be appreciated.

Here is one example:

How can I extract the data_id attribute and return its value,
"247096"?

alex-osu3 · January 9, 2016, 9:58am

github.com

hpricot/hpricot/blob/master/README.md

# Hpricot is over.

After years of lack of a proper maintainer for one of why's jewels, it has been
decided to finally close the book on hpricot. Most users have migrated to alternatives
and there is simply no time or energy to continue with the current codebase.

If you feel that you have the time and wish to take it over, I suggest you instead
think about making the hpricot-like API within nokogiri 100% compatible, that is a better
use of your time.

But if you still feel like "No damnit, I wanna work on hpricot itself still!" then fork
this repo and start work. Send @evanphx or @nicksieger a message if you feel like you
want to take over the gem name with new releases under the hpricot name.

Thanks to \_why for all the fun. We'll never forget it.

## Now back to your original README content...


# Hpricot, Read Any HTML


doc.at("body")['onload']

The above code will find the body tag and give you back the onload
attribute. This is the most common reason to use the element directly:
when reading and writing HTML attributes.

More importantly, hpricot is deprecated; as the README quoted above
puts it, "Hpricot is over."

Try Nokogiri instead: see "Installing Nokogiri" in the Nokogiri documentation.

alex-osu3 · January 15, 2016, 10:28pm

Thanks.
I will try it later.

alex-osu3 · January 21, 2016, 12:18am

Hi Dansei,

I am trying to grab info from a website that requires my username and password.

What is the syntax for providing nokogiri with my password and my
username?

Here is the script I use to get access to the website:

require 'open-uri'
require 'nokogiri'

webpage = 'https://xexample.com'

open(webpage, :http_basic_authentication => ['my_user_name', 'my_password'])
doc = Nokogiri::HTML(open(webpage))

I get this error in return:
C:/Ruby21/lib/ruby/2.1.0/net/http.rb:923:in `connect': SSL_connect
returned=1 errno=0 state=SSLv3 read server certificate B: certificate
verify failed (OpenSSL::SSL::SSLError)

How can I fix it?

Thanks.
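
A possible direction, offered as a sketch rather than a confirmed fix. Nokogiri itself does not take credentials; they go to open-uri, and credentials passed to one `open` call are not remembered by a later, separate call, so the request whose response gets parsed needs the options itself. A "certificate verify failed" error often means Ruby cannot find a bundle of trusted CA certificates, which is a common situation on Windows. open-uri accepts an `:ssl_ca_cert` option for pointing at one. The helper name `fetch_options` and the bundle path below are hypothetical.

```ruby
require 'open-uri'

# Build the options for one authenticated HTTPS request.
# ca_file is an assumed path to a PEM bundle of trusted CA certificates.
def fetch_options(user, password, ca_file = nil)
  options = { :http_basic_authentication => [user, password] }
  options[:ssl_ca_cert] = ca_file if ca_file
  options
end

webpage = 'https://xexample.com'
options = fetch_options('my_user_name', 'my_password', 'C:/certs/cacert.pem')

# Open the page once, with the options, and parse that same response:
#   require 'nokogiri'
#   doc = Nokogiri::HTML(URI.parse(webpage).open(options))
```

Setting `:ssl_verify_mode` to `OpenSSL::SSL::VERIFY_NONE` would also silence the error, but it turns certificate checking off entirely, so supplying a CA bundle is the safer choice.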