HTML extraction using Ruby

arunvoip · March 13, 2009, 6:31am

Hi,
Can anybody tell me how to extract all the hyperlinks on the page
http://scores.sify.com/match/archive/archive.shtml using Ruby? I
want only those URLs whose links have the class 'com_blue com_size12 com_arial12'.
To be more precise, these are the type of URLs I want to have.
Please help. I'll be really grateful.

Regards,
Arun K. .C.M.

arunvoip · March 13, 2009, 3:36pm

Arun K. wrote:

Hi,
Can anybody tell me how to extract all the hyperlinks on the page
http://scores.sify.com/match/archive/archive.shtml using Ruby? I
want only those URLs whose links have the class 'com_blue com_size12 com_arial12'.
To be more precise, these are the type of URLs I want to have.

Use doc = Nokogiri::HTML(my_html), then something like
doc.css('a.com_blue').each { |a| puts a['href'] }

arunvoip · March 14, 2009, 4:43am

Phlip wrote:

Arun K. wrote:

Hi,
Can anybody tell me how to extract all the hyperlinks on the page
http://scores.sify.com/match/archive/archive.shtml using Ruby? I
want only those URLs whose links have the class 'com_blue com_size12 com_arial12'.
To be more precise, these are the type of URLs I want to have.

Use doc = Nokogiri::HTML(my_html), then something like
doc.css('a.com_blue').each { |a| puts a['href'] }

Sorry, that doesn't work. It shows an error like this:

uninitialized constant Nokogiri (NameError)

Regards
Arun K.

arunvoip · March 14, 2009, 5:20am

Arun K. wrote:

Sorry, that doesn't work. It shows an error like this:

uninitialized constant Nokogiri (NameError)

You are going to need to learn more Ruby before asking high-level
questions about it.

What did Google tell you about Nokogiri, or RubyGems?
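For anyone landing here with the same NameError: it usually means the library was never loaded, either because the gem is not installed (gem install nokogiri) or because the script has no require 'nokogiri' line. On the Ruby 1.8 setups common at the time, a require 'rubygems' line was typically needed first as well. Below is a minimal sketch of a defensive load; library_available? is a hypothetical helper name, not part of RubyGems.

```ruby
# Try to load a library by name; report whether it succeeded.
# `library_available?` is a hypothetical helper for illustration.
def library_available?(name)
  require name
  true
rescue LoadError
  false
end

if library_available?('nokogiri')
  puts "Nokogiri #{Nokogiri::VERSION} loaded"
else
  puts 'Nokogiri is not installed; try: gem install nokogiri'
end
```

Once the require succeeds, the Nokogiri constant is defined and the NameError goes away.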