Regexp HTML scraping


#1

Hi,
I have to extract the full HTML from a website URL using regular
expressions or ‘net/http’. Can anybody help me with the code to extract
the full HTML content of a website? I need to use only regexp or
‘net/http’.

Thanks
Arun K.


#2

Arun K. wrote:

Hi,
I have to extract the full HTML from a website URL using regular
expressions or ‘net/http’. Can anybody help me with the code to extract
the full HTML content of a website? I need to use only regexp or
‘net/http’.

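require 'net/http'

# Open a connection to the host and request the root page.
Net::HTTP.start("www.google.com") do |http|
  resp = http.get("/")
  # resp.body holds the full HTML; print only the first 100 characters.
  puts resp.body[0...100]
end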

--output:--

Google</ti
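
The snippet above prints only the first 100 characters of the body. If the whole page is wanted and the site answers with a redirect, something along these lines may help. It is only a sketch: `fetch_html` and its `limit` parameter are made-up names, not part of net/http, and error handling is kept minimal.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper (not part of net/http): returns the full response
# body for a URL, following at most `limit` redirects.
def fetch_html(url, limit = 5)
  raise ArgumentError, 'too many redirects' if limit.zero?

  response = Net::HTTP.get_response(URI(url))
  case response
  when Net::HTTPSuccess
    response.body
  when Net::HTTPRedirection
    # The Location header may be relative, so resolve it against the current URL.
    fetch_html(URI.join(url, response['location']).to_s, limit - 1)
  else
    raise "request failed: #{response.code} #{response.message}"
  end
end

# Usage (needs network access):
#   html = fetch_html('http://www.google.com/')
#   puts html.length
```

The redirect limit is there only to avoid looping forever if a site keeps redirecting.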

#3

2009/3/18 Arun K.:

I have to extract the full HTML from a website URL using regular
expressions or ‘net/http’.

What kind of question is that? Use net/http OR regular expressions?
The two serve totally different purposes, so you cannot exchange
one for the other: net/http retrieves the page, while a regular
expression only searches text you already have. You will have a hard
time obtaining the content using regular expressions only…
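
To illustrate the difference: once net/http has delivered the HTML as a string, a regular expression can pick something out of it, but it cannot do the downloading. A rough sketch, where `extract_title` is a made-up helper name and the pattern is deliberately simple (it is not a general HTML parser):

```ruby
# Hypothetical helper: pulls the <title> text out of an HTML string.
# A regexp only searches text that has already been downloaded;
# it cannot retrieve the page itself.
def extract_title(html)
  match = html.match(%r{<title[^>]*>(.*?)</title>}mi)
  match && match[1].strip
end

# The sample string stands in for a body fetched with net/http.
sample = '<html><head><title>Google</title></head><body></body></html>'
puts extract_title(sample)   # prints Google
```

The helper returns nil when no title element is found.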

Wondering…

robert