I need to extract img tags from an HTML page using regular expressions:
<\s*img [^>]*src\s*=\s*(["'])(.*?)\1
Would this code be OK?
Can anyone suggest another regexp for extracting img tags?
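For readers who want to try the regexp approach from the question, here is a minimal sketch in plain Ruby. The sample HTML, the constant name IMG_TAG, and the trailing `[^>]*>` (added so the match covers the whole tag) are assumptions for illustration, not part of the original post. Regular expressions are fragile on real-world HTML (unquoted attributes, a `>` inside an attribute value), so treat this as a rough illustration only.

```ruby
# Hypothetical sample input, not taken from the thread.
html = <<~HTML
  <p>Intro</p>
  <img src="a.png" alt="A">
  <IMG class='x' src='b.jpg'>
  <p>No image here</p>
HTML

# Variant of the pattern from the question: case-insensitive, and
# extended with [^>]*> so that the match runs to the end of the tag.
IMG_TAG = /<\s*img\s[^>]*?src\s*=\s*(["'])(.*?)\1[^>]*>/i

tags = []  # full <img ...> tags
srcs = []  # only the src values

html.scan(IMG_TAG) do
  m = Regexp.last_match
  tags << m[0]  # whole match
  srcs << m[2]  # second capture group: the URL between the quotes
end

puts tags
puts srcs
```

The first capture group remembers which quote character opened the attribute value, and the backreference `\1` requires the same character to close it.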
Newb N. wrote:
> I need to extract img tags from an HTML page using regular expressions:
> <\s*img [^>]*src\s*=\s*(["'])(.*?)\1
> Would this code be OK? Can anyone suggest another regexp for extracting img tags?
Instead of using a regular expression, you could consider an HTML parser
and/or an XPath search to retrieve the images. Check out Hpricot.
On Thu, Aug 21, 2008 at 12:50 PM, Lex W. wrote:
> Instead of using a regular expression, you could consider an HTML parser
> and/or an XPath search to retrieve the images. Check out Hpricot.
Yeah, it is quite easy with Hpricot:
require 'open-uri'
require 'hpricot'
site = Hpricot(open("http://code.google.com/edu/submissions/SedgewickWayne/index.html"))
site.search("//img") #=> returns an array of all images
Thomas W. wrote:
> Yeah, it is quite easy with Hpricot:
> require 'open-uri'
> require 'hpricot'
> site = Hpricot(open("http://code.google.com/edu/submissions/SedgewickWayne/index.html"))
> site.search("//img") #=> returns an array of all images
Yes, I used it like this:
doc = Hpricot.parse(item.description)
imgs = doc.search("//img")
@src_array = imgs.collect { |img| img.attributes["src"] }
But it gives only the image URLs, and I need to get the full <img> tag.
Any help?
Newb N. wrote:
> require 'open-uri'
> doc = Hpricot.parse(item.description)
> imgs = doc.search("//img")
> @src_array = imgs.collect { |img| img.attributes["src"] }
> But it gives only the image URLs, and I need to get the full <img> tag.
> Any help?
Then do
@src_array = imgs.collect { |img| "<img src=\"#{img.attributes["src"]}\">" }
?
--
Otto Software Partner GmbH
Jan P.
Tel. 0351/49723202, Fax: 0351/49723119
01067 Dresden, Freiberger Straße 35 - AG Dresden, HRB 2475
Managing directors: Burkhard Arrenberg, Heinz A. Bade, Jens Gruhl
I'm not really sure about Hpricot, but with the html/tree parser, calling
a node's to_s method gives you its full HTML. So you should try calling
.to_s on the array's elements and see if that is what you need.
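To illustrate the to_s idea without depending on Hpricot, here is a small sketch that uses REXML, the XML parser shipped with Ruby, as a stand-in. This is a swapped-in library, not the one discussed in the thread, and the sample markup is made up and deliberately well-formed because REXML only parses valid XML. The point it shows is the same: search with XPath, then serialize each node to get the whole tag instead of only the src attribute.

```ruby
require 'rexml/document'

# Hypothetical, well-formed sample markup (REXML is an XML parser,
# so every tag has to be closed).
xhtml = <<~XML
  <div>
    <p>Intro</p>
    <img src="a.png" alt="A"/>
    <p><img src="b.jpg"/></p>
  </div>
XML

doc = REXML::Document.new(xhtml)

# XPath search, analogous to doc.search("//img") in the thread.
imgs = REXML::XPath.match(doc, '//img')

# attributes['src'] yields only the URL ...
srcs = imgs.map { |img| img.attributes['src'] }

# ... while to_s serializes the whole element, attributes included.
tags = imgs.map(&:to_s)

puts srcs
puts tags
```

Note that a serializer may normalize the markup (quote style, whitespace), so the strings in tags are not guaranteed to be byte-for-byte identical to the original source text.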
Jan P. wrote:
> Then do
> @src_array = imgs.collect { |img| "<img src=\"#{img.attributes["src"]}\">" }
> ?
Yes, it works.
Is it possible to use @src_array with String#sub!(pattern, replacement)?
That is:
@src_array.sub(/[@src_array]/," ")
@src_array contains all the img tags. I need to replace them with an empty string.
Will the above code work for that?
Can you help me get there?
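A possible way to approach this, sketched with made-up data: sub is a String method, so it has to be called on the text being cleaned rather than on the array, and `[...]` inside a regexp is a character class rather than a reference to a variable. The names description and src_array below are hypothetical stand-ins for item.description and @src_array.

```ruby
# Hypothetical sample data, not taken from the thread.
description = 'Before <img src="a.png"> middle <img src="b.png"> after'
src_array = ['<img src="a.png">', '<img src="b.png">']

# Regexp.union escapes each string and joins them with |,
# so the tags are matched literally.
pattern = Regexp.union(src_array)

# gsub replaces every occurrence; sub would stop after the first one.
cleaned = description.gsub(pattern, "")

puts cleaned
```

This only works if the strings in the array appear verbatim in the text. Tags that were rebuilt from the src attribute alone will not match if the original markup carries other attributes or different spacing.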