I am saving some HTML in an array and then want to extract certain
values from it.
I have the HTML in the array, but I am having difficulty manipulating
the elements to extract the exact value that I need.
Here is the code:
# search the page for images that indicate the platform number and
# store them in an array
platImages = (@doc/'img[@width^="54"]')
# the search above returns one unwanted image, so delete it
platImages.pop
#puts platImages
# go through the image names to get the platform number
platImages.each { |img|
  # convert element to string
  img.to_s
  # in this part, I have tried various string manipulation
  # methods to extract the values that I need, but nothing seems to work:
  # it either returns nil for each element or throws an error
}
I have saved the HTML into an array but am having difficulty
manipulating the elements to extract the exact value that I require from
the HTML.
What difficulty?
#in this part, I have tried various string manipulation
methods to extract the values that I need, but doesn’t seem to work, it
either returns nil for each element or throws an error
What do you want to do? What have you tried? What happened? Why wasn’t
it what you wanted?
Any help is greatly appreciated.
Provide more details, get more help
require 'rubygems'
require 'hpricot'
require 'open-uri'
#open the web page
@doc = Hpricot(open("http://www.website.com"))
#searches for images in the page and stores them in an array
images = (@doc/'img[@width^="54"]')
#puts images
#go through images names to get platform number
images.each { |img|
img.to_s
#extract characters one to four from image name
img[0,4]
puts img
}
However, this raises the error 'wrong number of arguments (2 for 1)
(ArgumentError)'.
#puts images
#go through images names to get platform number
images.each { |img|
img.to_s
#extract characters one to four from image name
img[0,4]
Assuming your problem is here^
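Calling img.to_s on a line by itself does not turn img into a string: it
returns a new string and leaves img unchanged (img is a DOM element,
right?), so img[0,4] is still called on the element.
Try something like img.to_s[0,4] or img['src'][0,4] (but please look
in the documentation for Hpricot to see how to get the part you want
as a string).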
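To make the difference concrete, here is a minimal sketch that runs without Hpricot. FakeImg is a hypothetical stand-in for an element whose [] method takes a single attribute name, and the file name plat12.gif is made up for illustration; the real Hpricot element may behave differently in its details, so treat this as an illustration of the idea only.

```ruby
# Hypothetical stand-in for a parsed <img> element (NOT the Hpricot class).
# Like the element in the thread, its [] method takes exactly one argument.
class FakeImg
  def initialize(attrs)
    @attrs = attrs
  end

  # Attribute lookup, e.g. img['src']
  def [](name)
    @attrs[name]
  end

  # Serialise the element back to markup
  def to_s
    pairs = @attrs.map { |key, value| %(#{key}="#{value}") }.join(' ')
    "<img #{pairs} />"
  end
end

# 'plat12.gif' is an invented file name for this sketch
img = FakeImg.new('src' => 'plat12.gif', 'width' => '54')

img.to_s # returns a new String; img itself is still a FakeImg

begin
  img[0, 4] # still calls FakeImg#[], which accepts one argument
rescue ArgumentError => e
  error_message = e.message
end

markup_prefix = img.to_s[0, 4]   # slice the String returned by to_s
src_prefix    = img['src'][0, 4] # slice the attribute value instead

puts error_message
puts markup_prefix
puts src_prefix
```

Slicing the src attribute is usually the more useful of the two, because the markup returned by to_s starts with the tag name rather than the image name.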