Hi! I'm new to web scraping, but I have used Rubyful Soup in the past. I just wanted to know how I can scrape a particular item from a page instead of scraping the entire page. E.g., having got the URL for a person's profile, can I get only the person's name scraped? Please help. Regards, Swanand.
on 2007-02-01 10:53
on 2007-02-01 22:39
On 2/1/07, swanand deodhar <firstname.lastname@example.org> wrote:
> Hi!
> I'm new to web scraping, but I have used Rubyful Soup in the past. I just wanted to
> know how I can scrape a particular item from a page instead of scraping
> the entire page. E.g., having got the URL for a person's profile, can I get only
> the person's name scraped?
> Please help.
> Regards,
> Swanand.

Use Hpricot: http://code.whytheluckystiff.net/hpricot/

--
Zack Chandler
http://depixelate.com
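For anyone wondering what "scrape only the name" looks like in practice, here is a minimal sketch of the general idea: parse the page, then select just the one element you care about. It does not use Hpricot; it uses REXML from Ruby's standard distribution instead, which only works on well-formed XHTML. The markup, the "name" class, and the helper extract_name are all made up for illustration, so a real profile page will need a different selector.

```ruby
require 'rexml/document'

# Hypothetical profile page. The structure and the "name" class are
# assumptions for this sketch, not taken from any real site.
PROFILE_XHTML = <<~XHTML
  <html>
    <body>
      <div id="profile">
        <h1 class="name">Jane Example</h1>
        <p class="bio">New to web scraping.</p>
      </div>
    </body>
  </html>
XHTML

# Return the text of the first element whose class is exactly "name",
# or nil when no such element exists.
def extract_name(xhtml)
  doc = REXML::Document.new(xhtml)
  node = REXML::XPath.first(doc, "//*[@class='name']")
  node&.text&.strip
end

puts extract_name(PROFILE_XHTML)
```

Real-world HTML is often not well-formed, which is where a forgiving parser such as Hpricot comes in; the select-one-element approach stays the same.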
on 2007-11-18 09:43
On Nov 17, 8:03 am, venkat <ven...@nospam.com> wrote:
> Is there a library/framework for scraping (web)?
>
> I have a few scrapers written but would like to see if there are any
> libraries available. I don't mean Mechanize and Hpricot or any other
> parsers for (X)HTML.

If you don't mean those, what do you mean? You can always simply fetch the raw page source and run regexps on it. Is that more what you mean?
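A rough sketch of the "raw page source plus regexps" approach mentioned above might look like this. The page source, the span markup, and the helper name_from_source are hypothetical; in a real script the string would come from an HTTP fetch, for example Net::HTTP.get(URI(url)) from the standard library.

```ruby
# Hypothetical page source standing in for a fetched page.
PAGE_SOURCE = '<html><head><title>Profile</title></head>' \
              '<body><span class="name"> Jane Example </span></body></html>'

# Capture the contents of the first <span class="name"> element.
# Returns nil when the pattern does not match.
def name_from_source(source)
  match = source.match(/<span\s+class="name"\s*>(.*?)<\/span>/mi)
  match && match[1].strip
end

puts name_from_source(PAGE_SOURCE)
```

The pattern is tied to the exact markup, so it tends to break as soon as the page adds an attribute or reorders one. That fragility is the usual argument for a real parser.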
on 2007-11-18 09:54
Phrogz wrote:
> On Nov 17, 8:03 am, venkat <ven...@nospam.com> wrote:
>> Is there a library/framework for scraping (web)?

Yeah. I wrote a little article on this about a year ago, and I almost fell off the chair when it was referenced in 'Learning Ruby' from O'Reilly. It describes different web scraping possibilities in Ruby: http://www.rubyrailways.com/data-extraction-for-we...

Since then I wrote a web scraping framework, scRUBYt!. Based on the gem download stats (nearly 8000), it's very popular. It's also very actively developed and ... well, enough self-advertisement. Please read the rubyrailways article and decide for yourself :-)

Cheers,
Peter
___
http://www.rubyrailways.com
http://scrubyt.org
on 2007-11-18 17:35
You might also like scrAPI http://rubyforge.org/projects/scrapi/ -Daniel Brumbaugh Keeney
on 2007-12-01 10:58
On Nov 16, 9:36 am, venkat <venkat@> wrote:
> TIA
>
> -Venkat

You can check SWExplorerAutomation (SWEA) from http://webius.net. SWEA separates UI element binding from the automation script. This makes SWEA automation scripts more resilient to UI changes and dramatically decreases the time needed for script maintenance.