scRUBYt! - Hpricot and WWW::Mechanize on even more steroids

Hello all,

scRUBYt! version 0.2.6 has been released with some great new features,
tons of bugfixes, and a lot of changes overall that should greatly
improve the reliability of the system.

============
What’s this?

scRUBYt! is a very easy to learn and use, yet powerful Web scraping
framework based on Hpricot and WWW::Mechanize. Its purpose is to free
you from the drudgery of web page crawling, looking up HTML tags,
attributes, XPaths, form names and other typical low-level web scraping
woes by figuring these out from your examples, copy'n'pasted from the
Web page.
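
To illustrate the example-driven approach, here is a minimal sketch of
an extractor. The URL, record name and example strings below are made
up for illustration; in a real extractor you paste text actually found
on the target page:

```ruby
require 'rubygems'
require 'scrubyt'

# Hypothetical site and example strings - scRUBYt! infers the
# XPaths from the pasted examples, not from hand-written selectors.
data = Scrubyt::Extractor.define do
  fetch 'http://www.example.com/books'

  book do                     # one record per matched block
    title 'The Example Book'  # example copied from the page
    price '$9.99'             # matched against similar elements
  end
end

puts data.to_xml
```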

===========
What’s new?

A lot of long-awaited features have been added - most notably,
automatic crawling to detail pages, which was the most requested
feature in scRUBYt!'s history.

Another great addition is the improved example generation - you no
longer have to use the whole text of the element you would like to
match: it is enough to specify a substring, and the first element
containing that string will be returned. Moreover, it is possible to
create compound examples like this:

flight :begins_with => 'Arrival', :contains => /\d{4}/, :ends_with =>
'20:00'

Crawling through next links has also been greatly improved - it is now
possible to use images as next links, to generate URLs instead of
clicking on the next link, and a great many bugs (including the
infamous Google next link problem) have been fixed.
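
As a plain-Ruby illustration of the URL-generation idea (no scRUBYt!
involved; the query-string pattern is hypothetical), next pages can be
reached by building their URLs directly rather than locating and
clicking a next link on each page:

```ruby
# Build result-page URLs from a hypothetical query-string pattern
# instead of clicking a "next" link. Sidesteps fragile next-link
# detection, such as the Google case mentioned above.
def result_page_urls(base, per_page, pages)
  (0...pages).map { |i| "#{base}?start=#{i * per_page}" }
end

urls = result_page_urls('http://www.example.com/search', 10, 3)
# urls => ["http://www.example.com/search?start=0",
#          "http://www.example.com/search?start=10",
#          "http://www.example.com/search?start=20"]
```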

An enormous number of bugs were fixed and the whole system was tested
thoroughly, so overall reliability should be much improved compared
with previous releases.

Something non-software related: four people have joined the
development, so there is much, much more to come in the future!

=========
CHANGELOG

============
Announcement

By popular demand, there is a new forum to discuss everything scRUBYt!
related:

http://agora.scrubyt.org

You are welcome to sign up to share your opinion, ask for features,
report bugs or discuss anything else - or just to look around and see
what others are saying.

================
Closing thoughts

Please keep the feedback coming - your contributions are a key factor
in scRUBYt!'s success. This is not an exaggeration or a feeble attempt
at flattery: since we (obviously) cannot test everything on every
possible page, we can make scRUBYt! truly powerful only if you send us
all the quirks and problems you encounter during scraping, as well as
your suggestions and ideas. Thanks everyone!

Cheers,
Peter
__
http://www.rubyrailways.com :: Ruby and Web2.0 blog
http://scrubyt.org :: Ruby web scraping framework
http://rubykitchensink.ca/ :: The indexed archive of all things Ruby.

Peter,

Apologies for the brevity, on a blackberry.

All but two of the unit tests are passing with FireWatir. Can you
confirm what the proxy and mechanize_doc params are used for in the
fetch method? I couldn't find them used anywhere. Mind if I rename
methods and variables away from being so Mechanize-specific?

I hope to commit changes to my 3.0 tag tomorrow afternoon.