Forum: Ruby HTML parsers

Marcio Francisco (marciorf)
on 2005-12-08 19:24
Hello

Does anyone know of an HTML parser in Ruby?
One that separates out all the tags...

thx
James Gray (bbazzarrakk)
on 2005-12-08 20:24
(Received via mailing list)
On Dec 8, 2005, at 12:24 PM, Marcio Francisco wrote:

> Hello
>
> Does anyone know of an HTML parser in Ruby?
> One that separates out all the tags...

Sure.  Here's one possibility:

http://www.crummy.com/software/RubyfulSoup/

James Edward Gray II
gene.tani (Guest)
on 2005-12-08 20:45
(Received via mailing list)
Marcio Francisco wrote:
> Hello
>
> Does anyone know of an HTML parser in Ruby?
> One that separates out all the tags...
>
> thx

I'm benchmarking Python and Ruby parsers right now, so I'm going to
spew you a list of links for:

html-parser-2
htree
ymHTML module
htmltools: requires patched html-parser (gem)

Rubyful Soup
WWW::Mechanize  # built on htmltools, xmltree,
htmltokenizer: handles mismatched tags (gem)

REXML: Tree & stream parsing
(Yeah, that's a lot of libs)

Here's the spew:

http://raa.ruby-lang.org/project/html-parser-2/
http://diveintopython.org/html_processing/index.html
http://cvs.m17n.org/~akr/htree/

http://www.yoshidam.net/Ruby.html#ymHTML

http://ruby-htmltools.rubyforge.org/
http://ruby-htmltools.rubyforge.org/doc/
http://bike-nomad.com/ruby/

http://rubyforge.org/projects/wee/
http://neurogami.com/cafe-fetcher/
http://rubyforge.org/projects/htmltokenizer/

http://www.germane-software.com/software/rexml/
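
To make the "tree & stream parsing" note on REXML concrete, here is a minimal sketch of both styles. The sample markup and the helper names (XHTML, link_targets, TagCollector, tag_names) are made up for illustration. Keep in mind that REXML is an XML parser, so this only works on well-formed, XHTML-style input; for real-world tag soup one of the more forgiving libraries above is probably a better fit.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Sample input (an assumption for this sketch). It has to be well-formed,
# because REXML raises a parse error on mismatched or unclosed tags.
XHTML = <<~DOC
  <html>
    <body>
      <p>See <a href="http://www.ruby-lang.org/">Ruby</a> and
         <a href="http://raa.ruby-lang.org/">RAA</a>.</p>
    </body>
  </html>
DOC

# Tree parsing: build the whole document in memory, then query it with XPath.
def link_targets(markup)
  doc = REXML::Document.new(markup)
  REXML::XPath.match(doc, '//a').map { |a| a.attributes['href'] }
end

# Stream parsing: REXML calls back into a listener as it reads the input.
# This listener only records the name of each opening tag, in order.
class TagCollector
  include REXML::StreamListener

  attr_reader :tags

  def initialize
    @tags = []
  end

  def tag_start(name, _attrs)
    @tags << name
  end
end

def tag_names(markup)
  collector = TagCollector.new
  REXML::Document.parse_stream(markup, collector)
  collector.tags
end

puts link_targets(XHTML).inspect
puts tag_names(XHTML).inspect
```

The tree version is handier when you want to query or walk the document afterwards; the stream version never builds the full tree, which is closer to "separate all tags" and cheaper on large inputs.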