Forum: Ruby Gathering Links

joey__ (Guest)
on 2006-01-20 00:46
Hello,
I am looking for some help with a regular expression. I would like a
regexp that matches HTML links. I have tried, but I can't seem to get
anything working. I would appreciate any help.

Thanks
Joey
Eero Saynatkari (rue)
on 2006-01-20 01:32
joey__ wrote:
> Hello,
> I am looking for some help with a regular expression. I would like a
> regexp that matches HTML links. I have tried, but I can't seem to get
> anything working. I would appreciate any help.

You might just want to run the HTML through htmltidy
to generate an XML document and parse that, or use
the htree library for the same purpose. Either would
probably be the more robust solution.
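
As a rough sketch of the parsing route: this assumes the HTML has already
been cleaned into well-formed XHTML (for example by htmltidy), and it uses
REXML rather than htree, simply because REXML ships with Ruby. The sample
markup and the variable names are made up for illustration.

```ruby
require 'rexml/document'

# Assumed input: HTML that was already tidied into well-formed XHTML.
xhtml = <<~XML
  <html>
    <body>
      <p><a href="http://www.ruby-lang.org/">Ruby</a> and
      <a href="http://rubyonrails.org/" class="ext">Rails</a></p>
    </body>
  </html>
XML

doc = REXML::Document.new(xhtml)

# Collect an [href, text] pair for every <a> element in the document.
links = REXML::XPath.match(doc, '//a').map do |a|
  [a.attributes['href'], a.text]
end

links.each { |href, text| puts "#{text}: #{href}" }
```

Note that `a.text` only returns the first text node, so a link whose text
contains nested tags would need extra handling.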

On the other hand, if you want to use regexps,
something like this should work (untested).

First you have to match the beginning tag
(there might be some whitespace; the \b keeps
other tags such as <abbr> from matching):

  /<\s*a\b

Next, gather any attributes in the opening tag:

   ([^>]*)>

The link text comes next:

   (.*?)

The text section ends at the closing anchor tag
(anchors cannot be nested, so the first closing
tag ends the link):

   <\s*\/\s*a\s*>

Finally, we want to match case-insensitively
(A vs. a) and let . match newlines, so the link
text can span multiple lines:

  /im

So, $1 will be the attributes and $2 the link text.
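
Put together, the pieces might look like the sketch below. It is only
lightly exercised against the made-up sample here, not against real-world
HTML. The \b after the a (so that tags like <abbr> are skipped), the x flag
(used only so the pattern can be laid out with comments), the href
extraction, and the names LINK_RE, html and links are additions for
illustration.

```ruby
# The pattern from the steps above, spread out with the x flag.
LINK_RE = /
  <\s*a\b          # opening tag, optional whitespace, word boundary after "a"
  ([^>]*)>         # group 1: attributes
  (.*?)            # group 2: link text, non-greedy
  <\s*\/\s*a\s*>   # closing anchor tag
/imx

html = <<~HTML
  <p>See <a href="http://www.ruby-lang.org/">Ruby</a> and
  <A HREF="http://rubyonrails.org/" class="ext">Ruby on
  Rails</A>, but not <abbr title="x">this</abbr>.</p>
HTML

# scan returns one [attributes, text] pair per match.
links = html.scan(LINK_RE).map do |attrs, text|
  # Pull the quoted href value out of the attribute string, if present.
  href = attrs[/href\s*=\s*["']([^"']*)["']/i, 1]
  [href, text.gsub(/\s+/, ' ').strip]
end

links.each { |href, text| puts "#{text} -> #{href}" }
```

String#scan is used instead of $1 and $2 so that every link in the input is
collected, not just the first one.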

> Thanks
> Joey


E