Hey, I’m pretty new to Ruby and am trying to absorb the “Ruby way” of
things as much as I can. What’s the best way to solve this problem:
I have a string that contains HTML formatting. All I want is the plain
text.
On 5/5/06, Joe C. wrote:
Hey, I’m pretty new to Ruby and am trying to absorb the “Ruby way” of
things as much as I can. What’s the best way to solve this problem:
I have a string that contains HTML formatting. All I want is the plain
text.
I don’t know of any libraries offhand that can do this (CGI only has
escape/unescape), but it’s fairly simple:
html_string.gsub(/<[^>]+>/, "")
Replacing that regex with something better, probably.
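The one-liner above leaves entities such as &amp; untouched. A minimal sketch that pairs it with CGI.unescapeHTML from the standard library follows. The method name plain_text and the sample string are made up for illustration, and the regex is still the naive one from this thread:

```ruby
require "cgi"

# Hypothetical helper, not part of any library.
def plain_text(html)
  text = html.gsub(/<[^>]+>/, "")  # drop anything that looks like a tag
  CGI.unescapeHTML(text)           # turn &amp;, &lt;, &gt;, &quot; back into characters
end

puts plain_text("<p>Fish &amp; chips &lt;3</p>")  # prints: Fish & chips <3
```

Unescaping happens after the tags are stripped, so a decoded "<" is not mistaken for the start of a tag.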
I’ve been using
html_string.gsub(/<.*?>/,"")
for this. But it’s always seemed more a “Perl way” than a “Ruby way”.
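For what it's worth, the two patterns posted so far are not quite equivalent. A rough comparison, using made-up input strings and hypothetical method names: the lazy-dot version skips a tag that is split across lines unless the /m flag is added, because . does not match a newline by default.

```ruby
# Hypothetical helpers, one per pattern, so the results can be compared.
def strip_negated(html)
  html.gsub(/<[^>]+>/, "")  # [^>] matches newlines as well
end

def strip_lazy(html)
  html.gsub(/<.*?>/, "")    # . does not match a newline by default
end

def strip_lazy_multiline(html)
  html.gsub(/<.*?>/m, "")   # the /m flag lets . match newlines
end

one_line   = "<p>Hello, <b>world</b>!</p>"
multi_line = "<a\n  href=\"/home\">Home</a>"  # tag split across two lines

p strip_negated(one_line)           # => "Hello, world!"
p strip_lazy(one_line)              # => "Hello, world!"
p strip_negated(multi_line)         # => "Home"
p strip_lazy(multi_line)            # => "<a\n  href=\"/home\">Home"
p strip_lazy_multiline(multi_line)  # => "Home"
```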
Joseph Michaels wrote:
I have a string that contains HTML formatting. All I want is the plain
text.
html_string.gsub(/<[^>]+>/, "")
Replacing that regex with something better, probably.
gsubbing breaks down for more complex test cases, such as things
containing source code, or problematic attribute strings (e.g. ).
If you really want to be accurate, I suggest using an XML or HTML
parsing tool, such as Mechanize or RubyfulSoup.
Pistos
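To make the failure modes above concrete, here is a small sketch. Both input strings are invented for illustration (they are not the examples from the post), and naive_strip is a hypothetical wrapper around the regex quoted earlier:

```ruby
# Hypothetical helper wrapping the regex quoted above.
def naive_strip(html)
  html.gsub(/<[^>]+>/, "")
end

# A ">" inside a quoted attribute ends the match too early,
# so the rest of the tag leaks into the output.
p naive_strip('<img alt="a > b" src="x.png">caption')
# => " b\" src=\"x.png\">caption"

# A bare "<" in source code is mistaken for the start of a tag,
# so part of the code is deleted.
p naive_strip("<pre>if a < b && b > c</pre>")
# => "if a  c"
```

A real parser tracks quoting and element context, which is why it copes with these inputs where a single regex does not.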
On Sat, 2006-05-06 at 02:28 +0900, Joe C. wrote:
Hey, I’m pretty new to Ruby and am trying to absorb the “Ruby way” of
things as much as I can. What’s the best way to solve this problem:
I have a string that contains HTML formatting. All I want is the plain
text.
There’s more to this than meets the eye. It’s often best to hand off
the hard stuff to someone else:
def fetchtext(uri)
  # Shell out to lynx (it must be installed) and return its plain-text dump.
  `lynx --dump #{uri}`
end

puts fetchtext('www.google.com')
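One caveat with shelling out: if the URI comes from user input, interpolating it into a shell command lets characters like ";" run extra commands. A hedged variant that avoids the shell by passing an argument array to IO.popen is sketched below. The name fetch_text and the command: keyword are hypothetical, and it still assumes lynx is installed and on the PATH:

```ruby
# Hypothetical helper: runs `command --dump uri` without going through a shell.
def fetch_text(uri, command: "lynx")
  # With an array argument, IO.popen executes the program directly, so
  # characters such as ";" in uri reach it as plain text.
  IO.popen([command, "--dump", uri], &:read)
end

# Usage (needs lynx installed):
#   puts fetch_text("www.google.com")
```

If the program is missing, IO.popen raises Errno::ENOENT, which the caller can rescue.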