Stripping HTML tags from a string


#1

Hello,

Is there a common way of stripping html tags from a string? Right now I’m just
calling gsub!(/<.*?>/, ''), but with a background in PHP and always having used
its strip_tags() method, I wonder if the Rails community has standardized this
fairly common task with something a bit less simpleminded than my quick fix.
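
For reference, here is a small sketch of what that quick fix does and where it falls short. The helper names are made up for illustration; only the regex comes from the post. It uses plain Ruby and needs no Rails.

```ruby
# Hypothetical helper names; the regex is the one quoted in the post.
def naive_strip(html)
  html.gsub(/<.*?>/, '')
end

# Same idea with the /m flag, so "." also matches newlines.
def naive_strip_multiline(html)
  html.gsub(/<.*?>/m, '')
end

puts naive_strip('<p>Hello <b>world</b></p>')   # Hello world

# A tag that spans two lines is missed without /m ...
broken = "<a\n href='x'>link</a>"
p naive_strip(broken)                           # "<a\n href='x'>link"
p naive_strip_multiline(broken)                 # "link"

# ... and a ">" inside an attribute value still confuses either version.
p naive_strip_multiline('<img alt="a > b">')    # " b\">"
```

So the regex is fine for simple, well-behaved markup, but it is pattern matching rather than parsing, and unusual input can leave fragments behind.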

Thanks!
Zack


#2

On Mon, May 29, 2006 at 07:42:41PM -0700, Zack H. wrote:

Is there a common way of stripping html tags from a string? Right now I’m just
calling gsub!(/<.*?>/, ''), but with a background in PHP and always having used
its strip_tags() method, I wonder if the Rails community has standardized this
fairly common task with something a bit less simpleminded than my quick fix.

Suspect you want the erb html_escape method
http://www.ruby-doc.org/stdlib/libdoc/erb/rdoc/classes/ERB/Util.html#M000688

<%= h string %>

-jim


#3

On Tue, May 30, 2006 at 03:51:47AM +0100, Jim C. wrote:

On Mon, May 29, 2006 at 07:42:41PM -0700, Zack H. wrote:

Is there a common way of stripping html tags from a string? Right now I’m just
calling gsub!(/<.*?>/, ''), but with a background in PHP and always having used
its strip_tags() method, I wonder if the Rails community has standardized this
fairly common task with something a bit less simpleminded than my quick fix.

Suspect you want the erb html_escape method

Or I suspect that I should actually read what you wrote … :)

http://www.php.net/manual/en/function.strip-tags.php describes the php
strip_tags, and points out a number of interesting problems that it
might have under some circumstances, and a few good workarounds.

Simply gsubbing the <…> instances away isn’t quite the same as
rendering the document, but I can’t see anything that does that. I guess
finding a “nicer” replacement depends on your source data, and what
you’re trying to achieve …
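
One concrete difference between stripping and rendering is character entities: after the tags are gone, things like &amp; are still in the string. A rough sketch of closing that gap, assuming the standard library's CGI.unescapeHTML is good enough for your data (it only knows a few named entities plus numeric ones). The method name rough_text is invented for this example.

```ruby
require 'cgi'

# Hypothetical helper: strip tags with a regex, then decode basic entities.
def rough_text(html)
  without_tags = html.gsub(/<[^>]*>/m, '')
  CGI.unescapeHTML(without_tags)
end

puts rough_text('<p>Fish &amp; chips</p>')   # Fish & chips
```

The order matters: decoding first would turn &lt;b&gt; into a real-looking tag that the regex would then remove.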

-jim


#4

If I escape the html, it just converts the special characters into their entity
form (&lt;, &gt; and so on), so the tags are still there as text.
I want to strip html tags, and end up with a string containing no html
tags. This is the PHP equivalent of what I’m talking about:
http://us2.php.net/manual/en/function.strip-tags.php
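
To make the difference concrete, here is a small comparison in plain Ruby. ERB::Util.html_escape is the method behind the h helper mentioned earlier; the stripping line is just the regex approach from the first post, not a Rails API.

```ruby
require 'erb'

input = '<b>bold</b> & more'

# Escaping keeps the tags, but as visible text.
escaped = ERB::Util.html_escape(input)
puts escaped    # &lt;b&gt;bold&lt;/b&gt; &amp; more

# Stripping removes the tags and keeps only the text between them.
stripped = input.gsub(/<[^>]*>/m, '')
puts stripped   # bold & more
```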

Thanks,
Zack


#5

Just what I was looking for. This is odd though, I’m running the following

html = ActionView::Helpers::TextHelper::strip_tags html

And getting a NoMethodError. rails --version tells me I am indeed running
1.1.2 though. Any ideas?
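
A likely cause, assuming strip_tags is defined as an ordinary instance method inside the TextHelper module (as Rails helpers generally are): an instance method of a module cannot be called on the module itself, it has to be mixed into an object first. The sketch below reproduces this with a stand-in module so it runs without Rails. DemoTextHelper and DemoView are hypothetical names, and the method body is a placeholder, not the Rails implementation.

```ruby
# Hypothetical stand-in for a helper module with an instance method.
module DemoTextHelper
  def strip_tags(html)
    html.gsub(/<[^>]*>/m, '')   # placeholder body, not the Rails code
  end
end

# Calling the method on the module itself fails.
begin
  DemoTextHelper::strip_tags('<b>hi</b>')
rescue NoMethodError => e
  puts "NoMethodError: #{e.message}"
end

# Mixing the module into a class makes the method available on instances.
class DemoView
  include DemoTextHelper
end

puts DemoView.new.strip_tags('<b>hi</b>')   # hi
```

If that assumption holds, including ActionView::Helpers::TextHelper in the class that needs it, or calling strip_tags from a view where the helpers are already mixed in, should avoid the error.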

Thanks,
Zack


#6

Well, perhaps strip_tags isn’t perfect, but I’m not necessarily looking for
a port here. I’m just looking for a method whose stated goal is to remove
html tags from a string.


#7

I’ve been using this Ruby sanitize code:

http://blog.ideoplex.com/2005/03/17.html

which lets you specify some tags to allow and strips the rest.
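
The general idea of an allow list can be sketched in a few lines. This is not the code from the linked post, just an illustration with a made-up method name, and it is regex based, so the usual caveats about odd markup apply.

```ruby
# Hypothetical helper: remove every tag whose name is not in `allowed`.
def strip_tags_except(html, allowed = [])
  html.gsub(%r{<\s*/?\s*([a-zA-Z][a-zA-Z0-9]*)[^>]*>}m) do |tag|
    name = Regexp.last_match(1).downcase
    allowed.include?(name) ? tag : ''
  end
end

html = '<p>Hi <b>there</b><script>x</script></p>'
puts strip_tags_except(html, %w[b])   # Hi <b>there</b>x
puts strip_tags_except(html)          # Hi therex
```

Note that allowed tags are kept exactly as written, attributes included, and the text inside a removed tag stays in the output, so this sketch on its own is not enough for untrusted input.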

cheers,

AF


#8

http://railsmanual.org/module/ActionView%3A%3AHelpers%3A%3ATextHelper/strip_tags