Forum: Ruby on Rails sanitizing and stripping some html?

Thomas Mango (Guest)
on 2007-04-22 17:33
(Received via mailing list)
I have an application that manages a list of feeds. In a scheduled
BackgrounDRb worker, I parse each of these feeds and post the content
to the same site. Some of these feeds contain HTML in the description
of each item in the feed. I would like to first sanitize the HTML to
remove anything particularly harmful, then I would like to strip
certain tags, leaving the content.

I first tested Rick Olson's white_list plugin. It seems to strip
disallowed tags together with their content. For example, if I mark p
as a bad tag, <p>content</p> is removed entirely. I would rather keep
the 'content' and remove only the markup. A few tags, such as b, em,
and strong, are fine to keep, but I would like most others stripped.

I then tested and it
seems to do the trick. I was just wondering if anyone else had been
interested in stripping HTML but leaving the content and how they went
about doing so. Thanks for your input.
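
For anyone reading with the same question, the behaviour described above
(keep b, em and strong, drop every other tag but leave its text) can be
sketched in plain Ruby. This is only an illustration, not the plugin or
the tool mentioned in the post: the helper name strip_tags_keep_content
and the ALLOWED_TAGS constant are made up for this example, and it relies
on regular expressions, which are not a robust way to parse arbitrary
HTML (an attribute value containing ">" would confuse it, for instance).

```ruby
# Hypothetical helper for illustration only; not part of white_list or Rails.
ALLOWED_TAGS = %w[b em strong].freeze

def strip_tags_keep_content(html, allowed = ALLOWED_TAGS)
  # Drop script and style elements together with their content,
  # since that text is never meant to be displayed.
  text = html.gsub(%r{<(script|style)\b[^>]*>.*?</\1\s*>}mi, "")

  # Rewrite every remaining tag: allowed tags are kept with their
  # attributes removed, all other tags are deleted. The text between
  # tags is left untouched.
  text.gsub(%r{<(/?)([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>}) do
    closing, name = $1, $2.downcase
    allowed.include?(name) ? "<#{closing}#{name}>" : ""
  end
end

puts strip_tags_keep_content('<p>Hello <b onclick="x()">world</b></p>')
# prints: Hello <b>world</b>
```

For feed content from untrusted sources, a parser-based sanitizer is the
safer choice. Later Rails versions ship a sanitize helper that accepts a
tags: allow-list, which may cover this case without a plugin.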