Pretty-print and cleanse RHTML?

Thanks, y’all, for the answer to an easy question - ri.

Now here’s a sick one.

HTML Tidy is an excellent program and system (by Dave Raggett) that
cleans and pretty-prints HTML. Below my sig is an assertion, assert_tidy,
that runs tidy over a @response.body in a Rails functional test and
complains about any shenanigans in the HTML. (Take out the assert_xml
call if you would like to use it without my assert_xpath plugin.)

Note that web browsers forgive shenanigans, but Rails developers should
not, because the cleanest HTML code is easiest to test.

Suppose someone gave us fresh HTML to import as eRB (.rhtml), such as
from an obsolete PHP project. We ought to upgrade, cleanse, and
pretty-print that HTML like this…

tidy -i -asxhtml old.html > new.rhtml

That converts the HTML to XHTML, indents it, fixes missing and broken
tags, etc.

Now suppose someone forgot to do that, and they filled their new .rhtml
file with lots of <% %> tags, containing if statements and
code-generating things.

Has anyone invented a pretty-printer that skips over the <%%> tags?

If not, I will presently report how to tidy that code by replacing <%
and %> with placeholder markup, running tidy, and switching the tags
back…
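
A rough sketch of that swap-and-restore idea, under some loud assumptions: the placeholder here is a numbered HTML comment (my guess, not necessarily the markup meant above), and mask_erb, unmask_erb, and tidy_rhtml are hypothetical names. Instead of rewriting the delimiters in place, it stashes each whole tag and restores it by number, so Ruby code containing "--" cannot break the comment.

```ruby
# Hypothetical helpers; the names and the comment-style placeholder
# are assumptions made for illustration.
ERB_TAG = /<%.*?%>/m

# Replace each <% ... %> tag with a numbered HTML comment and
# return the masked text plus the list of original tags.
def mask_erb(rhtml)
  stash = []
  masked = rhtml.gsub(ERB_TAG) do |tag|
    stash << tag
    "<!--erb:#{stash.size - 1}-->"
  end
  [masked, stash]
end

# Put the original tags back wherever their placeholders ended up.
def unmask_erb(html, stash)
  html.gsub(/<!--erb:(\d+)-->/) { stash[$1.to_i] }
end

# Mask, pipe through the tidy command line, then unmask.
# Needs the tidy executable on the PATH.
def tidy_rhtml(rhtml)
  masked, stash = mask_erb(rhtml)
  tidied = IO.popen('tidy -i -asxhtml -q 2>/dev/null', 'r+') do |io|
    io.write(masked)
    io.close_write
    io.read
  end
  unmask_erb(tidied, stash)
end
```

Something like File.read('new.rhtml') passed to tidy_rhtml would be the intended use. A tag sitting inside an attribute value, such as href="<%= url %>", would not survive as a comment, so treat this as a starting point only.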


Phlip
http://www.oreilly.com/catalog/9780596510657/
“Test Driven Ajax (on Rails)”
assert_xpath, assert_javascript, & assert_ajax

def assert_tidy(messy = @response.body, verbosity = :noisy)
  scratch_html = RAILS_ROOT + '/…/scratch.html' # TODO tune me!
  File.open(scratch_html, 'w'){|f| f.write(messy) }

  # -e reports only errors and warnings; -q suppresses the banner
  gripes = `tidy -eq #{scratch_html} 2>&1`
  gripes = gripes.split("\n")

  # set aside the gripes we choose to ignore
  _excluded, included = gripes.partition do |g|
    g =~ / - Info\: /                                  or
    g =~ /Warning\: missing \<\!DOCTYPE\> declaration/ or
    g =~ /proprietary attribute/                       or
    g =~ /lacks "(summary|alt)" attribute/
  end

  puts included if verbosity == :noisy
  # tidy's XHTML output must parse as XML (assert_xml needs assert_xpath)
  assert_xml `tidy -wrap 1001 -asxhtml #{scratch_html} 2>/dev/null`
end