Forum: Ruby on Rails Re: pretty-print and cleanse RHTML?

Phlip (Guest)
on 2007-07-22 00:58
(Received via mailing list)
> Suppose someone gave us fresh HTML to import as eRB (.rhtml). Such as from
> an obsolete PHP project. We ought to upgrade, cleanse, and pretty-print
> that HTML like this...
>  tidy -i -asxhtml old.html > new.rhtml

Below my sig is a program that temporarily replaces <% and %>
with <!--% and %-->, runs Tidy, and then restores the original tags.
Save it as 'tidyErb.rb', and run it like this:

usage: ruby tidyErb.rb <filename.rhtml> >output.rhtml

'filename.rhtml' and 'output.rhtml' must not be the same file. The
program overwrites a file called 'scratch.html' in the current directory,
with no attempt to avoid any source files with the same name...
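
One way around the 'scratch.html' clobbering would be to let Ruby's standard
Tempfile library pick a unique name. This is only a sketch, not part of
tidyErb.rb; the helper name with_scratch_file is made up for illustration:

```ruby
require 'tempfile'

# Hypothetical helper: writes `text` to a uniquely named temp file,
# yields its path to the block, then returns the file's contents
# (which the block may have changed). The file is removed afterwards.
def with_scratch_file(text)
  Tempfile.create(['tidyErb', '.html']) do |f|
    f.write(text)
    f.close              # flush, so an external tool sees the whole text
    yield f.path
    File.read(f.path)    # re-read whatever the block left behind
  end
end
```

The block would be the place to run Tidy, for example
with_scratch_file(escaped) { |path| system('tidy', '-i', '-asxhtml', '-m', path) },
so no file in the working directory gets overwritten.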

As a convenience, the program reports diagnostics to STDERR. Obey them
assert_tidy), to improve your programs!

Note that Tidy treats <!-- --> as flow-tags not block-tags. (My
<em> is a flow-tag, and <div> is the canonical block-tag. Tidy
the former.)

Searching for <% and moving them to their correct indentation (such as
<% end %>) is a small price to pay for clean HTML!
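
To make that manual pass a little quicker, something like the following
could list the lines that still need a look. It is a rough sketch; the
helper name erb_lines and the sample text are invented for illustration:

```ruby
# Hypothetical helper: returns [line_number, stripped_line] pairs for
# every line that contains an opening eRB tag.
def erb_lines(text)
  text.each_line.with_index(1)
      .select { |line, _n| line.include?('<%') }
      .map { |line, n| [n, line.strip] }
end

sample = "<div>\n<% if x %>\n<p>hi</p>\n        <% end %>\n</div>\n"
erb_lines(sample).each { |n, line| puts "#{n}: #{line}" }
```

Run against the tidied output (File.read('output.rhtml')), this prints one
numbered line per eRB tag, which is where the indentation fixes go.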

Oh, also, review my gsubs to see if they match what Tidy did to your
comments, and <%%> nested inside attributes. If I return to this
project, I will just upgrade Tidy...
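
A quick way to do that review might be to compare the eRB tags before and
after the round trip. The sketch below is not part of tidyErb.rb, and the
names erb_tags and erb_preserved? are hypothetical:

```ruby
# Hypothetical check: collect every <% ... %> tag, collapsing runs of
# whitespace, so the tag lists of two documents can be compared.
def erb_tags(text)
  text.scan(/<%.*?%>/m).map { |tag| tag.gsub(/\s+/, ' ') }
end

# True when both documents contain the same eRB tags in the same order.
def erb_preserved?(before, after)
  erb_tags(before) == erb_tags(after)
end
```

Calling erb_preserved?(File.read('filename.rhtml'), File.read('output.rhtml'))
and getting false would suggest that Tidy escaped a tag in a way the gsubs
do not yet undo.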

  "Test Driven Ajax (on Rails)"
  assert_xpath, assert_javascript, & assert_ajax

if ARGV.size != 1 or !File.exist?(filename = ARGV.first)
  puts 'usage: ruby tidyErb.rb <filename.rhtml> >output.rhtml'
  exit
end

rhtml = File.read(filename)
#  hide the eRB tags inside HTML comments while Tidy runs
escaped = rhtml.gsub('<%', '<!--%').gsub('%>', '%-->')
File.open('scratch.html', 'w'){|f| f.write(escaped) }
system('tidy -i -asxhtml -m scratch.html')  #  -m rewrites scratch.html in place
html = File.read('scratch.html')
html.gsub!('&lt;!--', '<!--')  #  undo Tidy's CGI-nanny escapes
html.gsub!('--&gt;', '-->')
html.gsub!('%3C!--%', '<!--%')
html.gsub!('%20', ' ')  #  TODO  nest this gsub?
html.gsub!('%--%3E', '%-->')
rhtml = html.gsub('%-->', '%>').gsub('<!--%', '<%')  #  restore the eRB tags

puts rhtml