Forum: Ruby on Rails Re: pretty-print and cleanse RHTML?

Phlip (Guest)
on 2007-07-22 02:58
(Received via mailing list)
> Suppose someone gave us fresh HTML to import as eRB (.rhtml). Such as from
> an obsolete PHP project. We ought to upgrade, cleanse, and pretty-print
> that HTML like this...
>
>  tidy -i -asxhtml old.html > new.rhtml

Below my sig is a program to temporarily replace <% and %>
with <!--% and %--> and run Tidy. Save it as 'tidyErb.rb', and run it
like this:

usage: ruby tidyErb.rb <filename.rhtml> >output.rhtml
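
The replace-and-restore trick can be tried on its own, without Tidy. A minimal sketch, assuming the helper names hide_erb and restore_erb (hypothetical; the script itself does this inline with gsub):

```ruby
# Hypothetical helper names; the posted script performs the same gsubs inline.

# Disguise eRB tags as HTML comments so an HTML tool leaves them alone.
def hide_erb(rhtml)
  rhtml.gsub('<%', '<!--%').gsub('%>', '%-->')
end

# Turn the disguised comments back into eRB tags.
def restore_erb(html)
  html.gsub('%-->', '%>').gsub('<!--%', '<%')
end

sample = "<p><%= @user.name %></p>\n<% end %>"
hidden = hide_erb(sample)
puts hidden
puts restore_erb(hidden) == sample
```

The round trip only holds if whatever runs in between leaves the <!--% ... %--> markers intact, which is why the script has extra gsubs to undo Tidy's escaping.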

'filename.rhtml' and 'output.rhtml' may not be the same file. The
program overwrites, then deletes, a file called 'scratch.html' in the
current directory, and makes no attempt to avoid clobbering a source
file with the same name...
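
One way around the fixed 'scratch.html' name is Ruby's standard Tempfile library. A rough sketch, not what the posted script does: via_scratch_file is a hypothetical helper, and the block below stands in for the real Tidy call so the example runs without Tidy installed.

```ruby
require 'tempfile'

# Hypothetical helper: write +content+ to a uniquely named temp file, yield
# its path so an external tool can modify the file in place, then return the
# modified contents. Tempfile.create removes the file when the block ends.
def via_scratch_file(content)
  Tempfile.create(['tidyErb', '.html']) do |f|
    path = f.path
    f.write(content)
    f.close
    yield path
    File.read(path)
  end
end

# Stand-in for system('tidy', '-i', '-asxhtml', '-m', path):
# here the "tool" just upcases the file in place.
result = via_scratch_file('<p>hello</p>') do |path|
  File.write(path, File.read(path).upcase)
end
puts result
```

Because the name is generated per run, a source file that happens to be called 'scratch.html' stays safe.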

As a convenience, the program reports diagnostics to STDERR. Obey them
(per assert_tidy) to improve your programs!

Note that Tidy treats <!-- --> as flow-tags, not block-tags. (My
verbiage: <em> is a flow-tag, and <div> is the canonical block-tag. Tidy
line-wraps the former.)

Searching for each <% and moving it to its correct indentation (such as
for <% end %>) is a small price to pay for clean HTML!

Oh, also, review my gsubs to see if they match what Tidy did to your
RHTML's comments, and to any <% %> tags nested inside attributes. If I
return to this project, I will just upgrade Tidy...
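
On the '%20' gsub in particular: as written it rewrites every '%20' in the document, including ones in ordinary URLs. A sketch of the "nested" version the TODO hints at, assuming Tidy's escapes look exactly like the ones the script's gsubs target (unescape_erb_in_attributes is a hypothetical name):

```ruby
# Hypothetical helper, not part of the posted script. It assumes Tidy
# URL-escaped a disguised eRB tag inside an attribute as
# %3C!--% ... %--%3E, with spaces turned into %20.
def unescape_erb_in_attributes(html)
  html.gsub(/%3C!--%.*?%--%3E/m) do |tag|
    # Restore the delimiters, then the spaces, inside this one tag only.
    tag.sub('%3C!--%', '<!--%').sub('%--%3E', '%-->').gsub('%20', ' ')
  end
end

sample = '<a href="%3C!--%=%20url%20%--%3E">a%20b</a>'
puts unescape_erb_in_attributes(sample)
```

Any '%20' outside an escaped tag, like the one in the link text above, is left alone.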

--
  Phlip
  http://www.oreilly.com/catalog/9780596510657/
  "Test Driven Ajax (on Rails)"
  assert_xpath, assert_javascript, & assert_ajax


if ARGV.size != 1 or !File.exist?(filename = ARGV.first)
 #  stdout is usually redirected to output.rhtml, so complain on STDERR
 warn 'usage: ruby tidyErb.rb <filename.rhtml> >output.rhtml'
 exit 1
end

rhtml = File.read(filename)
escaped = rhtml.gsub('<%', '<!--%').gsub('%>', '%-->')
File.open('scratch.html', 'w'){|f| f.write(escaped) }
system('tidy -i -asxhtml -m scratch.html')
html = File.read('scratch.html')
File.unlink('scratch.html')
html.gsub!('&lt;!--', '<!--')  #  undo Tidy's CGI-nanny escapes
html.gsub!('--&gt;', '-->')
html.gsub!('%3C!--%', '<!--%')
html.gsub!('%20', ' ')  #  TODO  nest this gsub?
html.gsub!('%--%3E', '%-->')
rhtml = html.gsub('%-->', '%>').gsub('<!--%', '<%')

puts rhtml