Hello all.
Uniforma is available at http://uniforma.rubyforge.org
== DESCRIPTION:
Uniforma is a library for parsing text formats and converting them to other
formats. The main design goal was to make defining a text parser or generator
an easy task, with pretty DSLs for Parser and Generator.
For now, the distribution includes only a Textile parser and an HTML
generator, but it should be really easy to extend.
== USAGE:
Uniforma parses any text string into a set of fragments (a DOM structure)
and then lets you call any generator on those fragments.
Basic usage looks like this:
  doc = Uniforma::textile("some bold text") #=> Uniforma::Dom::Document
  doc.to_html #=> HTML string for "some bold text"
or:
  doc = Uniforma::textile_string("some bold text") #=> Array of Uniforma::Dom::Fragment
  doc.to_html #=> "some bold text"
So every defined parser adds two methods to the Uniforma module:
  Uniforma::<parser_name>(text) #=> Uniforma::Dom::Document
  Uniforma::<parser_name>_string(text) #=> Array of Uniforma::Dom::Fragment
And every defined generator adds one method to both Document and Array:
  Document#to_<generator_name> #=> string
  Array#to_<generator_name> #=> string
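The naming convention above can be sketched in plain Ruby. This is only an illustration of how such methods could be defined dynamically, not Uniforma's actual implementation: MiniForma, register_parser and register_generator are hypothetical names, and the sketch adds the generator method to Document only (not to Array).

```ruby
# Hypothetical sketch: none of these names come from Uniforma itself.
module MiniForma
  Fragment = Struct.new(:type, :text)

  class Document
    attr_reader :fragments

    def initialize(fragments)
      @fragments = fragments
    end
  end

  # Registering a parser defines MiniForma.<name> and MiniForma.<name>_string.
  def self.register_parser(name, &parse)
    define_singleton_method("#{name}_string") { |text| parse.call(text) }
    define_singleton_method(name) { |text| Document.new(parse.call(text)) }
  end

  # Registering a generator defines Document#to_<name>.
  def self.register_generator(name, &render)
    Document.send(:define_method, "to_#{name}") { render.call(fragments) }
  end
end

# A toy "parser": every line becomes a :para fragment.
MiniForma.register_parser(:plain) do |text|
  text.split("\n").map { |line| MiniForma::Fragment.new(:para, line) }
end

# A toy "generator": upcase the text of every fragment.
MiniForma.register_generator(:upcase) do |fragments|
  fragments.map { |f| f.text.upcase }.join("\n")
end

doc = MiniForma.plain("one\ntwo")
puts doc.to_upcase
```

The point of the design is that parsers and generators stay independent: any registered parser can feed any registered generator through the shared fragment structure.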
Generating methods also allow some on-the-fly “editing”:
  Uniforma.textile(some_text).to_html do
    # rewrite relative links (no ":") to absolute ones
    rewrite :link, :href, %r{^[^:]+$} do |url|
      'http://site.com/' + url
    end
  end
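To see what the pattern in that rewrite rule selects, here is a small standalone sketch. It uses the same regexp, but absolutize is a hypothetical helper written for this illustration and is not part of Uniforma.

```ruby
# The pattern from the example: a URL with no ":" anywhere is treated as relative.
RELATIVE = %r{^[^:]+$}

# Hypothetical helper, not part of Uniforma: prefix relative URLs with a base,
# leave everything else untouched.
def absolutize(url, base = 'http://site.com/')
  url =~ RELATIVE ? base + url : url
end

puts absolutize('about.html')          # relative, gets the base prepended
puts absolutize('http://example.org/') # contains ":", left as is
puts absolutize('mailto:me@site.com')  # contains ":", left as is
```

Note that the test is purely textual: anything containing a colon, including a mailto: address, counts as absolute.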
DOM navigation methods (like those in
Hpricot[http://code.whytheluckystiff.net/hpricot/]) are also planned, but
they are not ready yet.
== PARSERS
Parsers for complex formats should descend from the Uniforma::Parser class.
They should use the Parser#paragraph and Parser#fragment methods and their
shortcuts. See Parser for details.
Parsers for simple line-based text formats (wiki-like ones, where almost
every line starts a new paragraph of some type) can inherit from
Uniforma::LineParser, which provides a very easy DSL for defining how to
parse each line and the formatted fragments inside lines. Here's a small
example:
  module Uniforma::Parsers
    class Example < LineParser
      definition do
        # == Line parsing rules
        # header with level defined by quantity of '=' symbols
        line %r{^(=+)\s*(.+)$} do |prefix, text| header(prefix.length, text) end
        # plain paragraph
        line %r{^(.+)$} do |text| para(text) end
        # plain paragraph continued
        line_after :para, %r{^(.+)$} do |txt| text(txt) end
        # empty line - sign of "no paragraph"
        line '' do nop end
        # == Inline formatting parsing rules
        inline %r{\*(.+?)\*} do |text| fmt(:bold, text) end
        #...blah...
      end
    end
  end
See LineParser for details.
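For readers who want a feel for how a rule list like this can drive parsing, here is a minimal standalone sketch. It is not Uniforma's LineParser: TinyLineParser is a hypothetical class, it returns plain arrays instead of DOM fragments, and it assumes rules are tried in definition order with the first match winning.

```ruby
# Hypothetical sketch of first-match line dispatch; not Uniforma's LineParser.
class TinyLineParser
  def initialize
    @rules = []
  end

  # Register a rule: the block receives the regexp captures of a matching line.
  def line(pattern, &handler)
    @rules << [pattern, handler]
    self
  end

  # Parse the text line by line, dropping lines whose handler returns nil.
  def parse(text)
    text.each_line(chomp: true).map { |ln| parse_line(ln) }.compact
  end

  private

  # Try the rules in definition order; the first matching rule handles the line.
  def parse_line(ln)
    @rules.each do |pattern, handler|
      match = pattern.match(ln)
      return handler.call(*match.captures) if match
    end
    nil
  end
end

parser = TinyLineParser.new
parser.line(/^(=+)\s*(.+)$/) { |prefix, text| [:header, prefix.length, text] }
parser.line(/^(.+)$/)        { |text| [:para, text] }
parser.line(/^$/)            { nil } # empty line: produces nothing

p parser.parse("== Title\nplain text\n\nmore")
```

Because the header rule is registered before the catch-all paragraph rule, a line starting with "=" never reaches the paragraph handler. Under this assumption, rule order is part of the grammar.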
== GENERATORS
Uniforma::Generators::Generator is the base class for all generators. It
defines a simple DSL for all actions: basically, what to do before some DOM
fragment, after it, and with textual data, plus a few more complex
handlers.
For generators whose output is just a text string, there is the
TextGenerator base class. Some example definitions:
  module Uniforma::Generators
    class Example < TextGenerator
      definition do
        # "before" as block handler
        before :header do |fragment| "<h%i>" % fragment.level end
        # "after" as block handler
        after :header do |fragment| "</h%i>" % fragment.level end
        # before/after shortcuts
        before :quote, "<blockquote>"
        after :quote, "</blockquote>\n"
        # composite shortcut
        around :para, "<p>", "</p>\n"
        # replace some fragment (its text will not be printed)
        replace :img do |fragment| "<img src='%s'/>" % fragment.text end
        # some textual conversions on content - parameters syntax is like gsub
        text_conversion /&/, "&amp;"
        text_conversion /</, "&lt;"
        text_conversion />/, "&gt;"
        # on-the-fly change of some fragment attributes
        rewrite :link, :href, /^\w+$/ do |url|
          "Wikipedia, the free encyclopedia" + url
        end
      end
    end
  end
See TextGenerator for details.
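The before/after/around idea can be illustrated with a small standalone sketch. This is not Uniforma's TextGenerator: TinyTextGenerator and GenNode are hypothetical names, the input is a flat list of nodes rather than a DOM tree, and only the string and block forms of the handlers are modelled.

```ruby
# Hypothetical sketch of before/after handlers; not Uniforma's TextGenerator.
GenNode = Struct.new(:type, :text, :level)

class TinyTextGenerator
  def initialize
    @before = {}
    @after = {}
    @conversions = []
  end

  # A handler is either a fixed string or a block receiving the node.
  def before(type, string = nil, &block)
    @before[type] = block || ->(_node) { string }
  end

  def after(type, string = nil, &block)
    @after[type] = block || ->(_node) { string }
  end

  # Shortcut for a before/after pair of fixed strings.
  def around(type, open, close)
    before(type, open)
    after(type, close)
  end

  # Same argument order as String#gsub; applied in registration order.
  def text_conversion(pattern, replacement)
    @conversions << [pattern, replacement]
  end

  def generate(nodes)
    nodes.map { |node| emit(@before, node) + convert(node.text) + emit(@after, node) }.join
  end

  private

  def emit(handlers, node)
    handler = handlers[node.type]
    handler ? handler.call(node).to_s : ''
  end

  def convert(text)
    @conversions.reduce(text) { |acc, (pattern, replacement)| acc.gsub(pattern, replacement) }
  end
end

gen = TinyTextGenerator.new
gen.before(:header) { |node| '<h%i>' % node.level }
gen.after(:header)  { |node| "</h%i>\n" % node.level }
gen.around(:para, '<p>', "</p>\n")
gen.text_conversion(/&/, '&amp;')
gen.text_conversion(/</, '&lt;')

print gen.generate([GenNode.new(:header, 'Q & A', 1), GenNode.new(:para, 'a < b', nil)])
```

In this sketch the order of the conversions matters: escaping "&" has to come first, otherwise the ampersands produced by the later entity replacements would be escaped a second time.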
== CHANGES SINCE 0.0.1
- The library has become much more mature: clean code, specs, some docs, an
  almost complete parser for Textile and a generator for HTML.
- I've decided to remove the temporary, incomplete parsers and generators.
  This means only one conversion is available now: Textile=>HTML. IT DOESN'T
  MEAN I want to "narrow" the library's goal. Extending the list of available
  parsers and generators is a high-priority task for the next versions.
V.