Forum: Redcloth textile...

Gaspard Bucher (gazoduc)
on 2009-06-17 22:52
(Received via mailing list)
Hi list !

From the tests Jason provided in his Treetop experiment, I started
to get the feeling that it can be very hard to write a token-based
parser for Textile, since sometimes you need to know all the characters
of the current line to know whether "**" actually starts a bold sentence.
For example:

This **is not a bold sentence, even if you would think it is, it's not
**. Believe me.
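
To make the lookahead problem concrete, here is a minimal Python sketch.
It assumes a simplified rule that is not taken from the Textile spec: "**"
opens bold only if it is followed by a non-space character and a later
"**" is directly preceded by a non-space character. The names BOLD and
bold_spans are made up for this illustration.

```python
import re

# Assumed, simplified rule (not the full Textile spec): the opening "**"
# must be followed by a non-space character, and the closing "**" must be
# preceded by a non-space character.
BOLD = re.compile(r"\*\*(?=\S)(.+?)(?<=\S)\*\*")

def bold_spans(line):
    """Return the text of each bold span found in a single line."""
    return [m.group(1) for m in BOLD.finditer(line)]

not_bold = ("This **is not a bold sentence, even if you would think it is, "
            "it's not **. Believe me.")
is_bold = "This **is a bold sentence**. Believe me."

print(bold_spans(not_bold))  # []
print(bold_spans(is_bold))   # ['is a bold sentence']
```

A tokenizer that emits a "bold start" token as soon as it sees the first
"**" would get the first line wrong, because the decision depends on text
that comes much later.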

I had a look at the original PHP implementation of Textile, and it
actually runs tons of very complicated regexps. Here is an extract that
parses inline elements (bold, em, ...): http://gist.github.com/131486.

All this to say that I think I will abandon my "move forward" parser
solution and try another route, the "split" parser:

1. split text into paragraphs/tables
2. split paragraphs into inline elements (loop until no more splits)
3. split inline elements into links, etc.
4. continue splitting and replacing

This is the fastest way I can imagine to elegantly parse something like
Textile.
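
The split approach outlined above could be sketched roughly as follows.
This is only an illustration in Python, not the prototype itself: the two
inline rules, the node format (plain strings and (tag, children) tuples)
and all function names are assumptions, and tables, links and HTML
escaping are left out.

```python
import re

# Hypothetical, heavily simplified rules: (tag, pattern) pairs tried in order.
# Each pattern has exactly one capturing group holding the element's content.
INLINE_RULES = [
    ("b", re.compile(r"\*\*(?=\S)(.+?)(?<=\S)\*\*")),
    ("em", re.compile(r"_(?=\S)(.+?)(?<=\S)_")),
]

def split_paragraphs(text):
    """Step 1: split raw text into paragraphs on blank lines."""
    blocks = re.split(r"\n\s*\n", text)
    return [" ".join(b.split()) for b in blocks if b.strip()]

def split_inline(text):
    """Steps 2-4: split on the first rule that matches, then recurse into
    every piece until no rule matches any more."""
    for tag, pattern in INLINE_RULES:
        m = pattern.search(text)
        if m:
            before, inner, after = text[:m.start()], m.group(1), text[m.end():]
            return (split_inline(before)
                    + [(tag, split_inline(inner))]
                    + split_inline(after))
    return [text] if text else []

def render(nodes):
    """Turn the nested split result back into HTML."""
    out = []
    for node in nodes:
        if isinstance(node, str):
            out.append(node)
        else:
            tag, children = node
            out.append("<%s>%s</%s>" % (tag, render(children), tag))
    return "".join(out)

def to_html(text):
    return "\n".join("<p>%s</p>" % render(split_inline(p))
                     for p in split_paragraphs(text))

sample = "This **is _really_ bold**.\n\nThis **is not bold, it's not\n**. Believe me."
print(to_html(sample))
# <p>This <b>is <em>really</em> bold</b>.</p>
# <p>This **is not bold, it's not **. Believe me.</p>
```

Because every split works on a whole paragraph or a whole inline piece,
each rule can see all the characters it needs before deciding, which is
what the token-by-token approach makes hard.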

I'll let you know when I have a prototype...

Gaspard

PS: Forget about my other message on word processing. I had not
understood that those specs were only related to some internal
word parser.