HTML entities in ClothRed: yay or nay?

In preparation for release 0.3.0 (hopefully later today ;), I’m
wondering: should I translate HTML entities into a human-readable format?
My hunch is yes, as that’s the point of Textile.

So, I’m at a bit of a loss: I’ve never, ever worked with character
encodings (I don’t even know how to check the encoding on Linux or
Windows).

So, my question is, how do I replace the HTML entities with ISO-8859-1
characters?

The trouble is that the entities don’t seem to be derived from UTF-8 or
anything else that I could simply unescape. Or can I?

Meanwhile, I’m digging through the RDoc documentation. Hopefully, I can
find something there.


Phillip “CynicalRyan” Gawlowski
http://cynicalryan.110mb.com/
http://clothred.rubyforge.org

Rule of Open-Source Programming #13:

Your first release can always be improved upon.

It seems like you need HTMLEntities
(http://htmlentities.rubyforge.org/), which will add a dependency to
your distribution (but that is better than repeating work others have
already done).

I think that dependency could be made optional: ClothRed could throw
an exception when asked to decode HTML entities if it cannot find
that module. I don’t know if this is acceptable to you.
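
A rough sketch of how such an optional dependency might look. The names
MissingDependencyError and decode_entities are made up for illustration;
they are not part of ClothRed or HTMLEntities.

```ruby
# Try to load the optional dependency and remember whether it worked.
begin
  require 'htmlentities'
  HAVE_HTMLENTITIES = true
rescue LoadError
  HAVE_HTMLENTITIES = false
end

# Hypothetical error class, not part of ClothRed.
class MissingDependencyError < StandardError; end

# Hypothetical helper: decodes entities, or raises if the gem is missing.
def decode_entities(text)
  unless HAVE_HTMLENTITIES
    raise MissingDependencyError,
          "install the htmlentities gem to decode HTML entities"
  end
  HTMLEntities.new.decode(text)
end
```

Callers that never ask for entity decoding would not notice whether the
gem is installed.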

From the docs, code like

 require 'htmlentities'
 coder = HTMLEntities.new
 string = "&eacute;lan"
 coder.decode(string) # => "élan"

converts your HTML with entities into UTF-8 characters, if I understood
correctly.
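
If you really want ISO-8859-1 rather than UTF-8, the decoded string could
be transcoded afterwards. This is only a sketch: it starts from a
hard-coded UTF-8 string standing in for the decoder's output, and it
relies on String#encode, which needs Ruby 1.9 or later (on 1.8 the Iconv
library did this job).

```ruby
# Assumed input: the UTF-8 string that decoding "&eacute;lan" would give.
utf8 = "\u00E9lan"

# Transcode to ISO-8859-1; e-acute becomes the single byte 0xE9.
latin1 = utf8.encode("ISO-8859-1")

# Characters with no ISO-8859-1 equivalent raise
# Encoding::UndefinedConversionError unless a replacement is given.
# U+2014 is a dash that ISO-8859-1 cannot represent.
dash = "a\u2014b".encode("ISO-8859-1", undef: :replace, replace: "-")
```

The replacement character "-" is an arbitrary choice; dropping or
transliterating such characters would be equally valid.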

Cheers,
Adriano F…

All HTML can be coded in ASCII, as can XML and XHTML.
However, that covers only the markup itself.
Do not forgo the encoding.
Convert everything to UTF-8.
There is no reason to use anything else.
Visit:
http://www.unicode.org/charts/

http://www.unicode.org/
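
For text that is still in ISO-8859-1, the conversion to UTF-8 recommended
above could be sketched like this. The sample bytes are made up, and
String#encode assumes Ruby 1.9 or later.

```ruby
# Bytes as they might arrive from an ISO-8859-1 source (0xE9 is e-acute).
latin1 = "\xE9lan".b.force_encoding("ISO-8859-1")

# Transcode to UTF-8; e-acute becomes the two bytes 0xC3 0xA9.
utf8 = latin1.encode("UTF-8")

# Checking what you have: the encoding tag and whether the bytes are valid.
tag   = utf8.encoding
valid = utf8.valid_encoding?
```

Note that force_encoding only relabels the bytes, while encode actually
rewrites them.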