Sanitizing Characters In Pasted Text


#1

Because Microsoft Word changes straight quotes to curly quotes, double
dashes to em dashes, etc., text copied from Word and pasted into a
textarea can contain characters that are not rendered the same in all
browsers. Put more bluntly, when users paste from Word, it looks like
cr@p on my Mac.

I've seen this question asked before, but never seen the answer. Is
there any translation table between Word's special characters and HTML
named entities?

Thanks


#2

Hi Steve,

There’s a Perl script called “demoronizer” that does something like
this - perhaps you could use it as a reference:
http://vsbabu.org/mt/archives/2002/11/03/demoronizer.html
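
For illustration, here is a rough sketch of the kind of translation
table being asked about, written in Python rather than Perl. It is not
taken from the demoronizer script. The names WORD_TO_ENTITY and
sanitize_pasted_text are made up for this example, the table covers
only a handful of common characters, and it assumes the pasted text has
already been decoded to Unicode on the server side.

```python
# Hypothetical table: common Word "smart" characters mapped to
# standard HTML named entities. Not exhaustive; extend as needed.
WORD_TO_ENTITY = {
    "\u2018": "&lsquo;",   # left single quote
    "\u2019": "&rsquo;",   # right single quote / apostrophe
    "\u201c": "&ldquo;",   # left double quote
    "\u201d": "&rdquo;",   # right double quote
    "\u2013": "&ndash;",   # en dash
    "\u2014": "&mdash;",   # em dash
    "\u2026": "&hellip;",  # ellipsis
    "\u2022": "&bull;",    # bullet
    "\u2122": "&trade;",   # trademark sign
}

# Build the lookup once; str.translate accepts a code point -> string map.
_TABLE = str.maketrans(WORD_TO_ENTITY)


def sanitize_pasted_text(text: str) -> str:
    """Replace each listed character with its HTML named entity."""
    return text.translate(_TABLE)
```

Calling sanitize_pasted_text() on the submitted textarea value before
storing or echoing it leaves plain ASCII untouched and swaps only the
characters listed in the table.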

Cheers!

-DF