If you get errors complaining of undefined entities like when parsing xhtml, it means you need to install the DTD for xhtml 1.0 or 1.1.

Example of a doctype for xhtml 1.1:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

You want to install the DTDs locally following the model in /etc/xml. If you don't, libxml will fetch the DTD from www.w3.org each time you parse a document.

Needing to install these DTDs was not obvious to me and should be part of the documentation.

There is an rpm for xhtml 1.0, "xhtml1-dtds-1.0-7". I couldn't find one for xhtml 1.1, so I downloaded it piecemeal from w3.org.

Installing the DTD does not automatically turn on validation. If you want to validate, you need to turn it on:
XML::Parser::default_validity_checking = TRUE

XML::Parser::default_load_external_dtd controls the loading of the 'external subset' (the definition for the character entities like &). It defaults to TRUE.

XML::Parser::default_load_external_dtd is broken. This fixes it.

Index: ruby_xml_parser.c
==========================================================
RCS file: /var/cvs/xml-tools/libxml-ruby/ruby_xml_parser.c,v
retrieving revision 220.127.116.11
diff -r18.104.22.168 ruby_xml_parser.c
274c274
< if (xmlSubstituteEntitiesDefaultValue)
---
> if (xmlLoadExtDtdDefaultValue)
916c916
< ruby_xml_parser_default_load_external_dtd_set, 0);
---
> ruby_xml_parser_default_load_external_dtd_get, 0);
918c918
< ruby_xml_parser_default_load_external_dtd_get, 1);
---
> ruby_xml_parser_default_load_external_dtd_set, 1);

Sam's patches for libxml are also needed:
http://www.intertwingly.net/blog/2005/11/05/Patch-...
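For anyone wondering what "following the model in /etc/xml" looks like in practice: libxml2 resolves public identifiers through XML catalog files, so the local install comes down to a catalog entry that maps the doctype's public ID to a file on disk. Below is a rough sketch in plain Ruby (no libxml needed) that pulls the identifiers out of a document's DOCTYPE and prints the matching <public> catalog element. The directory in LOCAL_DTD_DIR and the helper names doctype_ids and catalog_entry are made up for illustration, so adjust them to wherever you actually unpacked the DTD.

```ruby
# Sketch only: derive an OASIS XML catalog entry from a document's DOCTYPE.
# LOCAL_DTD_DIR is a hypothetical install location, not a standard path.
LOCAL_DTD_DIR = "/usr/share/xml/xhtml11"

# Matches <!DOCTYPE name PUBLIC "public-id" "system-id">
DOCTYPE_RE = /<!DOCTYPE\s+\S+\s+PUBLIC\s+"([^"]+)"\s+"([^"]+)"\s*>/m

# Returns [public_id, system_id], or nil when no PUBLIC doctype is present.
def doctype_ids(xml)
  m = DOCTYPE_RE.match(xml)
  m && [m[1], m[2]]
end

# Builds a <public> element mapping the public identifier to a local
# copy of the DTD, named after the last path segment of the system ID.
def catalog_entry(public_id, system_id, dir = LOCAL_DTD_DIR)
  local = File.join(dir, File.basename(system_id))
  %(<public publicId="#{public_id}" uri="file://#{local}"/>)
end

doc = <<~XML
  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
    "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
  <html xmlns="http://www.w3.org/1999/xhtml"/>
XML

public_id, system_id = doctype_ids(doc)
puts catalog_entry(public_id, system_id)
```

The printed element belongs inside a <catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"> root in a catalog file that libxml2 consults. Note that xhtml11.dtd pulls in a number of module files, so all of them need to be present locally too.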
on 2005-12-17 00:19
on 2005-12-17 00:42
Jon Smirl wrote:
> If you get errors complaining of undefined entities like when
> parsing xhtml it means you need to install the DTD for xhtml 1.0 or
> 1.1.
>
> Example of a doctype for xhtml 1.1:
> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
> "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
>
> <snip explanation & code due to ruby-forum.com />
>
> Sam's patches for libxml are also needed:
> http://www.intertwingly.net/blog/2005/11/05/Patch-...

Thank you for this!

E

--
This document is NOT valid XHTML 1.0!
on 2005-12-17 01:13
On Fri, 16 Dec 2005 23:18:54 -0000, Jon Smirl <firstname.lastname@example.org> wrote:
> If you get errors complaining of undefined entities like when
> parsing xhtml it means you need to install the DTD for xhtml 1.0 or
> 1.1.
>

Thanks for that. I've been gathering up problems and patches in a quiet sort of way, but I'm not sure at the moment what's happening with the project. I'm planning to get proactive this week and see if we can at least get these issues sorted and the patches I have (including yours and Sam's) in.

Thanks,
Ross