I have some XML that looks like the following, except that the real
files are very much larger (some are up to 2GB):
<server_url>http://myserver.edu/data/</server_url>
<server_name>myserver.edu</server_name>
<uploads>
  <result>
    <dir>/storage/data/results/</dir>
    <result_name>hadcm3l_00012_00000118_0</result_name>
    <file_info>
      <name>hadcm3l_00012_00000118_0_6.zip</name>
      <nbytes>5154055</nbytes>
      <md5_checksum>485600296bb601ab4a3d1d49a9fb1c86</md5_checksum>
    </file_info>
    <file_info>
      <name>hadcm3l_00012_00000118_0_7.zip</name>
      <nbytes>5153055</nbytes>
      <md5_checksum>36a600296cb60229a3d1d49a9fb1a10</md5_checksum>
    </file_info>
  </result>
</uploads>
I’ve tried a few XML parsers, such as xml-simple, libxml and quixml, but
all of them reject this data as not well-formed. One answer would, of
course, be for the data to be regenerated as well-formed XML. In the
meantime, is there anything that can be done with the existing files?
Does it come down to writing regexps to parse this sort of thing?
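
One possible stopgap, sketched here in Python rather than with any of the parsers named above: the sample has several top-level elements (server_url, server_name, uploads) and no single enclosing root, which on its own is enough for a conforming parser to reject it. Assuming that is the only problem, and that the files carry no XML declaration or DOCTYPE at the top, the data can be fed to an incremental parser between a synthetic opening and closing tag, so a 2GB file never has to be loaded whole. The names `iter_results`, `_drain`, `SAMPLE` and the wrapper tag `root` are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET


def _drain(parser):
    """Yield one dict per completed <result> element seen so far."""
    for _event, elem in parser.read_events():
        if elem.tag == "result":
            yield {
                "dir": elem.findtext("dir"),
                "result_name": elem.findtext("result_name"),
                "files": [
                    (
                        f.findtext("name"),
                        int(f.findtext("nbytes")),
                        f.findtext("md5_checksum"),
                    )
                    for f in elem.iter("file_info")
                ],
            }
            # Drop the children to keep memory bounded; the emptied
            # <result> shell itself stays attached to its parent.
            elem.clear()


def iter_results(stream, chunk_size=65536):
    """Parse a rootless fragment from a binary stream, chunk by chunk."""
    parser = ET.XMLPullParser(events=("end",))
    parser.feed(b"<root>")  # synthetic root; assumes no XML declaration
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        parser.feed(chunk)
        yield from _drain(parser)
    parser.feed(b"</root>")
    parser.close()
    yield from _drain(parser)


# A cut-down copy of the sample, with a single file_info entry.
SAMPLE = (
    b"<server_url>http://myserver.edu/data/</server_url>"
    b"<server_name>myserver.edu</server_name>"
    b"<uploads><result>"
    b"<dir>/storage/data/results/</dir>"
    b"<result_name>hadcm3l_00012_00000118_0</result_name>"
    b"<file_info><name>hadcm3l_00012_00000118_0_6.zip</name>"
    b"<nbytes>5154055</nbytes>"
    b"<md5_checksum>485600296bb601ab4a3d1d49a9fb1c86</md5_checksum>"
    b"</file_info></result></uploads>"
)

for result in iter_results(io.BytesIO(SAMPLE)):
    print(result["result_name"], len(result["files"]))
```

For a real file, the stream would be something like `open(path, "rb")` in place of the `io.BytesIO` object. If the files turn out to be malformed in other ways too (unclosed tags, unescaped `&`), this wrapper will not help, and the parser's error message should at least point at the first offending position.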