[Python-Dev] Elementtree and Namespaces in 2.5

Chris S chrisspen at gmail.com
Fri Aug 11 17:15:25 CEST 2006


I'm happy to see Elementtree being considered for inclusion with 2.5. However, before committing to this decision, there's an issue regarding its namespace parsing that should be addressed. Although Elementtree is in most respects an excellent XML parser, a huge gotcha that often makes it unsuitable for many applications is the way it arbitrarily renames namespace prefixes.

For example, given:

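<h:html xmlns:xdc="http://www.xml.com/books"
        xmlns:h="http://www.w3.org/HTML/1998/html4">
 <h:head><h:title>Book Review</h:title></h:head>
 <h:body>
  <xdc:bookreview>
   <xdc:title>XML: A Primer</xdc:title>
   <h:table>
    <h:tr align="center">
     <h:td>Author</h:td><h:td>Price</h:td>
     <h:td>Pages</h:td><h:td>Date</h:td></h:tr>
    <h:tr align="left">
     <h:td><xdc:author>Simon St. Laurent</xdc:author></h:td>
     <h:td><xdc:price>31.98</xdc:price></h:td>
     <h:td><xdc:pages>352</xdc:pages></h:td>
     <h:td><xdc:date>1998/01</xdc:date></h:td>
    </h:tr>
   </h:table>
  </xdc:bookreview>
 </h:body>
</h:html>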

Elementtree would rewrite this as:

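<ns0:html xmlns:ns0="http://www.w3.org/HTML/1998/html4">
 <ns0:head><ns0:title>Book Review</ns0:title></ns0:head>
 <ns0:body>
  <ns1:bookreview xmlns:ns1="http://www.xml.com/books">
   <ns1:title>XML: A Primer</ns1:title>
   <ns0:table>
    <ns0:tr align="center">
     <ns0:td>Author</ns0:td><ns0:td>Price</ns0:td>
     <ns0:td>Pages</ns0:td><ns0:td>Date</ns0:td></ns0:tr>
    <ns0:tr align="left">
     <ns0:td><ns1:author>Simon St. Laurent</ns1:author></ns0:td>
     <ns0:td><ns1:price>31.98</ns1:price></ns0:td>
     <ns0:td><ns1:pages>352</ns1:pages></ns0:td>
     <ns0:td><ns1:date>1998/01</ns1:date></ns0:td>
    </ns0:tr>
   </ns0:table>
  </ns1:bookreview>
 </ns0:body>
</ns0:html>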
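The renaming can be reproduced with a few lines. This is only a minimal sketch: it assumes the stdlib module name `xml.etree.ElementTree` and a Python 3 interpreter, and the shortened document in `SOURCE` is a hypothetical cut-down version of the example above. The exact generated prefixes, and where the `xmlns` declarations are emitted, may differ between versions.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened version of the book-review document.
SOURCE = (
    '<h:html xmlns:xdc="http://www.xml.com/books" '
    'xmlns:h="http://www.w3.org/HTML/1998/html4">'
    '<h:body><xdc:bookreview>'
    '<xdc:title>XML: A Primer</xdc:title>'
    '</xdc:bookreview></h:body></h:html>'
)

root = ET.fromstring(SOURCE)

# After parsing, tags carry the namespace URI in {uri}local form;
# the original prefix ("h", "xdc") is not stored on the element.
print(root.tag)  # {http://www.w3.org/HTML/1998/html4}html

# On output, prefixes are generated afresh, so "h" and "xdc" are gone.
output = ET.tostring(root, encoding="unicode")
print(output)
```

The point of the sketch is that the prefix is dropped at parse time, so the serializer has nothing to go on except the URI.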

There's been some discussion in comp.lang.python about this behavior (http://groups.google.com/group/comp.lang.python/browse_thread/thread/31b2e9f4a8f7338c/363f46513fb8de04?&rnum=3&hl=en), and while most users and the W3C spec (http://www.w3.org/TR/2001/REC-xml-c14n-20010315#NoNSPrefixRewriting) agree that this feature is actually a bug, Fredrik Lundh has refused to fix it. Of course, this is his right. Unfortunately, Elementtree's design makes a work-around rather awkward. Therefore, we might want to rethink inclusion of Elementtree in the stdlib, or at least patch the stdlib's version of Elementtree to produce output more in line with the W3C standard.
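To give an idea of what a work-around can look like, here is a rough sketch. It relies on `register_namespace`, which was only added to the stdlib's ElementTree in later Python releases and is not part of the version under discussion, together with the `start-ns` event of `iterparse`. The helper name `parse_keeping_prefixes` and the shortened `SOURCE` document are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical, shortened version of the book-review document.
SOURCE = (
    '<h:html xmlns:xdc="http://www.xml.com/books" '
    'xmlns:h="http://www.w3.org/HTML/1998/html4">'
    '<h:body><xdc:bookreview>'
    '<xdc:title>XML: A Primer</xdc:title>'
    '</xdc:bookreview></h:body></h:html>'
)

def parse_keeping_prefixes(xml_text):
    """Parse xml_text and register the prefixes it declares (hypothetical helper)."""
    root = None
    stream = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(stream, events=("start", "start-ns")):
        if event == "start-ns":
            # payload is a (prefix, uri) pair taken from an xmlns declaration.
            prefix, uri = payload
            ET.register_namespace(prefix, uri)
        elif root is None:
            # The first "start" event belongs to the root element; the tree
            # under it is complete once the loop has consumed all events.
            root = payload
    return root

root = parse_keeping_prefixes(SOURCE)
out = ET.tostring(root, encoding="unicode")
print(out)
```

This keeps the awkwardness visible: the registry is global to the process and maps each URI to a single prefix, so a document that binds two prefixes to one URI, or rebinds a prefix partway through, still will not round-trip exactly. `register_namespace` also rejects prefixes of the form `ns` followed by digits, which the sketch does not handle.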

Sincerely,
Chris Spencer


