[Python-Dev] XML codec?

Walter Dörwald walter at livinglogic.de
Fri Nov 9 14:41:37 CET 2007


Walter Dörwald wrote:

Martin v. Löwis wrote:

Yes, an XML parser should be able to use UTF-8, UTF-16, UTF-32, etc codecs to do the encoding. There's no need to create a magical mystery codec to pick out which though.

So the code is good, if it is inside an XML parser, and it's bad if it is inside a codec?

Exactly so. This functionality just isn't a codec - there is no encoding. Instead, it is an algorithm for detecting an encoding.

And what do you do once you've detected the encoding? You decode the input, so why not combine both into an XML decoder?
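To make the "detect, then decode" argument concrete, here is a minimal sketch of the two steps combined into one function. It is not the code discussed in this thread: `detect_xml_encoding` and `decode_xml` are hypothetical names, the sketch assumes Python 3, and it only handles input that either starts with a byte order mark or uses an ASCII-compatible encoding named in the XML declaration.

```python
import codecs
import re

# Hypothetical helpers for illustration only; not part of the standard library.
# UTF-32 BOMs are checked first because the UTF-32-LE BOM begins with the
# UTF-16-LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]

# Matches encoding="..." inside a leading <?xml ... ?> declaration.
_DECL = re.compile(
    rb"<\?xml[^>]*?encoding\s*=\s*['\"]([A-Za-z][A-Za-z0-9._-]*)['\"]"
)


def detect_xml_encoding(data: bytes) -> str:
    """Step 1: guess the encoding from a BOM or the XML declaration."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    match = _DECL.match(data)
    if match:
        return match.group(1).decode("ascii")
    return "utf-8"  # assumed fallback when nothing is declared


def decode_xml(data: bytes) -> str:
    """Step 2: decode the input with whatever step 1 found."""
    return data.decode(detect_xml_encoding(data))
```

Calling `decode_xml(raw_bytes)` then behaves like a single decoding step from the caller's point of view, which is the combination being argued for here.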

In fact, we already have such a codec. The utf-16 decoder looks at the first two bytes and then decides to forward the rest to either a utf-16-be or a utf-16-le decoder.
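The utf-16 behaviour can be checked with a short snippet. This is only an illustration, written for Python 3 (an assumption; the thread predates it), and the sample text is made up.

```python
import codecs

text = "<?xml version='1.0'?><a/>"

# The same text in both byte orders, each prefixed with the matching BOM.
little = codecs.BOM_UTF16_LE + text.encode("utf-16-le")
big = codecs.BOM_UTF16_BE + text.encode("utf-16-be")

# The generic "utf-16" decoder inspects the BOM, picks the byte order,
# and strips the BOM from the result.
assert little.decode("utf-16") == text
assert big.decode("utf-16") == text

# The endian-specific decoders do no detection, so the BOM survives
# in the output as U+FEFF.
assert little.decode("utf-16-le") == "\ufeff" + text
```

So the detection step already lives inside a codec: the caller names "utf-16" and never has to look at the first two bytes.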

Servus, Walter


