Issue 1399: XML codec - Python tracker

Created on 2007-11-07 17:52 by doerwalter, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description
diff.txt doerwalter,2007-11-07 17:52
diff2.txt doerwalter,2007-11-08 21:25
Messages (9)
msg57211 - Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2007-11-07 17:52
The patch adds an XML codec. It implements encoding detection as specified in http://www.w3.org/TR/2004/REC-xml-20040204/#sec-guessing and supports externally specified encodings for both encoding and decoding.
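For illustration only, the behaviour described here might be sketched roughly as follows. This is not the code from diff.txt or diff2.txt. The helper names `decode_xml` and `encode_xml`, the `default` parameter, and the simplified declaration regex are assumptions made for this sketch; `detect_xml_encoding` borrows the function name suggested later in this thread. The sketch checks for a byte order mark, then for the BOM-less UTF-16/UTF-32 byte patterns of `<?` listed in the referenced section, then for an `encoding` pseudo-attribute in the XML declaration, and otherwise falls back to UTF-8. EBCDIC detection is left out.

```python
import codecs
import re
from typing import Optional

# Byte order marks, UTF-32 first so that its little-endian BOM is not
# mistaken for the UTF-16 one. The returned codec names strip the BOM.
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_BE, "utf-16"),
    (codecs.BOM_UTF16_LE, "utf-16"),
]

# First four bytes of a document without a BOM.
_PATTERNS = [
    (b"\x00\x00\x00\x3c", "utf-32-be"),
    (b"\x3c\x00\x00\x00", "utf-32-le"),
    (b"\x00\x3c\x00\x3f", "utf-16-be"),
    (b"\x3c\x00\x3f\x00", "utf-16-le"),
]

# Simplified (assumed) pattern for the encoding in an XML declaration.
_DECL = re.compile(
    rb'<\?xml[^>]*?\bencoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._\-]*)["\']'
)


def detect_xml_encoding(data: bytes, default: str = "utf-8") -> str:
    """Guess the encoding of an XML byte string."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    for prefix, name in _PATTERNS:
        if data.startswith(prefix):
            return name
    match = _DECL.match(data)  # ASCII-compatible encodings only
    if match:
        return match.group(1).decode("ascii")
    return default


def decode_xml(data: bytes, encoding: Optional[str] = None) -> str:
    """Decode using an externally specified encoding, else the guess."""
    return data.decode(encoding or detect_xml_encoding(data))


def encode_xml(text: str, encoding: Optional[str] = None) -> bytes:
    """Encode using an external encoding, else the declared one."""
    if encoding is None:
        # Reuse the bytes regex on an ASCII view of the text.
        match = _DECL.match(text.encode("ascii", "replace"))
        encoding = match.group(1).decode("ascii") if match else "utf-8"
    return text.encode(encoding)
```

An encoding name that Python does not know raises `LookupError` in `decode_xml` and `encode_xml`; a real implementation would also have to decide what to do when the BOM and the declaration disagree.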
msg57213 - Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2007-11-07 17:53
I think it's good to add this; I don't have time to review though.
msg57221 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2007-11-07 19:43
Nice codec! The only nit I have is the name: "xml" isn't intuitive enough. I had to read the code to figure out what the codec actually does. "xml" used as an encoding name usually refers to having Unicode text converted to ASCII with XML entity escapes for all non-ASCII characters. How about "xml-auto-detect" or something along those lines?
msg57222 - Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2007-11-07 21:42
"xml-auto-detect" sounds OK to me. It even makes sense for the encoder, because it normally detects the encoding to use for writing from the XML declaration. We could put "xml-auto-detect" into the alias mapping and keep xml as the module name. But I noticed I have to rewrap a lot of lines before I check it in.
msg57224 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2007-11-07 21:54
Leaving the module name as "xml" would remove that name from the namespace of possible encodings. "xml" as encoding name is problematic, as many people regard writing data in XML as "encoding the data in XML". I'd simply not use it at all, not even for a codec that converts between Unicode and ASCII+XML entities.
msg57280 - Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2007-11-08 21:25
OK, I've changed the name of the codec to xml_auto_detect and added support for EBCDIC.
msg57281 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2007-11-08 21:37
Thanks, Walter!
msg63696 - Author: Sean Reifschneider (jafo) * (Python committer) Date: 2008-03-17 17:52
Marc-Andre: Is this good to be committed, or does it need to be reviewed further?
msg63703 - Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2008-03-17 18:14
There was resistance on python-dev against this patch (see the thread at http://mail.python.org/pipermail/python-dev/2007-November/075138.html), so this issue should probably be closed as rejected. However, there was consensus that a detect_xml_encoding() function might be useful.
History
Date User Action Args
2022-04-11 14:56:28 admin set github: 45740
2008-03-18 15:14:31 jafo set status: open -> closed; resolution: rejected
2008-03-17 18:14:25 doerwalter set messages: +
2008-03-17 17:52:30 jafo set priority: normal; assignee: lemburg; messages: +; nosy: + jafo
2007-11-08 21:37:27 lemburg set messages: +
2007-11-08 21:25:53 doerwalter set files: + diff2.txt; messages: +
2007-11-07 21:59:56 gvanrossum set nosy: - gvanrossum
2007-11-07 21:54:18 lemburg set messages: +
2007-11-07 21:42:20 doerwalter set messages: +
2007-11-07 19:43:06 lemburg set nosy: + lemburg; messages: +
2007-11-07 17:53:57 gvanrossum set nosy: + gvanrossum; messages: +
2007-11-07 17:52:18 doerwalter create