Issue 1470548: xml.sax.saxutils.XMLGenerator cannot output UTF-16

Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: serhiy.storchaka
Nosy List: Arfrever, BreamoreBoy, benjamin.peterson, doerwalter, georg.brandl, larry, loewis, neoecos, ngrig, pitrou, python-dev, serhiy.storchaka
Priority: release blocker
Keywords: needs review, patch

Created on 2006-04-14 20:21 by ngrig, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description
saxutils.diff ngrig, 2006-04-14 20:21 Patch for bug #1470540
XMLGenerator.patch serhiy.storchaka, 2012-05-30 07:57
XMLGenerator-2.patch serhiy.storchaka, 2012-06-15 07:20
XMLGenerator-3.patch serhiy.storchaka, 2012-07-15 07:08
XMLGenerator-4.patch serhiy.storchaka, 2013-01-14 13:35
XMLGenerator-5.patch serhiy.storchaka, 2013-01-20 15:32
XMLGenerator_fragment-2.7.patch serhiy.storchaka, 2013-02-24 09:08
saxutils.py neoecos, 2013-03-31 19:33 The patched file
Messages (23)
msg50009 - (view) Author: Nikolai Grigoriev (ngrig) Date: 2006-04-14 20:21
This is a patch for bug #1470540. It enables xml.sax.saxutils.XMLGenerator to work correctly with UTF-16 (and other encodings not derived from US-ASCII). The proposed changes are as follows:
- in XMLGenerator.__init__(), create a StreamWriter instead of a plain stream;
- in XMLGenerator._write(), convert everything to Unicode before writing;
- in XMLGenerator.endDocument(), flush the StreamWriter.
The patch is applicable to xml/sax/saxutils.py in the stable release (2.4.3), as well as to xmlcore/sax/saxutils.py in the current release (2.5). The smoke test is attached to the bug description in the Bug Manager.
Regards, Nikolai Grigoriev
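The three steps listed in this message can be illustrated outside XMLGenerator. The following is only a minimal Python 3 sketch of the idea, not the attached saxutils.diff (which targets Python 2.4/2.5); the variable names are made up for the example.

```python
import codecs
import io

raw = io.BytesIO()                          # plain byte stream
writer = codecs.getwriter("utf-16")(raw)    # step 1: wrap it in a StreamWriter

# Step 2: only text (Unicode) is handed to the writer, which encodes it.
pieces = ['<?xml version="1.0" encoding="utf-16"?>', "<doc>caf\u00e9</doc>"]
for piece in pieces:
    writer.write(piece)

writer.flush()                              # step 3: flush at end of document
data = raw.getvalue()
```

The UTF-16 StreamWriter emits the byte order mark once, on the first write, so the pieces concatenate into a single well-formed UTF-16 byte string.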
msg66684 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2008-05-11 22:03
Won't this present backwards-compatibility problems if non-ASCII str content is written?
msg114654 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2010-08-22 09:30
There are no unit tests or doc changes with the patch. Can anyone answer Georg's question?
msg161764 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-05-28 10:43
See also . Instead of codecs.StreamWriter, it is better to use io.TextIOWrapper, because the former is slower and has numerous flaws.
msg161767 - (view) Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2012-05-28 11:07
An alternative would be to use an incremental encoder instead of a StreamWriter. (Which is what TextIOWrapper does internally).
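For illustration, an incremental encoder can be driven by hand as below. This is a standalone Python 3 sketch of the alternative mentioned in this message, not code from any of the attached patches.

```python
import codecs

# An incremental encoder keeps state between calls, so the UTF-16 byte
# order mark is produced only once, with the first chunk.
encoder = codecs.getincrementalencoder("utf-16")()

chunks = [
    encoder.encode("<a>"),
    encoder.encode("text"),
    encoder.encode("</a>", final=True),   # final=True ends the stream
]
data = b"".join(chunks)
```

Unlike a StreamWriter, the encoder is not tied to a stream: the caller decides where each encoded chunk goes.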
msg161933 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-05-30 07:57
Oh, I see that XMLGenerator is completely outdated. It has not even been ported to Python 3. See the function _write:

    def _write(self, text):
        if isinstance(text, str):
            self._out.write(text)
        else:
            self._out.write(text.encode(self._encoding, _error_handling))

In Python 2 there was a choice between bytes and unicode strings, but in Python 3 the encoding never happens. XMLGenerator does not distinguish between binary and text streams. Here is a patch that fixes XMLGenerator in Python 3. Unfortunately, it is impossible to avoid some loss of backward compatibility. I tried to keep the code working for the most common cases, but some code which "worked" before may break (I also had to correct some tests).
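One way to distinguish text streams from binary streams, as this message calls for, is sketched below. The helper name get_text_writer is hypothetical and the code is a simplified assumption about the approach, not the contents of XMLGenerator.patch.

```python
import io

def get_text_writer(out, encoding):
    """Hypothetical helper: return an object whose write() accepts str."""
    if isinstance(out, io.TextIOBase):
        # Text streams already take str; encoding is their own business.
        return out
    # Binary streams get a text layer that encodes on the way out.
    # Characters the encoding cannot represent become character references.
    return io.TextIOWrapper(out, encoding=encoding,
                            errors="xmlcharrefreplace",
                            newline="\n", write_through=True)

text_out = io.StringIO()
same_stream = get_text_writer(text_out, "ascii")

binary_out = io.BytesIO()
writer = get_text_writer(binary_out, "ascii")
writer.write("caf\u00e9")
writer.flush()
encoded = binary_out.getvalue()
writer.detach()   # release binary_out so the wrapper does not close it
```

The detach() call matters in this sketch because a TextIOWrapper closes its underlying stream when it is closed or garbage-collected.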
msg162851 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-06-15 07:20
The patch has been updated to reflect Martin's comments. I hope the old behavior is now preserved in the cases most used in practice. The tests have been converted to work with bytes instead of strings.
msg163740 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-06-24 07:20
It would be nice to fix this bug before the 3.3.0b1 release clone is forked.
msg165509 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-07-15 07:08
Here is an updated patch with more careful handling of closing (as for ) and added comments.
msg172205 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-10-06 15:10
Ping.
msg175472 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-11-12 20:44
If nobody has any objections, why not apply this patch?
msg178326 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-12-27 20:45
If no one objects I will commit this next year.
msg178369 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2012-12-28 07:26
I'd like Antoine to have a look at all that io stuff. It looks quite bloated. In your except clause, you're not calling self._close.
msg179942 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-01-14 13:35
Patch updated. Fixed the error that Georg found. Restored testing of XMLGenerator with StringIO, as Antoine pointed out. XMLGenerator is now tested with StringIO, BytesIO and a user-defined writer. Added tests for encoding.
msg180297 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-01-20 15:32
Patch updated. I got rid of __del__ to prevent hanging on reference cycles, as Antoine suggested on IRC. Added a test to check that XMLGenerator doesn't close the file passed as an argument.
msg181797 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-02-10 12:38
New changeset 010b455de0e0 by Serhiy Storchaka in branch '2.7':
Issue #1470548: XMLGenerator now works with UTF-16 and UTF-32 encodings.
http://hg.python.org/cpython/rev/010b455de0e0

New changeset 66f92f76b2ce by Serhiy Storchaka in branch '3.2':
Issue #1470548: XMLGenerator now works with binary output streams.
http://hg.python.org/cpython/rev/66f92f76b2ce

New changeset 03b878d636cf by Serhiy Storchaka in branch '3.3':
Issue #1470548: XMLGenerator now works with binary output streams.
http://hg.python.org/cpython/rev/03b878d636cf

New changeset 12d75ca12ae7 by Serhiy Storchaka in branch 'default':
Issue #1470548: XMLGenerator now works with binary output streams.
http://hg.python.org/cpython/rev/12d75ca12ae7
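As a usage illustration of the behavior these changesets describe, the following Python 3 sketch writes a small UTF-16 document to a binary stream. It assumes a Python version that includes the fix; the element name and content are made up for the example.

```python
from io import BytesIO
from xml.sax.saxutils import XMLGenerator

out = BytesIO()
gen = XMLGenerator(out, encoding="utf-16")
gen.startDocument()                 # writes the XML declaration
gen.startElement("doc", {})
gen.characters("caf\u00e9")
gen.endElement("doc")
gen.endDocument()                   # flushes the output

# Decode the bytes back to text to inspect what was written.
text = out.getvalue().decode("utf-16")
```

The stream passed in stays open afterwards, so the caller remains responsible for closing it.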
msg182819 - (view) Author: Arfrever Frehtes Taifersar Arahesis (Arfrever) * (Python triager) Date: 2013-02-23 20:50
The change in the 2.7 branch breaks some software, including a Django test (produce_xml_fragment from https://github.com/django/django/blob/1.4.5/tests/regressiontests/test_utils/tests.py). The problem does not seem to occur with Python 3.2, 3.3 and 3.4.

Before 010b455de0e0:

>>> from StringIO import StringIO
>>> from xml.sax.saxutils import XMLGenerator
>>> stream = StringIO()
>>> xml = XMLGenerator(stream, encoding='utf-8')
>>> xml.startElement("foo", {"aaa": "1.0", "bbb": "2.0"})
>>> xml.characters("Hello")
>>> xml.endElement("foo")
>>> xml.startElement("bar", {"ccc": "3.0", "ddd": "4.0"})
>>> xml.endElement("bar")
>>> stream.getvalue()
'<foo aaa="1.0" bbb="2.0">Hello</foo><bar ccc="3.0" ddd="4.0"></bar>'
>>>

After 010b455de0e0:

>>> from StringIO import StringIO
>>> from xml.sax.saxutils import XMLGenerator
>>> stream = StringIO()
>>> xml = XMLGenerator(stream, encoding='utf-8')
>>> xml.startElement("foo", {"aaa": "1.0", "bbb": "2.0"})
>>> xml.characters("Hello")
>>> xml.endElement("foo")
>>> xml.startElement("bar", {"ccc": "3.0", "ddd": "4.0"})
>>> xml.endElement("bar")
>>> stream.getvalue()
''
>>>
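A Python 3 counterpart of the fragment scenario in this report is sketched below. It is a shortened, hypothetical variant of the session (one element, a binary stream) and assumes a Python version that includes the fix.

```python
from io import BytesIO
from xml.sax.saxutils import XMLGenerator

stream = BytesIO()
xml = XMLGenerator(stream, encoding="utf-8")

# Neither startDocument() nor endDocument() is called, so nothing ever
# asks the generator to flush explicitly.
xml.startElement("foo", {"aaa": "1.0"})
xml.characters("Hello")
xml.endElement("foo")

# The fragment is expected to be in the stream already.
fragment = stream.getvalue()
```

Code that reads the stream between SAX events, as the Django test does, depends on output reaching the stream without buffering.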
msg182861 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-02-24 09:08
Thank you for the report. Here is a patch which fixes this bug.
msg182892 - (view) Author: Arfrever Frehtes Taifersar Arahesis (Arfrever) * (Python triager) Date: 2013-02-24 20:52
This patch works for me.
msg182930 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-02-25 11:32
New changeset d707e3345a74 by Serhiy Storchaka in branch '2.7':
Issue #1470548: Do not buffer XMLGenerator output.
http://hg.python.org/cpython/rev/d707e3345a74
msg182931 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-02-25 11:49
New changeset 1c03e499cdc2 by Serhiy Storchaka in branch '3.2':
Issue #1470548: Add test for fragment producing with XMLGenerator.
http://hg.python.org/cpython/rev/1c03e499cdc2

New changeset 5a4b3094903f by Serhiy Storchaka in branch '3.3':
Issue #1470548: Add test for fragment producing with XMLGenerator.
http://hg.python.org/cpython/rev/5a4b3094903f

New changeset 810d70fb17a2 by Serhiy Storchaka in branch 'default':
Issue #1470548: Add test for fragment producing with XMLGenerator.
http://hg.python.org/cpython/rev/810d70fb17a2
msg185644 - (view) Author: Sebastian Ortiz Vasquez (neoecos) Date: 2013-03-31 19:33
I have been working with this in order to generate an RSS feed using web2py. I found that the XMLGenerator characters method does not check whether its argument is a unicode or str object, and it does not decode it according to the encoding parameter of the XMLGenerator. I changed the method to check whether the content is a unicode object and otherwise to convert it to unicode using the desired encoding. Recall that _write (through UnbufferedTextIOWrapper) receives a unicode object as its parameter.

    def characters(self, content):
        if isinstance(content, unicode):
            self._write(escape(content))
        else:
            self._write(escape(unicode(content, self._encoding)))
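The same check can be written for Python 3, where str plays the role of unicode. This is only a sketch of the idea in this message; the helper name to_text is hypothetical and not part of saxutils.

```python
from xml.sax.saxutils import escape

def to_text(content, encoding):
    """Hypothetical helper: return content as str, decoding bytes if needed."""
    if isinstance(content, str):
        return content
    return str(content, encoding)

decoded = to_text(b"caf\xc3\xa9", "utf-8")     # bytes are decoded
unchanged = to_text("caf\u00e9", "utf-8")      # str passes through
escaped = escape(to_text(b"a < b", "utf-8"))   # escaping works on text only
```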
msg185682 - (view) Author: Arfrever Frehtes Taifersar Arahesis (Arfrever) * (Python triager) Date: 2013-03-31 21:51
Sebastian Ortiz Vasquez: Please file a new issue and attach a patch (in unified format) instead of a whole Python module.
History
Date User Action Args
2022-04-11 14:56:16 admin set github: 43215
2013-03-31 22:03:43 Arfrever set versions: + Python 3.2, Python 3.3, Python 3.4
2013-03-31 21:51:15 Arfrever set messages: +; title: Bugfix for #1470540 (XMLGenerator cannot output UTF-16 or UTF-8) -> xml.sax.saxutils.XMLGenerator cannot output UTF-16
2013-03-31 19:33:14 neoecos set files: + saxutils.py; nosy: + neoecos; versions: - Python 3.2, Python 3.3, Python 3.4; messages: +; title: Bugfix for #1470540 (XMLGenerator cannot output UTF-16) -> Bugfix for #1470540 (XMLGenerator cannot output UTF-16 or UTF-8)
2013-02-25 11:50:36 serhiy.storchaka set status: open -> closed; resolution: fixed; stage: resolved
2013-02-25 11:49:19 python-dev set messages: +
2013-02-25 11:32:14 python-dev set messages: +
2013-02-24 20:52:51 Arfrever set messages: +
2013-02-24 09:08:15 serhiy.storchaka set files: + XMLGenerator_fragment-2.7.patch; messages: +
2013-02-23 20:50:30 Arfrever set status: closed -> open; priority: normal -> release blocker; nosy: + Arfrever, benjamin.peterson, larry; messages: +; resolution: fixed -> (no value); stage: resolved -> (no value)
2013-02-10 15:23:06 serhiy.storchaka set status: open -> closed; resolution: fixed; stage: patch review -> resolved
2013-02-10 12:38:00 python-dev set nosy: + python-dev; messages: +
2013-01-20 15:32:51 serhiy.storchaka set files: + XMLGenerator-5.patch; messages: +
2013-01-14 13:36:14 serhiy.storchaka set stage: needs patch -> patch review
2013-01-14 13:35:33 serhiy.storchaka set keywords: - easy; files: + XMLGenerator-4.patch; messages: +
2012-12-30 18:40:40 serhiy.storchaka set stage: patch review -> needs patch
2012-12-28 07:26:14 georg.brandl set nosy: + pitrou; messages: +
2012-12-27 20:47:56 serhiy.storchaka set assignee: serhiy.storchaka
2012-12-27 20:45:56 serhiy.storchaka set messages: +
2012-11-12 20:44:06 serhiy.storchaka set messages: +
2012-10-24 09:02:24 serhiy.storchaka set stage: patch review
2012-10-20 20:09:40 serhiy.storchaka set keywords: + needs review; stage: test needed -> (no value); versions: + Python 3.4, - Python 3.1
2012-10-06 15:10:51 serhiy.storchaka set messages: +
2012-08-05 11:14:07 serhiy.storchaka link issue4997 superseder
2012-07-20 06:58:46 eli.bendersky set nosy: - eli.bendersky
2012-07-15 07:08:12 serhiy.storchaka set files: + XMLGenerator-3.patch; nosy: + eli.bendersky; messages: +
2012-06-24 07:20:37 serhiy.storchaka set messages: +
2012-06-15 07:20:50 serhiy.storchaka set files: + XMLGenerator-2.patch; messages: +
2012-05-30 07:58:37 serhiy.storchaka set nosy: + loewis
2012-05-30 07:57:37 serhiy.storchaka set files: + XMLGenerator.patch; messages: +
2012-05-28 11:07:58 doerwalter set nosy: + doerwalter; messages: +
2012-05-28 10:43:25 serhiy.storchaka set nosy: + serhiy.storchaka; messages: +; versions: + Python 3.3
2010-08-22 09:30:57 BreamoreBoy set nosy: + BreamoreBoy; messages: +; versions: + Python 3.1, Python 2.7, Python 3.2, - Python 2.6
2009-04-05 13:45:12 georg.brandl link issue1470540 superseder
2009-04-05 13:45:12 georg.brandl unlink issue1470540 dependencies
2009-03-21 02:02:41 ajaksu2 set stage: test needed; type: behavior; versions: + Python 2.6, - Python 2.5
2009-03-21 02:02:11 ajaksu2 link issue1470540 dependencies
2008-05-11 22:03:08 georg.brandl set nosy: + georg.brandlmessages: +
2008-01-21 13:57:10 akuchling set keywords: + easy
2006-04-14 20:21:23 ngrig create