msg50009 - (view) |
Author: Nikolai Grigoriev (ngrig) |
Date: 2006-04-14 20:21 |
This is a patch to bug #1470540. It enables xml.sax.saxutils.XMLGenerator to work correctly with UTF-16 (and other encodings not derived from US-ASCII). The proposed changes are as follows:
- in XMLGenerator.__init__(), create a StreamWriter instead of a plain stream;
- in XMLGenerator._write(), convert everything to Unicode before writing;
- in XMLGenerator.endDocument(), flush the StreamWriter.
The patch is applicable to xml/sax/saxutils.py in the stable release (2.4.3), as well as to xmlcore/sax/saxutils.py in the current release (2.5). The smoke test is attached to the bug description in the Bug Manager.
Regards, Nikolai Grigoriev |
|
|
msg66684 - (view) |
Author: Georg Brandl (georg.brandl) *  |
Date: 2008-05-11 22:03 |
Won't this present backwards-compatibility problems if non-ASCII str content is written? |
|
|
msg114654 - (view) |
Author: Mark Lawrence (BreamoreBoy) * |
Date: 2010-08-22 09:30 |
There are no unit tests or doc changes with the patch. Can anyone answer Georg's question on ? |
|
|
msg161764 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2012-05-28 10:43 |
See also . Instead of codecs.StreamWriter it would be better to use io.TextIOWrapper, because the former is slower and has numerous flaws. |
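To illustrate the suggestion, here is a minimal Python 3 sketch of writing UTF-16 text to a binary stream through io.TextIOWrapper. It is not taken from any of the attached patches; the variable names and the sample document are assumptions chosen for the example.

```python
import io

raw = io.BytesIO()

# write_through=True hands every write straight to the underlying stream,
# so nothing is held back in the wrapper's own buffer.
text = io.TextIOWrapper(raw, encoding="utf-16", newline="\n",
                        write_through=True)
text.write('<?xml version="1.0" encoding="utf-16"?>\n')
text.write("<doc>\u20ac</doc>")
text.flush()

data = raw.getvalue()

# Detach so that the wrapper does not close `raw` when it is collected.
text.detach()
```

The byte order mark is emitted once, at the start of the stream, which is the behaviour an XML writer needs for UTF-16.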
|
|
msg161767 - (view) |
Author: Walter Dörwald (doerwalter) *  |
Date: 2012-05-28 11:07 |
An alternative would be to use an incremental encoder instead of a StreamWriter. (Which is what TextIOWrapper does internally). |
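A rough Python 3 sketch of the incremental encoder approach follows, assuming a hypothetical write() helper that is not part of saxutils. It also shows why encoding each piece separately goes wrong for UTF-16: every independent str.encode() call prepends its own byte order mark.

```python
import codecs
import io

parts = ["<doc>", "caf\u00e9", "</doc>"]

# Naive approach: each piece is encoded on its own and gets its own BOM.
naive = b"".join(part.encode("utf-16") for part in parts)

# Incremental approach: one encoder keeps state across calls,
# so the BOM is written only once.
raw = io.BytesIO()
encoder = codecs.getincrementalencoder("utf-16")()

def write(text):
    """Hypothetical helper: encode a piece of text and write the bytes."""
    raw.write(encoder.encode(text))

for part in parts:
    write(part)
raw.write(encoder.encode("", final=True))

incremental = raw.getvalue()
```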
|
|
msg161933 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2012-05-30 07:57 |
Oh, I see that XMLGenerator is completely outdated. It has not even been ported to Python 3. See the function _write:
    def _write(self, text):
        if isinstance(text, str):
            self._out.write(text)
        else:
            self._out.write(text.encode(self._encoding, _error_handling))
In Python 2 there was a choice between bytes and unicode strings, but in Python 3 the encoding never happens. XMLGenerator does not distinguish between binary and text streams. Here is a patch that fixes XMLGenerator in Python 3. Unfortunately, it is impossible to avoid some loss of backward compatibility. I tried to keep the code working for the most common cases, but some code which "worked" before may break (I even had to correct some tests). |
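The distinction between text and binary streams can be sketched in Python 3 as below. This is only an illustration, not the attached patch: write_text is a hypothetical helper, and the "xmlcharrefreplace" error handler is an assumption about what _error_handling stands for.

```python
import io

def write_text(out, text, encoding="us-ascii"):
    """Hypothetical helper: write str to a text or a binary stream."""
    if isinstance(out, io.TextIOBase):
        # Text streams accept str directly.
        out.write(text)
    else:
        # Binary streams need bytes; unencodable characters become
        # numeric character references such as &#233;.
        out.write(text.encode(encoding, "xmlcharrefreplace"))

text_out = io.StringIO()
byte_out = io.BytesIO()
write_text(text_out, "caf\u00e9")
write_text(byte_out, "caf\u00e9")

# Without such a check, writing str to a binary stream fails outright.
try:
    io.BytesIO().write("caf\u00e9")
    binary_accepts_str = True
except TypeError:
    binary_accepts_str = False
```

Encoding piece by piece like this is adequate for ASCII-compatible encodings, but not for UTF-16, where each separate encode call would add a byte order mark.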
|
|
msg162851 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2012-06-15 07:20 |
The patch has been updated to reflect Martin's comments. I hope the old behavior is now preserved in the cases most used in practice. The tests have been converted to work with bytes instead of strings. |
|
|
msg163740 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2012-06-24 07:20 |
It would be nice to fix this bug before the 3.3.0b1 release clone is forked. |
|
|
msg165509 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2012-07-15 07:08 |
Here is an updated patch with more careful handling of closing (as for ) and added comments. |
|
|
msg172205 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2012-10-06 15:10 |
Ping. |
|
|
msg175472 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2012-11-12 20:44 |
If nobody has any objections, why not apply this patch? |
|
|
msg178326 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2012-12-27 20:45 |
If no one objects I will commit this next year. |
|
|
msg178369 - (view) |
Author: Georg Brandl (georg.brandl) *  |
Date: 2012-12-28 07:26 |
I'd like Antoine to have a look at all that io stuff. It looks quite bloated. In your except clause, you're not calling self._close. |
|
|
msg179942 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2013-01-14 13:35 |
Patch updated. Fixed the error which Georg found. Restored testing of XMLGenerator with StringIO, as Antoine pointed out. XMLGenerator is now tested with StringIO, BytesIO and a user-defined writer. Added tests for encoding. |
|
|
msg180297 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2013-01-20 15:32 |
Patch updated. I got rid of __del__ to prevent hanging on reference cycles, as Antoine suggested on IRC. Added a test to check that XMLGenerator doesn't close the file passed as an argument. |
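The "doesn't close the file" property can be checked along these lines in Python 3. This is a sketch in the spirit of the test described above, not a copy of it; the element name and the trailer bytes are made up for the example.

```python
import gc
import io
from xml.sax.saxutils import XMLGenerator

out = io.BytesIO()
gen = XMLGenerator(out, encoding="utf-8")
gen.startDocument()
gen.startElement("doc", {})
gen.endElement("doc")
gen.endDocument()

# Drop the generator and force a collection; the caller's stream
# should remain usable afterwards.
del gen
gc.collect()

still_open = not out.closed
out.write(b"<!-- trailer -->")
```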
|
|
msg181797 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2013-02-10 12:38 |
New changeset 010b455de0e0 by Serhiy Storchaka in branch '2.7':
Issue #1470548: XMLGenerator now works with UTF-16 and UTF-32 encodings.
http://hg.python.org/cpython/rev/010b455de0e0
New changeset 66f92f76b2ce by Serhiy Storchaka in branch '3.2':
Issue #1470548: XMLGenerator now works with binary output streams.
http://hg.python.org/cpython/rev/66f92f76b2ce
New changeset 03b878d636cf by Serhiy Storchaka in branch '3.3':
Issue #1470548: XMLGenerator now works with binary output streams.
http://hg.python.org/cpython/rev/03b878d636cf
New changeset 12d75ca12ae7 by Serhiy Storchaka in branch 'default':
Issue #1470548: XMLGenerator now works with binary output streams.
http://hg.python.org/cpython/rev/12d75ca12ae7 |
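On a Python 3 version that includes these changesets, UTF-16 output to a binary stream can be tried as follows. The document content is an assumption made up for the example.

```python
import io
from xml.sax.saxutils import XMLGenerator

out = io.BytesIO()
gen = XMLGenerator(out, encoding="utf-16")
gen.startDocument()
gen.startElement("doc", {})
gen.characters("caf\u00e9")
gen.endElement("doc")
gen.endDocument()

data = out.getvalue()

# If the byte order mark had been written more than once, the decoded
# text would contain stray U+FEFF characters.
decoded = data.decode("utf-16")
```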
|
|
msg182819 - (view) |
Author: Arfrever Frehtes Taifersar Arahesis (Arfrever) *  |
Date: 2013-02-23 20:50 |
The change in 2.7 branch breaks some software, including a test of Django (produce_xml_fragment from https://github.com/django/django/blob/1.4.5/tests/regressiontests/test_utils/tests.py). The problem seems to not occur with Python 3.2, 3.3 and 3.4.
Before 010b455de0e0:
>>> from StringIO import StringIO
>>> from xml.sax.saxutils import XMLGenerator
>>> stream = StringIO()
>>> xml = XMLGenerator(stream, encoding='utf-8')
>>> xml.startElement("foo", {"aaa": "1.0", "bbb": "2.0"})
>>> xml.characters("Hello")
>>> xml.endElement("foo")
>>> xml.startElement("bar", {"ccc": "3.0", "ddd": "4.0"})
>>> xml.endElement("bar")
>>> stream.getvalue()
'Hello'
>>>
After 010b455de0e0:
>>> from StringIO import StringIO
>>> from xml.sax.saxutils import XMLGenerator
>>> stream = StringIO()
>>> xml = XMLGenerator(stream, encoding='utf-8')
>>> xml.startElement("foo", {"aaa": "1.0", "bbb": "2.0"})
>>> xml.characters("Hello")
>>> xml.endElement("foo")
>>> xml.startElement("bar", {"ccc": "3.0", "ddd": "4.0"})
>>> xml.endElement("bar")
>>> stream.getvalue()
''
>>> |
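For comparison, the same fragment scenario can be sketched for Python 3, where io.StringIO replaces the Python 2 StringIO module. The point is that the stream should already hold the output even though startDocument() and endDocument() are never called; the exact string assumes a Python version in which dicts preserve insertion order.

```python
import io
from xml.sax.saxutils import XMLGenerator

stream = io.StringIO()
xml = XMLGenerator(stream, encoding="utf-8")

# No startDocument()/endDocument(): only a fragment is produced.
xml.startElement("foo", {"aaa": "1.0", "bbb": "2.0"})
xml.characters("Hello")
xml.endElement("foo")
xml.startElement("bar", {"ccc": "3.0", "ddd": "4.0"})
xml.endElement("bar")

fragment = stream.getvalue()
```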
|
|
msg182861 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2013-02-24 09:08 |
Thank you for the report. Here is a patch which fixes this bug. |
|
|
msg182892 - (view) |
Author: Arfrever Frehtes Taifersar Arahesis (Arfrever) *  |
Date: 2013-02-24 20:52 |
This patch works for me. |
|
|
msg182930 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2013-02-25 11:32 |
New changeset d707e3345a74 by Serhiy Storchaka in branch '2.7':
Issue #1470548: Do not buffer XMLGenerator output.
http://hg.python.org/cpython/rev/d707e3345a74 |
|
|
msg182931 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2013-02-25 11:49 |
New changeset 1c03e499cdc2 by Serhiy Storchaka in branch '3.2':
Issue #1470548: Add test for fragment producing with XMLGenerator.
http://hg.python.org/cpython/rev/1c03e499cdc2
New changeset 5a4b3094903f by Serhiy Storchaka in branch '3.3':
Issue #1470548: Add test for fragment producing with XMLGenerator.
http://hg.python.org/cpython/rev/5a4b3094903f
New changeset 810d70fb17a2 by Serhiy Storchaka in branch 'default':
Issue #1470548: Add test for fragment producing with XMLGenerator.
http://hg.python.org/cpython/rev/810d70fb17a2 |
|
|
msg185644 - (view) |
Author: Sebastian Ortiz Vasquez (neoecos) |
Date: 2013-03-31 19:33 |
I have been working with this in order to generate an RSS feed using web2py. I found that the XMLGenerator method shown below does not check whether its argument is a unicode or str object, and it does not convert it according to the encoding parameter of the XMLGenerator. I changed the method to check whether the content is a unicode object and otherwise to try to convert it to unicode using the desired encoding. Recall that _write, backed by UnbufferedTextIOWrapper, receives a unicode object as its parameter.
    def characters(self, content):
        if isinstance(content, unicode):
            self._write(escape(content))
        else:
            self._write(escape(unicode(content, self._encoding))) |
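The same idea, expressed for Python 3 where the unicode type no longer exists, might look like the sketch below. to_text is a hypothetical helper written for this illustration and is not part of xml.sax.saxutils.

```python
from xml.sax.saxutils import escape

def to_text(content, encoding="utf-8"):
    """Hypothetical helper: return str, decoding bytes if necessary."""
    if isinstance(content, str):
        return content
    # bytes (or bytearray) are decoded with the generator's encoding.
    return str(content, encoding)

from_str = escape(to_text("caf\u00e9 & bar"))
from_bytes = escape(to_text("caf\u00e9 & bar".encode("utf-8"), "utf-8"))
```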
|
|
msg185682 - (view) |
Author: Arfrever Frehtes Taifersar Arahesis (Arfrever) *  |
Date: 2013-03-31 21:51 |
Sebastian Ortiz Vasquez: Please file a new issue and attach a patch (in unified format) instead of a whole Python module. |
|
|