[issue3672] Ill-formed surrogates not treated as errors during encoding/decoding - Code Review (original) (raw)

Can't Edit Can't Publish+Mail Start Review Created: 16 years, 1 month ago by Martin v. Löwis Modified: 15 years, 4 months ago Reviewers: Benjamin, report Base URL: http://svn.python.org/view/\*checkout\*/python/branches/pep-0383/ Visibility: Public. Patch Set 1# Total comments: 16 Patch Set 2 : Addressed first round of comments# Created: 16 years, 1 month ago Download[raw] [tar.bz2] Unified diffs Side-by-side diffs Delta from patch set Stats (+200 lines, -21 lines) Patch M Doc/library/codecs.rst View 1 1 chunk +12 lines, -0 lines 0 comments Download M Lib/test/test_bytes.py View 1 1 chunk +2 lines, -2 lines 0 comments Download M Lib/test/test_codecs.py View 1 2 chunks +13 lines, -2 lines 0 comments Download M Lib/test/test_unicode.py View 1 1 chunk +3 lines, -3 lines 0 comments Download M Lib/test/test_unicodedata.py View 1 2 chunks +2 lines, -1 line 0 comments Download M Objects/unicodeobject.c View 1 11 chunks +72 lines, -11 lines 0 comments Download M Python/codecs.c View 1 3 chunks +92 lines, -0 lines 0 comments Download M Python/marshal.c View 1 2 chunks +4 lines, -2 lines 0 comments Download Messages Total messages: 2 Expand All Messages | Collapse All Messages Benjamin http://codereview.appspot.com/52081/diff/1/5 File Doc/library/codecs.rst (right): http://codereview.appspot.com/52081/diff/1/5#newcode326 Line 326: In addition, the following error handlers are specific ... 16 years, 1 month ago (2009-05-01 21:25:44 UTC)#1 http://codereview.appspot.com/52081/diff/1/5 File Doc/library/codecs.rst (right): http://codereview.appspot.com/52081/diff/1/5#newcode326 Line 326: In addition, the following error handlers are specific to only selected "In addition, the following error handlers are specific to a single codec." sounds better http://codereview.appspot.com/52081/diff/1/5#newcode335 Line 335: There should probably be a versionchanged directive indicating that "surrogates" was added in 3.1. http://codereview.appspot.com/52081/diff/1/6 File Lib/test/test_codecs.py (right): http://codereview.appspot.com/52081/diff/1/6#newcode544 Line 544: def test_surrogates(self): I think this should be split into 2 tests. "test_lone_surrogates" and "test_surrogate_handler". http://codereview.appspot.com/52081/diff/1/4 File Objects/unicodeobject.c (right): http://codereview.appspot.com/52081/diff/1/4#newcode157 Line 157: static PyObject *unicode_encode_call_errorhandler(const char *errors, These prototypes are longer than 80 chars some places. I don't think the arguments need to line up with the starting parenthesis. http://codereview.appspot.com/52081/diff/1/4#newcode2393 Line 2393: s, size, &exc, i-1, i, &newpos); "exc" is never Py_DECREFed. http://codereview.appspot.com/52081/diff/1/4#newcode4110 Line 4110: if (!PyUnicode_Check(repunicode)) { Is there a test of this case somewhere? http://codereview.appspot.com/52081/diff/1/2 File Python/codecs.c (right): http://codereview.appspot.com/52081/diff/1/2#newcode758 Line 758: if (PyObject_IsInstance(exc, PyExc_UnicodeEncodeError)) { I believe PyErr_GivenExceptionMatches is more appropriate here, but given the rest of the file uses PyObject_IsInstance, it likely doesn't matter. http://codereview.appspot.com/52081/diff/1/2#newcode771 Line 771: return NULL; This is leaks "object". Sign in to reply to this message. Martin v. Löwis Issues fixed in r72188. http://codereview.appspot.com/52081/diff/1/5 File Doc/library/codecs.rst (right): http://codereview.appspot.com/52081/diff/1/5#newcode326 Line 326: In addition, the following ... 16 years, 1 month ago (2009-05-02 09:44:03 UTC)#2 Issues fixed in r72188. http://codereview.appspot.com/52081/diff/1/5 File Doc/library/codecs.rst (right): http://codereview.appspot.com/52081/diff/1/5#newcode326 Line 326: In addition, the following error handlers are specific to only selected On 2009/05/01 21:25:44, Benjamin wrote: > "In addition, the following error handlers are specific to a single codec." > sounds better Done. http://codereview.appspot.com/52081/diff/1/5#newcode335 Line 335: On 2009/05/01 21:25:44, Benjamin wrote: > There should probably be a versionchanged directive indicating that "surrogates" > was added in 3.1. Done. http://codereview.appspot.com/52081/diff/1/6 File Lib/test/test_codecs.py (right): http://codereview.appspot.com/52081/diff/1/6#newcode544 Line 544: def test_surrogates(self): On 2009/05/01 21:25:44, Benjamin wrote: > I think this should be split into 2 tests. "test_lone_surrogates" and > "test_surrogate_handler". Done. http://codereview.appspot.com/52081/diff/1/4 File Objects/unicodeobject.c (right): http://codereview.appspot.com/52081/diff/1/4#newcode157 Line 157: static PyObject *unicode_encode_call_errorhandler(const char *errors, On 2009/05/01 21:25:44, Benjamin wrote: > These prototypes are longer than 80 chars some places. I don't think the > arguments need to line up with the starting parenthesis. Done. http://codereview.appspot.com/52081/diff/1/4#newcode2393 Line 2393: s, size, &exc, i-1, i, &newpos); On 2009/05/01 21:25:44, Benjamin wrote: > "exc" is never Py_DECREFed. Done. http://codereview.appspot.com/52081/diff/1/4#newcode4110 Line 4110: if (!PyUnicode_Check(repunicode)) { On 2009/05/01 21:25:44, Benjamin wrote: > Is there a test of this case somewhere? No. This is temporary - for PEP 383, I will have to support error handlers returning bytes here, also. http://codereview.appspot.com/52081/diff/1/2 File Python/codecs.c (right): http://codereview.appspot.com/52081/diff/1/2#newcode758 Line 758: if (PyObject_IsInstance(exc, PyExc_UnicodeEncodeError)) { On 2009/05/01 21:25:44, Benjamin wrote: > I believe PyErr_GivenExceptionMatches is more appropriate here, but given the > rest of the file uses PyObject_IsInstance, it likely doesn't matter. No. The interface is that an exception instance must be passed; GivenExceptionMatches would also allow for tuples and types. http://codereview.appspot.com/52081/diff/1/2#newcode771 Line 771: return NULL; On 2009/05/01 21:25:44, Benjamin wrote: > This is leaks "object". Done. Sign in to reply to this message. Expand All Messages Collapse All Messages

Issue 52081: [issue3672] Ill-formed surrogates not treated as errors during encoding/decoding (Closed)
Created 16 years, 1 month ago by Martin v. Löwis
Modified 15 years, 4 months ago
Reviewers: report_bugs.python.org, Benjamin
Base URL: http://svn.python.org/view/\*checkout\*/python/branches/pep-0383/
Comments: 16