Issue 4367: segmentation fault in ast_for_atom (original) (raw)

Created on 2008-11-20 17:36 by ot, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
ast.patch ot,2008-11-20 17:36
bad_unicodeerror.patch amaury.forgeotdarc,2008-11-20 18:39
use_PyObject_Str_and_test.patch benjamin.peterson,2008-11-21 03:42
use_PyObject_Str_and_test2.patch benjamin.peterson,2008-11-21 03:48 the same thing without the reference leak :)
use_PyObject_Str_and_test3.patch benjamin.peterson,2008-11-21 22:04
Messages (9)
msg76119 - (view) Author: Giuseppe Ottaviano (ot) Date: 2008-11-20 17:36
Hi all, trying to compile Python 2.6 I got a segmentation fault while byte-compiling the modules. The segmentation fault happened in ast_for_atom, parsing an Unicode entity. I found out that another problem prevented unicodedata to be imported, so unicode_decode raised an exception. I think the problem is in the following lines: if (PyErr_ExceptionMatches(PyExc_UnicodeError)){ PyObject *type, *value, *tback, *errstr; PyErr_Fetch(&type, &value, &tback); errstr = ((PyUnicodeErrorObject *)value)->reason; I'm not an expert of CPython internals, but the exception is raised with PyErr_SetString, so value is a PyStringObject, and that cast is invalid. Changing the last line to errstr = value; everything works. The patch is attached.
msg76124 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2008-11-20 18:26
Can you provide the code that caused the seg fault?
msg76125 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-11-20 18:39
I think I got the point: to decode strings like "\N{CHARACTER NAME}" PyUnicode_DecodeUnicodeEscape imports the unicodedata module. If this fails, PyErr_SetString(PyExc_UnicodeError, "some message") is called. The exception will eventually be caught by ast_for_atom (in Python/ast.c), but the exception there is not normalized: type is PyExc_UnicodeError when the value is a string. Hence the invalid cast. The exception cannot be normalized there: UnicodeError.__init__ needs 5 arguments. I think the error was to call PyErr_SetString in the first place. The attached patch replaces it with PyErr_SetObject and a full UnicodeDecodeError object. This deserves a unit test, but I wonder how to reliably make the import fail.
msg76130 - (view) Author: Giuseppe Ottaviano (ot) Date: 2008-11-20 19:08
@amaury: Yes, this is exactly what I noticed. I didn't know how to create an instance of a PyUnicodeErrorObject. BTW, isn't PyErr_SetString used throughout the code? Should all those calls replaced with PyErr_SetObject? @benjamin: The bug can be easily reproduced: brian:tmp ot$ echo 'raise Exception()' > unicodedata.py brian:tmp ot$ python2.6 -c "print u'\N{SOFT HYPHEN}'" Segmentation fault I don't know if this can be used as an unit test.
msg76163 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2008-11-21 03:42
Here's an alternative patch which simply calls PyObject_Str on the value of the exception. It also has a test.
msg76166 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-11-21 08:39
Yes, it's better to simply call str() on the exception. I was confused by the fact that UnicodeError is a simple exception, with a single message, unlike UnicodeDecodeError which has more attributes. About the patch: - The test does not work if test_ucn.py is run before: the unicodedata module is imported only once on the first \N{} decoding. You could try to run the test in a subprocess. - In ast.c, errstr should be DECREF'd.
msg76206 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2008-11-21 22:04
This new patch address review issues.
msg76210 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-11-21 22:21
Everything looks good. (20 lines of a complex test for two simple lines of code that nobody will ever run... unit tests is a religion)
msg76211 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2008-11-21 22:27
On Fri, Nov 21, 2008 at 4:21 PM, Amaury Forgeot d'Arc <report@bugs.python.org> wrote: > > Amaury Forgeot d'Arc <amauryfa@gmail.com> added the comment: > > Everything looks good. Thanks for the review. Fixed in r67320. > > (20 lines of a complex test for two simple lines of code that nobody > will ever run... unit tests is a religion) One can't say we didn't try. :)
History
Date User Action Args
2022-04-11 14:56:41 admin set github: 48617
2008-11-21 22:27:58 benjamin.peterson set status: open -> closedresolution: accepted -> fixed
2008-11-21 22:27:44 benjamin.peterson set messages: +
2008-11-21 22:21:24 amaury.forgeotdarc set keywords: - needs reviewresolution: acceptedmessages: +
2008-11-21 22:04:07 benjamin.peterson set files: + use_PyObject_Str_and_test3.patchmessages: +
2008-11-21 08:39:15 amaury.forgeotdarc set messages: +
2008-11-21 03:48:07 benjamin.peterson set files: + use_PyObject_Str_and_test2.patchtitle: Patch for segmentation fault in ast_for_atom -> segmentation fault in ast_for_atom
2008-11-21 03:42:49 benjamin.peterson set files: + use_PyObject_Str_and_test.patchmessages: +
2008-11-20 19:08:32 ot set messages: +
2008-11-20 18:39:32 amaury.forgeotdarc set keywords: + needs reviewfiles: + bad_unicodeerror.patchmessages: + nosy: + amaury.forgeotdarc
2008-11-20 18:26:01 benjamin.peterson set priority: highnosy: + benjamin.petersonmessages: +
2008-11-20 17:36:05 ot create