Issue 22849: Double DECREF in TextIOWrapper
There's a reproducible bug in textio.c that causes a double DECREF on codec_info. The conditions needed to trigger it are probably rare in real life, so it is probably not remotely exploitable (sandbox escape is the worst I can think of on its own, and I'm not aware of any on 3.x):
- You need to create a TextIOWrapper wrapping a file-like object that only partially supports the protocol. For example, supporting readable(), writable(), and seekable() but not tell().
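A minimal sketch of such an object, assuming the attached script follows roughly this shape (the names PartialFile and try_wrap are hypothetical and not taken from the attachment). On an unaffected interpreter each construction should simply fail with an exception because tell() is missing; on an affected build, repeating the construction is what eventually exposes the corrupted refcount.

```python
import io


class PartialFile:
    """Hypothetical file-like object: claims full capabilities but has no tell()."""

    def readable(self):
        return True

    def writable(self):
        return True

    def seekable(self):
        return True


def try_wrap(attempts=10):
    """Construct TextIOWrapper repeatedly and collect the exception type names."""
    errors = []
    for _ in range(attempts):
        try:
            io.TextIOWrapper(PartialFile(), encoding="utf-8")
        except Exception as exc:
            # The constructor fails because the wrapped object lacks tell().
            errors.append(type(exc).__name__)
    return errors


if __name__ == "__main__":
    print(try_wrap())
```

Do not run this on one of the affected versions unless you are prepared for the interpreter to crash.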
The crash I experience most of the time appears to be caused by the freed memory being reused, such that the PyObject ob_type field is no longer a valid pointer.
Affected:
- Source: 3.5.0a0 (latest default branch yesterday, 524a004e93dd)
- Archlinux: 3.3.5 and 3.4.2
- Ubuntu: 3.4.0
Unaffected:
- Centos: 3.3.2
- All 2.7 branch (doesn't contain the faulty commit)
Here's where it's introduced -- https://hg.python.org/cpython/rev/f3ec00d2b75e/#l5.76
/* Modules/_io/textio.c line 1064 */
Py_DECREF(codec_info); /* does not set codec_info = NULL; */
...
if (...) goto error;
...
error:
    Py_XDECREF(codec_info);
The attached script is close to minimal -- I think at most you can reduce it by one TextIOWrapper instantiation. A sample stacktrace follows; it is taken after the corruption occurs, on a subsequent access to v->ob_type (which is invalid).
#0 0x00000000004c8829 in PyObject_GetAttr (v=<unknown at remote 0x7ffff7eb9688>, name='_is_text_encoding') at Objects/object.c:872
#1 0x00000000004c871d in _PyObject_GetAttrId (v=<unknown at remote 0x7ffff7eb9688>, name=0x945d50 <PyId__is_text_encoding.10143>) at Objects/object.c:835
#2 0x00000000005c6674 in _PyCodec_LookupTextEncoding (encoding=0x7ffff6f40220 "utf-8", alternate_command=0x6c2fcd "codecs.open()") at Python/codecs.c:541
#3 0x000000000064286e in textiowrapper_init (self=0x7ffff7f9ecb8, args=(<F at remote 0x7ffff6f40a18>,), kwds={'encoding': 'utf-8'}) at ./Modules/_io/textio.c:965