Issue 28284: Memory corruption due to size expansion (overflow) in _json.encode_basestring_ascii on 32 bit Python 2.7.12 (original) (raw)
Guido Vranken reports:
This results in a segmentation fault on 32 bit:
python -c "import _json; print _json.encode_basestring_ascii(unicode(chr(0x22)) * 0x2AAAAAAB)"
This is a tentative patch:
diff --git a/Modules/_json.c b/Modules/_json.c index fede6b1..022bbc8 100644 --- a/Modules/_json.c +++ b/Modules/_json.c @@ -212,7 +212,16 @@ ascii_escape_unicode(PyObject *pystr)
/* One char input can be up to 6 chars output, estimate 4 of these */
output_size = 2 + (MIN_EXPANSION * 4) + input_chars;
- max_output_size = 2 + (input_chars * MAX_EXPANSION);
- if (input_chars > (PY_SSIZE_T_MAX / MAX_EXPANSION) ) {
PyErr_SetString(PyExc_ValueError, "Size of escaped string
exceeds upper boundary");
return NULL;
- }
- max_output_size = (input_chars * MAX_EXPANSION);
- if ( max_output_size > (PY_SSIZE_T_MAX - 2)) {
PyErr_SetString(PyExc_ValueError, "Size of escaped string
exceeds upper boundary");
return NULL;
- }
- max_output_size += 2; rval = PyString_FromStringAndSize(NULL, output_size); if (rval == NULL) { return NULL;
But you still have to take account these things:
/* This is an upper bound */
if (new_output_size > max_output_size) {
new_output_size = max_output_size;
}
If this code within the if{} is reached then it merely truncates the amount of memory that is actually required, thereby creating another opportunity for overwrites?
And this:
Py_ssize_t new_output_size = output_size * 2;
might overflow, but since output_size is always positive (ie. in the range [0..PY_SSIZE_T_MAX]), an overflow would result in a negative value. That would subsequently be caught in _PyString_Resize():
int _PyString_Resize(PyObject **pv, Py_ssize_t newsize) { register PyObject *v; register PyStringObject *sv; v = *pv; if (!PyString_Check(v) || Py_REFCNT(v) != 1 || newsize < 0 || PyString_CHECK_INTERNED(v)) { *pv = 0; Py_DECREF(v); PyErr_BadInternalCall(); return -1; }
Nonetheless an overflow from positive to negative is undesirable (it's undefined behaviour).