Issue 28284: Memory corruption due to size expansion (overflow) in _json.encode_basestring_ascii on 32 bit Python 2.7.12

Guido Vranken reports:

This results in a segmentation fault on 32 bit:

python -c "import _json; print _json.encode_basestring_ascii(unicode(chr(0x22)) * 0x2AAAAAAB)"
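The choice of 0x2AAAAAAB can be checked with a little arithmetic. The following is only a sketch in Python 3 that models C integer behaviour by hand. It assumes a 32-bit Py_ssize_t, a worst-case factor of 6 (taken from the "up to 6 chars output" comment in the patch context), an upper bound of the form 2 + input_chars * 6, and two's-complement wraparound, which is what usually happens in practice even though C leaves signed overflow undefined. The names wrap_signed, BITS and MAX_EXPANSION are illustrative, not quoted from _json.c.

```python
# Python ints never overflow, so 32-bit wraparound is emulated explicitly.
BITS = 32           # assumed width of Py_ssize_t on the affected build
MAX_EXPANSION = 6   # assumed worst-case output chars per input char


def wrap_signed(value, bits=BITS):
    """Reduce value to the signed two's-complement range of the given width."""
    value &= (1 << bits) - 1
    if value >> (bits - 1):
        value -= 1 << bits
    return value


input_chars = 0x2AAAAAAB

# What the bound should be, computed without any width limit.
true_bound = 2 + input_chars * MAX_EXPANSION

# What a wrapping 32-bit computation would produce instead.
wrapped_bound = wrap_signed(2 + wrap_signed(input_chars * MAX_EXPANSION))

print(true_bound)     # 4294967300
print(wrapped_bound)  # 4
```

Under these assumptions 0x2AAAAAAB is the smallest count for which the product with 6 exceeds 2**32, so the computed bound collapses to a tiny value.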

This is a tentative patch:

diff --git a/Modules/_json.c b/Modules/_json.c
index fede6b1..022bbc8 100644
--- a/Modules/_json.c
+++ b/Modules/_json.c
@@ -212,7 +212,16 @@ ascii_escape_unicode(PyObject *pystr)

 /* One char input can be up to 6 chars output, estimate 4 of these */
 output_size = 2 + (MIN_EXPANSION * 4) + input_chars;

exceeds upper boundary");

exceeds upper boundary");

But you still have to take these things into account:

        /* This is an upper bound */
        if (new_output_size > max_output_size) {
            new_output_size = max_output_size;
        }

If the code inside the if block is reached, it merely caps the new size at max_output_size, which can be less than the amount of memory actually required, thereby creating another opportunity for overwrites?
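To make the concern concrete, here is a small Python 3 model of the growth step. It is a sketch, not the code from _json.c: next_buffer_size is a hypothetical helper, the initial size uses the 2 + (6 * 4) + input_chars estimate from the patch context, and the bound of 4 is an assumed example of a bound that was corrupted by an earlier overflow.

```python
def next_buffer_size(output_size, max_output_size):
    """Hypothetical model of the resize step: double, then clamp to the bound."""
    new_output_size = output_size * 2
    # This is an upper bound (mirrors the clamp quoted above).
    if new_output_size > max_output_size:
        new_output_size = max_output_size
    return new_output_size


input_chars = 0x2AAAAAAB
output_size = 2 + (6 * 4) + input_chars  # initial estimate, 715827909
corrupted_bound = 4                      # assumed: bound after a 32-bit wrap

resized = next_buffer_size(output_size, corrupted_bound)
print(resized)                # 4
print(resized < output_size)  # True: the "grown" buffer is smaller
```

With a sane bound the clamp is harmless, but once the bound itself is too small the resize shrinks the buffer while the writer keeps appending escaped characters.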

And this:

        Py_ssize_t new_output_size = output_size * 2;

might overflow, but since output_size is always non-negative (i.e. in the range [0..PY_SSIZE_T_MAX]), an overflow would in practice result in a negative value. That would subsequently be caught in _PyString_Resize():

int
_PyString_Resize(PyObject **pv, Py_ssize_t newsize)
{
    register PyObject *v;
    register PyStringObject *sv;
    v = *pv;
    if (!PyString_Check(v) || Py_REFCNT(v) != 1 || newsize < 0 ||
        PyString_CHECK_INTERNED(v)) {
        *pv = 0;
        Py_DECREF(v);
        PyErr_BadInternalCall();
        return -1;
    }

Nonetheless an overflow from positive to negative is undesirable (it's undefined behaviour).
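One way to avoid relying on the negative-size check is to test before multiplying, so the overflowing product is never computed. The sketch below shows the pattern in Python 3. checked_double is a hypothetical helper, and PY_SSIZE_T_MAX is set to 2**31 - 1 on the assumption of a 32-bit build. In C the equivalent guard would compare output_size against PY_SSIZE_T_MAX / 2 before evaluating output_size * 2.

```python
PY_SSIZE_T_MAX = 2**31 - 1  # assumed value on a 32-bit build


def checked_double(output_size, limit=PY_SSIZE_T_MAX):
    """Return output_size * 2, refusing if the product would exceed limit."""
    if output_size < 0:
        raise ValueError("size must be non-negative")
    # Compare against limit // 2 first so the multiplication cannot overflow.
    if output_size > limit // 2:
        raise OverflowError("doubling would exceed the size limit")
    return output_size * 2


print(checked_double(1000))  # 2000

try:
    checked_double(0x60000000)
except OverflowError as exc:
    print("rejected:", exc)
```

The division happens on a value that is known to fit, so no intermediate result leaves the valid range.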