Issue 3322: bugs in scanstring_str() and scanstring_unicode() of _json module

Created on 2008-07-08 22:58 by vstinner, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description
_json.patch vstinner, 2008-07-19 13:34 A patch to see the problem and maybe fix the crash
Messages (8)
msg69447 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2008-07-08 22:58
The scanstring_str() and scanstring_unicode() functions don't check the end value, even though it can be outside the input string's range. A check like this is needed:

if (end < 0 | len <= end) { PyErr_SetString(PyExc_ValueError, "xxx"); return NULL; }

next is set to begin, but a few lines later (before the first use of next) it's set to end: for (next = end; ...).

In error messages, e.g. "Invalid control character at (...)", begin is used as the character position, but I think the right position is in the variable "end" (or maybe "next"?).

I'm unable to fix these functions because I don't understand the code.
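The range check proposed in this message can be sketched in Python as follows. This is only an illustration: the real functions are written in C, and check_end is a hypothetical helper name that does not exist in _json. It mirrors the "end < 0 | len <= end" condition quoted above; the condition used in the eventual fix may differ.

```python
def check_end(s, end):
    """Raise ValueError if `end` is not a valid index into `s`.

    Hypothetical helper mirroring the check proposed in the message:
    reject any end that is negative or not smaller than len(s).
    """
    if end < 0 or len(s) <= end:
        raise ValueError("end is out of range: %d" % end)
    return end
```

With a check like this in place, an out-of-range second argument is reported as an exception before any indexing into the string happens.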
msg70014 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2008-07-19 11:16
To reproduce the crash, pass a very large negative integer as the second argument. Example:

>>> _json.scanstring("test", -23492394)
Segmentation fault (core dumped)
>>> _json.scanstring(u"test", -1239239)
Segmentation fault (core dumped)
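A regression check for this crash might look like the sketch below. It assumes a modern Python 3, where the function is reachable as json.decoder.scanstring (the report calls _json.scanstring directly), and it assumes that the fixed behavior is to raise ValueError for an out-of-range end. rejects_bad_end is a hypothetical helper name.

```python
from json.decoder import scanstring


def rejects_bad_end(s, end):
    """Return True if scanstring raises ValueError for this end value."""
    try:
        scanstring(s, end)
    except ValueError:
        # Assumed post-fix behavior: an exception instead of a crash.
        return True
    return False


# The two end values used in the report above.
print(rejects_bad_end("test", -23492394))
print(rejects_bad_end("test", -1239239))
```

On an interpreter without the fix, the call would be expected to crash the process rather than return, so a check like this is best run in a subprocess when testing old builds.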
msg70019 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2008-07-19 13:01
Bob, do you know how to fix this?
msg70025 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2008-07-19 13:34
I wrote that I'm unable to fix the bug correctly, but I wrote a patch to avoid the crash:
- replace begin by end in error messages: is it correct?
- use "end < 0 | len <= end" test to check scanstring() second argument => raise a ValueError if end value is invalid
msg70057 - (view) Author: Bob Ippolito (bob.ippolito) * (Python committer) Date: 2008-07-19 21:24
Am I to understand that the bug here is that the C extension doesn't validate input properly if you call into it directly? Without a test I'm not entirely sure exactly how you could possibly get negative values into those functions using the json module as-is.
msg70058 - (view) Author: Bob Ippolito (bob.ippolito) * (Python committer) Date: 2008-07-19 21:48
I've audited the patch. While it does fix the input range, it looks like it regresses other things (at least the error messages): "begin" was used intentionally. The patch is not suitable for use, so I'll create a minimal patch that just fixes input validation.
msg70059 - (view) Author: Bob Ippolito (bob.ippolito) * (Python committer) Date: 2008-07-19 22:00
I just committed a fix to trunk in r65147. Does it need to be ported to py3k?
msg70063 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2008-07-20 07:26
Was merged in r65148.
History
Date User Action Args
2022-04-11 14:56:36 admin set github: 47572
2008-07-20 07:26:12 georg.brandl set status: open -> closed; resolution: fixed; messages: +
2008-07-19 22:00:38 bob.ippolito set assignee: bob.ippolito -> georg.brandl; messages: +
2008-07-19 21:48:14 bob.ippolito set messages: +
2008-07-19 21:24:40 bob.ippolito set messages: +
2008-07-19 13:34:14 vstinner set files: + _json.patch; keywords: + patch; messages: +
2008-07-19 13:01:58 georg.brandl set assignee: bob.ippolito; messages: +; nosy: + georg.brandl, bob.ippolito
2008-07-19 11:16:59 vstinner set messages: +
2008-07-08 22:58:09 vstinner create