msg82437
Author: Mark Dickinson (mark.dickinson)
Date: 2009-02-18 22:45

Two closely related issues in Python/marshal.c, involving writing and reading of variable-length objects (lists, strings, long integers, ...).

(1) The w_object function in marshal contains many instances of code like the following:

    else if (PyList_CheckExact(v)) {
        w_byte(TYPE_LIST, p);
        n = PyList_GET_SIZE(v);
        w_long((long)n, p);
        for (i = 0; i < n; i++) {
            w_object(PyList_GET_ITEM(v, i), p);
        }
    }

On a 64-bit platform there's potential loss of information here, either in the cast "(long)n" (if sizeof(long) is 4) or in w_long itself (if sizeof(long) is 8). Note that w_long, despite its name, always writes exactly 4 bytes.

There should at least be an exception raised here if n is not in the range [-2**31, 2**31). This would make marshalling of large objects illegal (rather than just wrong). A more involved fix would allow marshalling of objects of size >= 2**31. This would obviously involve changing the marshal format, and would make it impossible to marshal a large object on a 64-bit platform and then unmarshal it on a 32-bit platform. The latter may not really be a problem, since memory considerations ought to rule that out anyway.

(2) In r_object (and possibly elsewhere) there are corresponding checks of the form:

    case TYPE_LIST:
        n = r_long(p);
        if (n < 0 || n > INT_MAX) {
            PyErr_SetString(PyExc_ValueError, "bad marshal data");
            retval = NULL;
            break;
        }
        ...

If we allow marshalling of objects with more than 2**31-1 elements then these error checks can be relaxed. (And as a matter of principle, INT_MAX isn't really right here: an int might be only 16 bits long on some strange platforms...)
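
To make the truncation described in (1) concrete, the following small standalone program (an illustration only, not CPython code) mimics the 4-byte write that w_long() performs and shows how a size above 2**31-1 silently loses its high bits:

    /* Illustration only: mimics marshal's 4-byte w_long() write and shows
       how a 64-bit size is silently truncated. */
    #include <stdio.h>
    #include <stdint.h>

    /* Like w_long(): always emits exactly 4 bytes, least significant first. */
    static void write_long32(long x, unsigned char out[4])
    {
        out[0] = (unsigned char)(x & 0xff);
        out[1] = (unsigned char)((x >> 8) & 0xff);
        out[2] = (unsigned char)((x >> 16) & 0xff);
        out[3] = (unsigned char)((x >> 24) & 0xff);
    }

    int main(void)
    {
        int64_t n = ((int64_t)1 << 32) + 5;   /* a "size" of 2**32 + 5 */
        unsigned char buf[4];
        long decoded;

        write_long32((long)n, buf);           /* the lossy cast + 4-byte write */
        decoded = (long)buf[0] | ((long)buf[1] << 8) |
                  ((long)buf[2] << 16) | ((long)buf[3] << 24);
        printf("written %lld, read back %ld\n", (long long)n, decoded);
        return 0;
    }

Whether long is 4 or 8 bytes wide, the value read back is 5: the information is lost before any error can be reported, which is why an explicit range check is needed.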
|
msg82440
Author: Raymond Hettinger (rhettinger)
Date: 2009-02-18 23:02

Given that marshal is primarily about supporting pyc files, do we care?
|
|
msg82570
Author: Mark Dickinson (mark.dickinson)
Date: 2009-02-21 17:32

It wouldn't hurt to add the overflow checks though, would it?
|
|
msg82583
Author: Raymond Hettinger (rhettinger)
Date: 2009-02-21 21:08

Why not. Besides, it ought to be fun to write the test case for this one :-)
|
|
msg181535
Author: Serhiy Storchaka (serhiy.storchaka)
Date: 2013-02-06 15:51

Here is a patch for 3.x which adds checks for size overflow (only on 64-bit platforms).
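
The patch files themselves are not attached to this excerpt. As a rough sketch only, the guard they add is of roughly the following shape (the W_SIZE name appears later in msg192093; the macro body, the SIZEOF_SIZE_T test, and the error signalling are assumptions here, not the committed code):

    /* Sketch, not the committed patch: refuse to write any length that does
       not fit the 4-byte size field.  Only needed where sizes can be wider
       than 32 bits, i.e. on 64-bit builds.  Relies on the caller (w_object)
       returning void. */
    #if SIZEOF_SIZE_T > 4
    #define W_SIZE(n, p) do {                                 \
                if ((n) > 0x7fffffffL) {                      \
                    /* flag the error on p (plumbing omitted) */ \
                    return;                                   \
                }                                             \
                w_long((long)(n), (p));                       \
            } while (0)
    #else
    #define W_SIZE(n, p) w_long((long)(n), (p))
    #endif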
|
|
msg181536
Author: Serhiy Storchaka (serhiy.storchaka)
Date: 2013-02-06 15:52

And here is a patch for 2.7.
|
|
msg181537
Author: Serhiy Storchaka (serhiy.storchaka)
Date: 2013-02-06 15:56

> (And as a matter of principle, INT_MAX isn't really right here: an int
> might be only 16 bits long on some strange platforms...)

AFAIK Python doesn't support such platforms (and the C standard requires at least 32-bit ints). However, there are strange platforms with 64-bit ints. That's why I used the explicit constant 0x7FFFFFFF.
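
Applied to the read-side check quoted in msg82437, that change amounts to something like the following (a sketch; the surrounding r_object() context is abbreviated exactly as in the original report):

    case TYPE_LIST:
        n = r_long(p);
        if (n < 0 || n > 0x7fffffffL) {   /* explicit constant instead of INT_MAX */
            PyErr_SetString(PyExc_ValueError, "bad marshal data");
            retval = NULL;
            break;
        }
        ...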
|
|
msg181554
Author: Antoine Pitrou (pitrou)
Date: 2013-02-06 18:33

Perhaps you could add a bigmem test for this? (assuming you have enough memory somewhere to test it)
|
|
msg181560
Author: Mark Dickinson (mark.dickinson)
Date: 2013-02-06 19:08

> (and the C standard requires at least 32-bit ints)

No, that's not true: see Annex E of the standard, where the minimum for INT_MAX is declared to be 32767.
|
|
msg181561
Author: Stefan Krah (skrah)
Date: 2013-02-06 19:10

Theoretically int may be 16 bits (C99):

    5.2.4.2.1 Sizes of integer types <limits.h>

    -- maximum value for an object of type int
       INT_MAX    +32767    // 2**15 - 1

I agree that Python wouldn't compile on such a platform anyway.
|
|
msg181562
Author: Stefan Krah (skrah)
Date: 2013-02-06 19:11

Hmm, Mark was faster. ;)
|
|
msg181564
Author: Serhiy Storchaka (serhiy.storchaka)
Date: 2013-02-06 19:19

> Perhaps you could add a bigmem test for this?
> (assuming you have enough memory somewhere to test it)

I think this is possible. I now have some experience in writing bigmem tests. ;)
|
|
msg181570
Author: Serhiy Storchaka (serhiy.storchaka)
Date: 2013-02-06 20:10

> No, that's not true: see Annex E of the standard, where the minimum for
> INT_MAX is declared to be 32767.

My fault. Do I have to support such a case in the code?
|
|
msg181572
Author: Mark Dickinson (mark.dickinson)
Date: 2013-02-06 20:25

> Do I have to support such a case in the code?

No, I don't see any need for that: after all, you're making the code *more* portable by replacing those occurrences of INT_MAX with 0x7fffffff. :-)
|
|
msg181573
Author: Serhiy Storchaka (serhiy.storchaka)
Date: 2013-02-06 20:29

Perhaps we should change the signatures of w_string() and r_string() -- replacing "int" with at least "long".
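
A sketch of what the widened declarations could look like. Whether the change would use long or Py_ssize_t is not settled in this message, and the existing declarations are not quoted in the thread, so the exact types below are an assumption:

    /* Sketch: pass lengths as Py_ssize_t instead of int; if r_string()
       reports how many bytes it read, its return type is widened too. */
    static void       w_string(const char *s, Py_ssize_t n, WFILE *p);
    static Py_ssize_t r_string(char *s, Py_ssize_t n, RFILE *p);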
|
|
msg181574
Author: Mark Dickinson (mark.dickinson)
Date: 2013-02-06 20:38

Hmm, yes, I think that would also make sense. I missed those uses of int. I'll give in, though, and accept that >= 32-bit ints are a de facto standard, even if not de jure.
|
|
msg182011
Author: Roundup Robot (python-dev)
Date: 2013-02-13 10:15

New changeset 385d982ce641 by Serhiy Storchaka in branch '2.7':
Issue #5308: Raise ValueError when marshalling too large object (a sequence
http://hg.python.org/cpython/rev/385d982ce641

New changeset e0464fa28c85 by Serhiy Storchaka in branch '3.2':
Issue #5308: Raise ValueError when marshalling too large object (a sequence
http://hg.python.org/cpython/rev/e0464fa28c85

New changeset b48e1cd2d3be by Serhiy Storchaka in branch '3.3':
Issue #5308: Raise ValueError when marshalling too large object (a sequence
http://hg.python.org/cpython/rev/b48e1cd2d3be

New changeset ea36478a36ee by Serhiy Storchaka in branch 'default':
Issue #5308: Raise ValueError when marshalling too large object (a sequence
http://hg.python.org/cpython/rev/ea36478a36ee
|
|
msg182014
Author: Roundup Robot (python-dev)
Date: 2013-02-13 10:34

New changeset 72e75ea25d00 by Serhiy Storchaka in branch '2.7':
Fix tests for issue #5308.
http://hg.python.org/cpython/rev/72e75ea25d00

New changeset 0407e5e5915e by Serhiy Storchaka in branch '3.3':
Cleanup a test for issue #5308.
http://hg.python.org/cpython/rev/0407e5e5915e

New changeset e45f2fcf202c by Serhiy Storchaka in branch 'default':
Cleanup a test for issue #5308.
http://hg.python.org/cpython/rev/e45f2fcf202c
|
|
msg182015
Author: Serhiy Storchaka (serhiy.storchaka)
Date: 2013-02-13 10:38

> Perhaps you could add a bigmem test for this?
> (assuming you have enough memory somewhere to test it)

Some tests require more than 252 GiB of memory.
|
|
msg192093
Author: Christian Heimes (christian.heimes)
Date: 2013-06-30 23:44

The macro W_SIZE at http://hg.python.org/cpython/file/dbdb6f7f9a1a/Python/marshal.c#l130 has introduced a reference leak at http://hg.python.org/cpython/file/dbdb6f7f9a1a/Python/marshal.c#l390. Because W_SIZE returns early in the error case, the reference count of PyObject *utf8 is never decreased.

CID 1040640 (#1 of 1): Resource leak (RESOURCE_LEAK)
leaked_storage: Variable "utf8" going out of scope leaks the storage it points to.
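
For reference, the shape of the leak and of one possible repair, as a sketch only (utf8 and W_SIZE are the names from the report above; how utf8 is created and how errors are signalled are assumptions about the surrounding code, and this is not the committed fix):

    /* Leaky shape: W_SIZE can return out of w_object() on overflow,
       so the freshly created utf8 object is never released. */
    utf8 = PyUnicode_AsEncodedString(v, "utf8", "surrogatepass");
    if (utf8 == NULL)
        goto error;                        /* error path shown schematically */
    W_SIZE(PyBytes_GET_SIZE(utf8), p);     /* an early return here leaks utf8 */
    w_string(PyBytes_AS_STRING(utf8), PyBytes_GET_SIZE(utf8), p);
    Py_DECREF(utf8);

    /* One possible repair: do the range check inline, so the reference
       can be dropped before bailing out. */
    n = PyBytes_GET_SIZE(utf8);
    if (n > 0x7fffffffL) {
        Py_DECREF(utf8);
        goto error;
    }
    w_long((long)n, p);
    w_string(PyBytes_AS_STRING(utf8), n, p);
    Py_DECREF(utf8);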
|
|
msg192131
Author: Serhiy Storchaka (serhiy.storchaka)
Date: 2013-07-01 14:23

Good catch. There is another resource leak for general bytes-like objects (the buffer may not be released). Here is a patch.
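
The second leak is the same shape, but for objects that export a buffer: if the size check bails out between PyObject_GetBuffer() and PyBuffer_Release(), the buffer is never released. A sketch of the pattern (surrounding context and type code are assumed; this is not the attached patch):

    Py_buffer view;
    if (PyObject_GetBuffer(v, &view, PyBUF_SIMPLE) != 0)
        goto error;                        /* error path shown schematically */
    n = view.len;
    if (n > 0x7fffffffL) {
        PyBuffer_Release(&view);           /* release before bailing out */
        goto error;
    }
    w_byte(TYPE_STRING, p);
    w_long((long)n, p);
    w_string(view.buf, n, p);
    PyBuffer_Release(&view);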
|
|
msg192878
Author: Roundup Robot (python-dev)
Date: 2013-07-11 16:21

New changeset 1cf2c42af815 by Serhiy Storchaka in branch '2.7':
Fix reference leaks introduced by the patch for issue #5308.
http://hg.python.org/cpython/rev/1cf2c42af815

New changeset 8b99f2224c3a by Serhiy Storchaka in branch '3.3':
Fix reference leaks introduced by the patch for issue #5308.
http://hg.python.org/cpython/rev/8b99f2224c3a

New changeset 19ed630d8d75 by Serhiy Storchaka in branch 'default':
Fix reference leaks introduced by the patch for issue #5308.
http://hg.python.org/cpython/rev/19ed630d8d75
|
|