Issue 4445: String allocations waste 3 bytes of memory on average. (original) (raw)

Created on 2008-11-27 12:14 by mark.dickinson, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
string_alloc.patch mark.dickinson,2008-11-28 21:44
Messages (12)
msg76495 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2008-11-27 12:14
There are a number of places in Objects/stringobject.c where memory is allocated for a string of length n using: PyObject_MALLOC(sizeof(PyStringObject) + n) On my computer (OS X 10.5.5/Intel), and, I suspect, on most common platforms, the PyStringObject struct is going to contain some number of bytes (probably 3) of trailing padding; the result is that the PyObject_MALLOC call above asks for 3 more bytes than are necessary, and on average the Python interpreter will waste 3 bytes of memory per string allocation. Is there any reason not to replace these calls with: PyObject_MALLOC(offsetof(PyStringObject, ob_sval) + n + 1) instead? Patch attached.
msg76498 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2008-11-27 12:54
Updated patch: fix overflow checks to use offsetof instead of sizeof as well.
msg76503 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-11-27 16:57
+1 on the idea. -1 on the current patch. Can you encapsulate this in a simpler macro? I find the proposed replacement to be cryptic and error-prone (i.e. hard to mentally verify its correctness and somewhat likely to be screwed-up by a future maintainer).
msg76512 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-11-27 18:42
Suggestion: #define PyStringObject_SIZE (offsetof(PyStringObject, ob_sval) + 1) /* Inline PyObject_NewVar */ - op = (PyStringObject *)PyObject_MALLOC(sizeof(PyStringObject) + size); /* Inline PyObject_NewVar */ + op = (PyStringObject *)PyObject_MALLOC(PyStringObject_SIZE + size);
msg76552 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2008-11-28 20:16
Yep. That works nicely. Here's a revised patch.
msg76553 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2008-11-28 20:27
Why is +1 required here? If I understand offsetof() correctly than it returns the position of the ob_sval element. Shouldn't PyStringObject + offsetof(PyStringObject, ob_sval) point to the first element of the ob_svall array?
msg76554 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2008-11-28 20:34
The +1 is there for the trailing null byte on the string: if s is a Python string with len(s) == n, then the ob_sval array needs space for n+1 characters.
msg76555 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2008-11-28 20:46
Ah! I forgot the trailing \0 byte ... Thanks Mark!
msg76558 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2008-11-28 21:44
Hmmm. test_sys fails on 64-bit build. Patch updated to fix this. All tests now pass on 32-bit and 64-bit, debug and non-debug builds.
msg76577 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-11-29 01:48
Looks much cleaner.
msg77078 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2008-12-05 21:56
Applied to the trunk in r67601. Will merge to other branches if the buildbots look okay.
msg77137 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2008-12-06 15:34
Applied to py3k in r67610.
History
Date User Action Args
2022-04-11 14:56:41 admin set github: 48695
2008-12-06 15:34:04 mark.dickinson set status: open -> closedmessages: +
2008-12-05 21:56:59 mark.dickinson set messages: +
2008-11-29 01:48:45 rhettinger set assignee: mark.dickinsonmessages: + resolution: acceptedversions: + Python 3.0
2008-11-28 21:44:55 mark.dickinson set files: - string_alloc.patch
2008-11-28 21:44:48 mark.dickinson set files: + string_alloc.patchmessages: +
2008-11-28 20:46:42 christian.heimes set messages: +
2008-11-28 20:34:09 mark.dickinson set messages: +
2008-11-28 20:27:46 christian.heimes set nosy: + christian.heimesmessages: +
2008-11-28 20:16:11 mark.dickinson set files: - string_alloc.patch
2008-11-28 20:16:04 mark.dickinson set files: + string_alloc.patchmessages: +
2008-11-27 18:42:24 rhettinger set messages: +
2008-11-27 16:57:32 rhettinger set nosy: + rhettingermessages: +
2008-11-27 12:54:47 mark.dickinson set files: - string_alloc.patch
2008-11-27 12:54:43 mark.dickinson set files: + string_alloc.patchmessages: +
2008-11-27 12:14:44 mark.dickinson create