[Python-Dev] Darwin's realloc(...) implementation never shrinks allocations
Tim Peters tim.peters at gmail.com
Mon Jan 3 06:13:22 CET 2005
[Bob Ippolito]
Quite a few notable places in the Python sources expect realloc(...) to relinquish some memory if the requested size is smaller than the currently allocated size.
I don't know what "relinquish some memory" means. If it means something like "returns memory to the OS, so that the reported process size shrinks", then no, nothing in Python ever assumes that. That's simply because "returns memory to the OS" and "process size" aren't concepts in the C standard, and so nothing can be said about them in general -- not in theory, and neither in practice, because platforms (OS+libc combos) vary so widely in behavior here.
As a pragmatic matter, I expect that a production-quality realloc() implementation will at least be able to reuse released memory, provided that the amount released is at least half the amount originally malloc()'ed (and, e.g., reasonable buddy systems may not be able to do better than that).
This is definitely not true on Darwin, and possibly other platforms. I have tested this on OpenBSD and Linux, and the implementations on these platforms do appear to relinquish memory,
As above, don't know what this means.
but I didn't read the implementation. I haven't been able to find any documentation that states that realloc should make this guarantee,
realloc() guarantees very little; it certainly doesn't guarantee anything, e.g., about OS interactions or process sizes.
but I figure Darwin does this as an "optimization" and because Darwin probably can't resize mmap'ed memory (at least it can't from Python, but this probably means it doesn't have this capability at all).
It is possible to "fix" this for Darwin,
I don't understand what's "broken". Small objects go thru Python's own allocator, which has its own realloc policies and its own peculiarities (chiefly that pymalloc never free()s any memory allocated for small objects).
because you can ask the default malloc zone how big a particular allocation is, and how big an allocation of a given size will actually be (see: <malloc/malloc.h>). The obvious place to put this would be PyObject_Realloc, because this is at least called by _PyString_Resize (which will fix <http://python.org/sf/1092502>).
The diagnosis in the bug report seems to leave it pointing at socket.py's _fileobject.read(), although I suspect the real cause is in socketmodule.c's sock_recv(). We've had other reports of various problems when people pass absurdly large values to socket recv(). A better fix here would probably amount to rewriting sock_recv() to refuse to pass enormous numbers to the platform recv() (it appears that many platform recv() implementations simply don't expect a recv() argument to be much bigger than the native network buffer size, and screw up when that's not so).
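One way to read "refuse to pass enormous numbers to the platform recv()" is to cap the length before the call. This works because recv() is already allowed to return fewer bytes than requested. The following is only a minimal sketch under that assumption, not the actual socketmodule.c code. The helper name clamp_recv_len and the max_chunk parameter are hypothetical, and the thread does not say what the cap should be.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of socketmodule.c: limit the length that
   would be handed to the platform recv() to at most max_chunk bytes.
   The caller chooses max_chunk; no particular value is given in the thread. */
static size_t
clamp_recv_len(size_t requested, size_t max_chunk)
{
    assert(max_chunk > 0);
    return requested < max_chunk ? requested : max_chunk;
}
```

A caller would size its buffer from the clamped value and loop if it really needs more data, rather than allocating the full requested amount up front.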
Should I write up a patch that "fixes" this? I guess the best thing to do would be to determine whether the fix should be used at runtime, by allocating a meg or so, resizing it to 1 byte, and see if the size of the allocation changes. If the size of the allocation does change, then the system realloc can be trusted to do what Python expects it to do, otherwise realloc should be done "cleanly" by allocating a new block (returning the original on failure, because it's good enough and some places in Python seem to expect that shrink will never fail),
Yup, that assumption (that a non-growing realloc can't fail) is all over the place.
memcpy, free, return new block.
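The "clean" shrink described above (allocate a new block, memcpy, free, return the new block, and fall back to the original on failure) could be sketched roughly as follows. This is an illustration under stated assumptions, not the patch that was written against CVS trunk. The function name shrink_by_copy is hypothetical, and the caller is assumed to know the old size, which plain realloc() does not require.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of CPython: shrink ptr from old_size to
   new_size bytes by copying into a fresh, smaller block. If the new
   allocation fails, the original block is returned unchanged, so from the
   caller's point of view a shrink never fails. */
static void *
shrink_by_copy(void *ptr, size_t old_size, size_t new_size)
{
    void *fresh;

    assert(ptr != NULL);
    assert(new_size <= old_size);

    /* malloc(0) may legally return NULL, so always request at least 1 byte. */
    fresh = malloc(new_size ? new_size : 1);
    if (fresh == NULL)
        return ptr;   /* original block is still valid and large enough */

    memcpy(fresh, ptr, new_size);
    free(ptr);
    return fresh;
}
```

Returning the original pointer on failure preserves the assumption that a non-growing realloc cannot fail, at the cost of an extra copy on every shrink.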
I wrote up a small hack that does this realloc indirection to CVS trunk, and it doesn't seem to cause any measurable difference in pystone performance. Note that all versions of Darwin that I've looked at (6.x, 7.x, and 8.0b1 corresponding to publicly available WWDC 2004 Tiger code) have this "issue", but it might go away by Mac OS X 10.4 or some later release. This URL points to the sf bug and Darwin 7.7's realloc(...) implementation: http://bob.pythonmac.org/archives/2005/01/01/realloc-doesnt/
It would be good to rewrite sock_recv() more defensively in any case. Best I can tell, this implementation of realloc() is standard-conforming but uniquely brain dead in its downsize behavior. I don't expect the latter will last (as you say on your page, "probably plenty of other software" also makes the same pragmatic assumptions about realloc downsize behavior), so I'm not keen to gunk up Python to worm around it.