[Python-Dev] Darwin's realloc(...) implementation never shrinks allocations
Tim Peters tim.peters at gmail.com
Mon Jan 3 22:49:29 CET 2005
[Tim Peters]
Ya, I understood that. My conclusion was that Darwin's realloc() implementation isn't production-quality. So it goes.
[Bob Ippolito]
Whatever that means.
Well, it means what it said. The C standard says nothing about performance metrics of any kind, and a production-quality implementation of C requires very much more than just meeting what the standard requires. The phrase "quality of implementation" is used in the C Rationale (but not in the standard proper) to cover all such issues. realloc() pragmatics are quality-of-implementation issues; the accuracy of fp arithmetic is another (e.g., if you get back -666.0 from the C expression 1.0 + 2.0, there's nothing in the standard to justify a complaint).
free() can be called either explicitly, or implicitly by calling realloc() with a size larger than the size of the allocation.
From later comments feigning outrage, I take it that "the size of the allocation" here does not mean the specific number the user passed to the previous malloc/realloc call, but means whatever amount of address space the implementation decided to use internally. Sorry, but I assumed it meant the former at first.
...
Was this a good decision? Probably not!
Sounds more like a bug (or two) to me than "a decision", but I don't know.
You said yourself that it is standards compliant ;) I have filed it as a bug, but it is probably unlikely to be backported to current versions of Mac OS X unless a case can be made that it is indeed a security flaw.
That's plausible. If you showed me a case where Python's list.sort() took cubic time, I'd certainly consider that to be "a bug", despite that nothing promises better behavior. If I wrote a malloc subsystem and somebody pointed out "did you know that when I malloc 1024**2+1 bytes, and then realloc(1), I lose the other megabyte forever?", I'd consider that to be "a bug" too (because, docs be damned, I wouldn't intentionally design a malloc subsystem with such behavior; and pymalloc does in fact copy bytes on a shrinking realloc in blocks it controls, whenever at least a quarter of the space is given back -- and it didn't at the start, and I considered that to be "a bug" when it was pointed out).
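The "at least a quarter of the space is given back" rule above can be written down as a small predicate. This is only a sketch of the rule as described in this message, not pymalloc's actual source; the name gives_back_quarter is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 when shrinking a block from old_size
 * to new_size bytes gives back at least a quarter of the block, which
 * is the point at which copying into a smaller block is worthwhile
 * under the rule described above. */
static int gives_back_quarter(size_t old_size, size_t new_size)
{
    size_t quarter_rounded_up;

    assert(new_size <= old_size);   /* only meaningful for a shrink */

    /* ceil(old_size / 4), written so that it cannot overflow */
    quarter_rounded_up = old_size / 4 + (old_size % 4 != 0);
    return old_size - new_size >= quarter_rounded_up;
}
```

Under this rule, shrinking 1024**2+1 bytes down to 1 byte qualifies for a copy, while trimming 100 bytes to 76 does not.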
... Known case? No. Do I want to search Python application-space to find one? No.
Serious problems on a platform are usually well-known to users on that platform. For example, it was well-known that Python's list-growing strategy as of a few years ago fragmented address space horribly on Win9X. This was a C quality-of-implementation issue specific to that platform. It was eventually resolved by improving the list-growing strategy on all platforms -- although it's still the case that Win9X does worse on list-growing than other platforms, it's no longer a disaster for most list-growing apps on Win9X.
If there's a problem with "overallocate then realloc() to cut back" on Darwin that affects many apps, then I'd expect Darwin users to know about that already -- lots of people have used Python on Macs since Python's beginning, "mysterious slowdowns" and "mysterious bloat" get noticed, and Darwin has been around for a while.
...
There is no "choke point" for allocations in Python -- some places call the system realloc() directly. Maybe the latter matter on Darwin too, but maybe they don't. The scope of this hack spreads if they do.
...
In the case of Python, "some places" means "nowhere relevant". Four standard library extension modules relevant to the platform use realloc directly:
sre: Uses realloc only to grow buffers.
cPickle: Uses realloc only to grow buffers.
cStringIO: Uses realloc only to grow buffers.
regexpr: Uses realloc only to grow buffers.
Good!
If Zope doesn't use the allocator that Python gives it, then it can deal with its own problems. I would expect most extensions to use Python's allocator.
I don't know.
...
They're [#ifdef's] also the only good way to deal with platform-specific inconsistencies. In this specific case, it's not even possible to determine if a particular allocator implementation is stupid or not without at least using a platform-allocator-specific function to query the size reserved by a given allocation.
We've had bad experience on several platforms when passing large numbers to recv(). If that were addressed, it's unclear that Darwin realloc() behavior would remain a real issue. OTOH, it is clear that just worming around Darwin realloc() behavior won't help other platforms with problems in the same immediate area of bug 1092502. Gross over-allocation followed by a shrinking realloc() just isn't common in Python. sock_recv() is an exceptionally bad case. More typical is, e.g., fileobject.c's get_line(), where if "a line" exceeds 100 characters the buffer keeps growing by 25% until there's enough room, then it's cut back once at the end. That typical use for shrinking realloc() just isn't going to be implicated in a real problem -- the over-allocation is always minor.
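To see why that kind of over-allocation stays minor, here is a rough sketch of the growth arithmetic only. It is not the code from fileobject.c; the function names are made up for illustration, and the 100-character start and 25% step are taken from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_INITIAL_CAPACITY 100   /* starting size, per the text above */

/* Hypothetical helper: capacity the buffer reaches for a line of
 * line_len characters when it grows by 25% per step. */
static size_t capacity_for_line(size_t line_len)
{
    size_t cap = SKETCH_INITIAL_CAPACITY;

    /* guard so that cap + cap / 4 cannot overflow size_t */
    assert(line_len <= SIZE_MAX / 2);

    while (cap < line_len)
        cap += cap / 4;               /* grow by 25% */
    return cap;
}

/* Bytes that the single shrinking realloc() at the end gives back. */
static size_t excess_before_trim(size_t line_len)
{
    return capacity_for_line(line_len) - line_len;
}
```

For a 126-character line the buffer goes 100, 125, 156, so the final trim gives back 30 bytes. Once the buffer has grown at all, the capacity before the last step was smaller than the line, so the excess is always under 25% of the line length.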
... There's obviously a tradeoff between copying lots of bytes and having lots of memory go to waste. That should be taken into consideration when considering how many pages could be returned to the allocator. Note that we can ask the allocator how much memory an allocation has actually reserved (which is usually somewhat larger than the amount you asked it for) and how much memory an allocation will reserve for a given size. An allocation resize wouldn't even show up as smaller unless at least one page would be freed (for sufficiently large allocations anyway, the minimum granularity is 16 bytes because it guarantees that alignment). Obviously if you have a lot of pages anyway, one page isn't a big deal, so we would probably only resort to free()/memcpy() if some fair percentage of the total pages used by the allocation could be rescued.
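The page-counting tradeoff described above might look roughly like the following. Everything concrete here is an assumption for illustration: the 4096-byte page size, the 25% threshold standing in for "some fair percentage", and the function names. It also ignores the 16-byte granularity that applies to small allocations.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u        /* assumption, not from the thread */
#define SKETCH_MIN_RESCUED_PERCENT 25u /* assumption: "fair percentage" */

/* Number of whole pages needed to hold nbytes. */
static size_t pages_for(size_t nbytes)
{
    return nbytes / SKETCH_PAGE_SIZE + (nbytes % SKETCH_PAGE_SIZE != 0);
}

/* Hypothetical policy: returns 1 when shrinking from old_bytes to
 * new_bytes would rescue at least SKETCH_MIN_RESCUED_PERCENT of the
 * pages the allocation currently occupies. */
static int worth_copying(size_t old_bytes, size_t new_bytes)
{
    size_t old_pages, rescued;

    assert(new_bytes <= old_bytes);

    old_pages = pages_for(old_bytes);
    rescued = old_pages - pages_for(new_bytes);
    if (rescued == 0)
        return 0;                     /* not even one page comes back */

    /* Page counts are at most SIZE_MAX / 4096 (rounded up), so
     * multiplying by 100 cannot overflow. */
    return rescued * 100 >= old_pages * SKETCH_MIN_RESCUED_PERCENT;
}
```

With these assumed numbers, cutting ten pages down to nine is left alone, while cutting 1024**2+1 bytes down to 1 byte rescues 256 of 257 pages and would be copied.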
If it does end up causing some real performance problems anyway, there's always deeper hacks like using vm_copy(), a Darwin-specific function which will do copy-on-write instead (which only makes sense if the allocation is big enough for this to actually be a performance improvement).
As above, I'm skeptical that there's a general problem worth addressing here, and am still under the possible illusion that the Mac developers will eventually change their realloc()'s behavior anyway. If you're convinced it's worth the bother, go for it. If you do, I strongly hope that it keys off a new platform-neutral symbol (say, Py_SHRINKING_REALLOC_COPIES) and avoids Darwin-specific implementation code. Then if it turns out that it is a broad problem (across apps or across platforms), everyone can benefit. PyObject_Realloc() seems the best place to put it. Unfortunately, for blocks obtained from the system malloc(), there is no portable way to find out how much excess was allocated in a release-build Python, so "avoids Darwin-specific implementation code" may be impossible to achieve. The more it can't be used on any platform other than this flavor of Darwin, the more inclined I am to advise just fixing the immediate problem (sock_recv's potentially unbounded over-allocation).
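For concreteness, a wrapper keyed off such a symbol might look something like this. It is a sketch under stated assumptions, not a proposed patch: sketch_realloc is a hypothetical name, the "shrinks to half or less" threshold is arbitrary, and the caller has to pass old_size precisely because there is no portable way to ask the system allocator for it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Defined here only so the sketch exercises the copying path; a real
 * build would define it (or not) per platform. */
#define Py_SHRINKING_REALLOC_COPIES 1

/* Hypothetical wrapper around realloc(). old_size is the size the
 * caller originally requested for the block p. */
static void *sketch_realloc(void *p, size_t old_size, size_t new_size)
{
    assert(p != NULL && old_size > 0 && new_size > 0);

#ifdef Py_SHRINKING_REALLOC_COPIES
    /* Arbitrary threshold for illustration: copy when the block
     * shrinks to half its size or less. */
    if (new_size <= old_size / 2) {
        void *q = malloc(new_size);
        if (q != NULL) {
            memcpy(q, p, new_size);   /* keep the surviving prefix */
            free(p);                  /* really give the old block back */
            return q;
        }
        /* malloc failed: fall through and let realloc() try */
    }
#endif
    (void)old_size;                   /* unused when the symbol is off */
    return realloc(p, new_size);
}
```

On a platform where the symbol is left undefined, the wrapper reduces to a plain realloc() call, so no other platform pays for the workaround.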