[Python-ideas] Preventing out of memory conditions

Max Moroz maxmoroz at gmail.com
Wed Jan 2 13:06:14 CET 2013


On Mon, Dec 31, 2012 at 7:22 PM, Gregory P. Smith <greg at krypto.org> wrote:

Within CPython, the way the C API is today, it is too late by the time the code to raise a MemoryError has been called, so capturing all the places that could occur is not easy. Implementing this at the C-level malloc layer makes more sense. Have it dip into a reserved low-memory pool to satisfy the current request and send the process a signal indicating it is running low. This approach would also work with C extension modules or an embedded Python.

Regarding the C malloc solution, wouldn't a callback be preferable to a signal? If I understood you correctly, a signal implies that a different thread will handle it. Whatever reasonable size the emergency memory pool has, there will be situations when the next memory allocation is greater than that size, leading to the very same problem you described later in your message when you talked about the disadvantage of polling. In addition, if the signal processing is a bit slow (perhaps simply because the thread scheduler is slow to switch), the next memory allocation may arrive before enough memory has been released. Unless I'm missing something, the (synchronous) callback seems strictly better than the (asynchronous) signal.

As to your main point that this functionality should be inside C malloc rather than pymalloc, I agree, but only if the objective is to provide an all-purpose, highly general "low memory condition" handling. (I'm not sure whether malloc knows enough about the OS to define "low memory condition" well, but it's certain that pymalloc doesn't.)

But I was going for a more modest goal. Rather than being warned of a pending MemoryError exception, a developer could simply be notified via a callback when the total memory used by his app exceeds a certain limit. pymalloc could very easily call back a designated function when the next memory allocation would exceed this threshold.
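
A pure-Python toy sketch of what such a hook might behave like (the class and names below are illustrative, not a real pymalloc API): the callback runs synchronously, before the allocation either succeeds or fails, so it gets a chance to evict caches first.

```python
class TrackingAllocator:
    """Toy model of an allocator with a threshold callback (illustrative only)."""

    def __init__(self, limit_bytes, on_threshold):
        self.limit = limit_bytes
        self.used = 0
        self.on_threshold = on_threshold  # called synchronously; may free memory

    def free(self, nbytes):
        self.used -= nbytes

    def alloc(self, nbytes):
        if self.used + nbytes > self.limit:
            # Give the application a chance to evict caches before failing.
            self.on_threshold(self, nbytes)
        if self.used + nbytes > self.limit:
            raise MemoryError(nbytes)
        self.used += nbytes
        return nbytes
```

Because the callback runs on the allocating thread, any memory it frees is guaranteed to be available before the triggering allocation is retried, which is exactly the ordering a signal cannot guarantee.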

In many real-life situations, it's not that hard to estimate how much RAM the application should be allowed to consume. Sure, the developer would need to learn a little about the platforms his app runs on, and use OS-specific rules to set the memory limit, but that effort is modest and the payoff is huge. Not to mention, a developer with particularly technically savvy end users could even skip this work entirely by letting the end users set the memory limit per session.
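
As a sketch of one such OS-specific rule (Linux/Unix, via os.sysconf; other platforms would need their own query — the function name and the default fraction are my own, illustrative choices): derive a default limit as a fraction of total physical RAM.

```python
import os

def default_memory_limit(fraction=0.5):
    """Return a byte limit equal to `fraction` of total physical RAM.

    Uses POSIX sysconf keys available on Linux; illustrative only.
    """
    page_size = os.sysconf("SC_PAGE_SIZE")    # bytes per page
    phys_pages = os.sysconf("SC_PHYS_PAGES")  # total pages of physical RAM
    return int(page_size * phys_pages * fraction)
```

An application could use this as the threshold passed to the proposed pymalloc callback, or let a savvy user override it from a config file.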

There is a huge advantage to the pymalloc solution (with a set memory limit) over the C malloc solution (with the generic low-memory condition). On my system, I don't want the application to use (almost) all the available memory before it starts to manage its cache. In fact, by the time physical memory use approaches my total physical RAM, the system slows down considerably as many other applications get swapped to disk by the OS. With a set memory limit, I can exercise much more granular control over the memory used by the application.

Of course, the set memory limit could also be implemented inside C malloc rather than inside pymalloc. But that would require developers to rewrite the C runtime's memory manager on every platform, and then recompile their Python with it. The changes to pymalloc, on the other hand, would be relatively small.

I'd expect this already exists but I haven't looked for one.

All I found is this comment in XEmacs documentation about vm-limit.c: http://www.xemacs.org/Documentation/21.5/html/internals_17.html, but I'm not sure if it's an XEmacs feature or if malloc itself supports it.

Having a thread polling memory use is not generally wise, as that is polling rather than event driven and could easily miss a low-memory situation until it is too late and a failure has already happened (allocation demand can come in large spikes depending on the application).

Precisely. That's the problem with the best existing solutions (e.g., http://stackoverflow.com/a/7332782/336527).
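
For concreteness, the polling approach looks roughly like this (my own paraphrase of that Stack Overflow answer, with illustrative names; the resource module makes it Unix-only, and ru_maxrss is kilobytes on Linux but bytes on macOS). An allocation spike that comes and fails between two samples is simply invisible to the monitor thread.

```python
import resource
import threading
import time

def start_memory_monitor(limit_kb, on_low_memory, interval=1.0):
    """Poll peak RSS every `interval` seconds; fire `on_low_memory` past limit."""
    def poll():
        while True:
            # ru_maxrss: peak resident set size (KB on Linux, bytes on macOS)
            rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
            if rss > limit_kb:
                on_low_memory(rss)
                return
            time.sleep(interval)  # <-- spikes inside this window go unnoticed
    t = threading.Thread(target=poll, daemon=True)
    t.start()
    return t
```

The sleep between samples is exactly the window Gregory describes: a burst of allocations can raise MemoryError before the monitor ever wakes up.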

OSes running processes in constrained environments, or ones where the available resources can be reduced by the OS later, may already send their own warning signals prior to outright killing the process, but that should not preclude an application being able to monitor and constrain itself on its own without needing the OS to do it.

I was thinking about a regular desktop OS, which certainly doesn't warn the process sufficiently in advance. The MemoryError exception basically tells the process that it's going to die soon, and there's nothing it can do about it.

Max


