[Python-Dev] Unpickling memory usage problem, and a proposed solution
Dan Gindikin dgindikin at gmail.com
Fri Apr 23 23:11:34 CEST 2010
Alexandre Vassalotti <alexandre peadrop.com> writes:
On Fri, Apr 23, 2010 at 3:57 PM, Dan Gindikin <dgindikin gmail.com> wrote:
> This wouldn't help our use case, your code needs the entire pickle
> stream to be in memory, which in our case would be about 475mb, this
> is on top of the 300mb+ data structures that generated the pickle
> stream.

In that case, the best we could do is a two-pass algorithm to remove the unused PUTs. That won't be efficient, but it will satisfy the memory constraint.
That is what I'm doing for us right now.
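For readers following the thread, here is a rough sketch of what such a two-pass approach could look like. It is not the code either of us is running, only an illustration built on pickletools.genops, and it assumes the pickle was written with protocol 3 or lower (explicit PUT opcodes, no MEMOIZE or FRAME). The function names used_memo_ids and strip_unused_puts are made up for this example.

```python
import pickletools


def used_memo_ids(path):
    """Pass 1: collect the memo ids that some GET opcode actually reads."""
    used = set()
    with open(path, "rb") as f:
        for opcode, arg, _pos in pickletools.genops(f):
            if "GET" in opcode.name:
                used.add(arg)
    return used


def strip_unused_puts(src_path, dst_path):
    """Pass 2: copy the stream, skipping PUT opcodes nobody reads back.

    Only the set of used memo ids is held in memory, never the whole
    pickle stream. Assumes protocol <= 3.
    """
    used = used_memo_ids(src_path)
    # One handle is walked opcode by opcode, the other supplies raw bytes.
    with open(src_path, "rb") as ops, \
         open(src_path, "rb") as raw, \
         open(dst_path, "wb") as out:
        prev = None  # (drop this opcode?, its start offset)
        for opcode, arg, pos in pickletools.genops(ops):
            if prev is not None:
                drop, start = prev
                chunk = raw.read(pos - start)  # bytes of the previous opcode
                if not drop:
                    out.write(chunk)
            prev = ("PUT" in opcode.name and arg not in used, pos)
        # The final opcode is STOP; copy it and anything that follows.
        out.write(raw.read())
```

The file is read twice, which is the inefficiency mentioned above, but the working set stays small regardless of the size of the pickle.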
Another solution is to not generate the PUTs at all by setting the 'fast' attribute on Pickler. But that won't work if you have a recursive structure, or have code that requires the identity of objects to be preserved.
We definitely have some cross links amongst the objects, so we need PUTs.
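To illustrate why the 'fast' attribute does not fit data with cross links, here is a small sketch. The helper name dumps_fast is made up for this example, and protocol 2 is an arbitrary choice. With 'fast' set, the pickler keeps no memo, so an object referenced twice is written twice and comes back as two separate objects.

```python
import io
import pickle


def dumps_fast(obj, protocol=2):
    """Pickle obj with the memo disabled, so no PUT opcodes are emitted."""
    buf = io.BytesIO()
    pickler = pickle.Pickler(buf, protocol)
    pickler.fast = True  # skip memoization; shared references are not tracked
    pickler.dump(obj)
    return buf.getvalue()


shared = [1, 2]
obj = [shared, shared]  # two references to the same list

normal = pickle.loads(pickle.dumps(obj, 2))
fast = pickle.loads(dumps_fast(obj))

# The values survive either way, but only the normal pickle keeps the
# two entries pointing at one list.
print(normal[0] is normal[1])
print(fast[0] is fast[1])
```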
By the way, it is weird that the total memory usage of the data structure is smaller than the size of its respective pickle stream. What pickle protocol are you using?
It's the highest protocol, but we have a bunch of extension types that get expanded into Python tuples for pickling.