[Python-Dev] Unpickling memory usage problem, and a proposed solution
Brett Cannon brett at python.org
Fri Apr 23 20:27:09 CEST 2010
On Fri, Apr 23, 2010 at 11:11, Dan Gindikin <dgindikin at gmail.com> wrote:
We were having performance problems unpickling a large pickle file: the running time was 170 s (which was fine), but memory usage was 1100 MB. Memory usage ought to have been about 300 MB. The excess was caused by memory fragmentation, due to many unnecessary "puts" in the pickle stream.
We made a tool, inspired by pickletools.optimize, that could run directly on a pickle file and used pickletools.genops. This solved the unpickling problem (84 s, 382 MB). However, the tool itself was using too much memory and time (1100 s, 470 MB), so I recoded it to scan through the pickle stream directly, without going through pickletools.genops, giving (240 s, 130 MB). Other people who deal with large pickle files are probably having similar problems, and since this comes up when dealing with large data, it is precisely in this situation that you probably can't use pickletools.optimize or pickletools.genops. It feels like functionality that ought to be added to pickletools. Is there some way I can contribute this?
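For illustration only, here is a minimal sketch of what "unnecessary puts" means, using the standard-library pickletools.genops and pickletools.optimize on a small in-memory pickle. It is not the tool described above, which scans a file directly and is not shown in this message. The helper name count_unused_puts and the sample data are hypothetical, and the sketch assumes pickle protocol 2, where memo entries are written with explicit PUT opcodes (protocol 4 and later use MEMOIZE instead, which this sketch does not count).

```python
import pickle
import pickletools


def count_unused_puts(data):
    """Return (total_puts, unused_puts) for a pickle byte string.

    Hypothetical helper: a PUT is "unused" when no GET opcode in the
    stream ever refers to its memo id.
    """
    put_ids = set()
    get_ids = set()
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in ("PUT", "BINPUT", "LONG_BINPUT"):
            put_ids.add(arg)
        elif opcode.name in ("GET", "BINGET", "LONG_BINGET"):
            get_ids.add(arg)
    return len(put_ids), len(put_ids - get_ids)


# Sample data (an assumption for this sketch): one tuple is shared, so
# its memo entry is genuinely needed; other memo entries are never read.
shared = (1, 2)
obj = [shared, shared, "abc"]

data = pickle.dumps(obj, protocol=2)
optimized = pickletools.optimize(data)

print("before:", count_unused_puts(data), len(data), "bytes")
print("after: ", count_unused_puts(optimized), len(optimized), "bytes")
```

Both pickletools.genops and pickletools.optimize here work on the whole pickle held in memory as a byte string, which is what makes them impractical for very large files.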
The best next step is to open an issue at bugs.python.org and upload the patch. I can't make any guarantees on when someone will look at it or if it will get accepted, but putting the code there is your best bet for acceptance.
-Brett