[Python-Dev] PEP 574 (pickle 5) implementation and backport available

Stefan Behnel stefan_ml at behnel.de
Sat May 26 03:12:43 EDT 2018


Antoine Pitrou wrote on 25.05.2018 at 23:11:

> On Fri, 25 May 2018 14:50:57 -0600 Neil Schemenauer wrote:
>
>> On 2018-05-25, Antoine Pitrou wrote:
>>
>>> Do you have something specific in mind?
>>
>> I think compressed by default is a good idea. My quick proposal:
>>
>> - Use fast compression like lz4 or zlib with Z_BEST_SPEED
>> - Add a 'compress' keyword argument with a default of None. For protocol 5, None means to compress. Providing 'compress' != None for older protocols will raise an error.
>
> The question is what purpose does it serve for pickle to do it rather than for the user to compress the pickle themselves. You're basically saving one line of code. Am I missing some other advantage?

Regarding the pickling side, if the pickle is large, then it can save memory to compress while pickling, rather than compressing after pickling. But that can also be done with file-like objects, so the advantage is small here.
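
A minimal sketch of the file-like approach mentioned above, using only the standard library. The helper names `dump_compressed` and `load_compressed` are hypothetical and are not part of pickle or of PEP 574; gzip at level 1 merely stands in for "fast compression".

```python
import gzip
import io
import pickle


def dump_compressed(obj, fileobj, protocol=pickle.HIGHEST_PROTOCOL):
    """Pickle obj into fileobj, compressing as the pickler writes.

    Hypothetical helper: the pickler writes into the GzipFile wrapper,
    so a second, uncompressed copy of the pickle does not have to be
    built in memory before compression starts.
    """
    with gzip.GzipFile(fileobj=fileobj, mode="wb", compresslevel=1) as gz:
        pickle.dump(obj, gz, protocol=protocol)


def load_compressed(fileobj):
    """Inverse of dump_compressed: decompress while unpickling."""
    with gzip.GzipFile(fileobj=fileobj, mode="rb") as gz:
        return pickle.load(gz)


# Round trip through an in-memory buffer.
data = {"values": list(range(1000))}
buf = io.BytesIO()
dump_compressed(data, buf)
buf.seek(0)
restored = load_compressed(buf)
```

How much memory this actually saves depends on the object and on how the pickler buffers its output, so treat it as an illustration rather than a measurement.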

I think a major advantage is on the unpickling side rather than the pickling side. Sure, users can compress a pickle after the fact. But if there is a standard set of algorithms that unpickle can handle automatically, then it is enough to pass "something pickled" into unpickle. The caller no longer has to know (or figure out) whether and how that pickle was originally compressed, nor build up the decompression pipeline needed to get everything uncompressed efficiently without accidentally wasting memory or processing time.
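
To illustrate what such automatic handling could look like from user code, here is a rough sketch that sniffs the leading magic bytes of the input. `loads_auto` is a hypothetical name, not an existing or proposed pickle API, and the choice of gzip, bz2 and xz is an assumption made only because those modules ship with Python.

```python
import bz2
import gzip
import lzma
import pickle

# Leading magic bytes of the container formats tried here (assumed set).
_DECOMPRESSORS = (
    (b"\x1f\x8b", gzip.decompress),      # gzip
    (b"BZh", bz2.decompress),            # bzip2
    (b"\xfd7zXZ\x00", lzma.decompress),  # xz
)


def loads_auto(data):
    """Unpickle data, transparently decompressing known formats.

    Hypothetical helper. Pickles written with protocol 2 or later start
    with b"\\x80", so they cannot collide with the prefixes above; older
    protocols have no such marker, so the detection is a heuristic.
    """
    for magic, decompress in _DECOMPRESSORS:
        if data.startswith(magic):
            data = decompress(data)
            break
    return pickle.loads(data)


obj = {"name": "example", "payload": b"x" * 1000}
plain = pickle.dumps(obj, protocol=4)
results = [
    loads_auto(plain),
    loads_auto(gzip.compress(plain, compresslevel=1)),
    loads_auto(bz2.compress(plain)),
    loads_auto(lzma.compress(plain)),
]
```

This version decompresses fully in memory for brevity; a streaming variant would wrap a file object instead, which is closer to the "without wasting memory" goal described above.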

Obviously, auto-decompression opens the door to compression bombs, but then, unpickling data from untrusted sources is discouraged anyway, so...
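
For reference, the compression-bomb concern can be bounded at the decompression step. The sketch below uses `zlib.decompressobj` with a `max_length` cap; `bounded_decompress` and the chosen limit are illustrative assumptions. It limits only the decompressed size and does nothing to make unpickling untrusted data safe.

```python
import zlib


def bounded_decompress(data, limit):
    """Decompress a zlib stream, refusing to produce more than limit bytes.

    Hypothetical helper: asks for at most limit + 1 bytes so that an
    oversized stream can be detected without inflating all of it.
    """
    d = zlib.decompressobj()
    out = d.decompress(data, limit + 1)
    if len(out) > limit:
        raise ValueError("decompressed size exceeds limit")
    if not d.eof:
        raise ValueError("truncated or incomplete zlib stream")
    return out


small = zlib.compress(b"hello" * 10)
bomb = zlib.compress(b"\0" * 10_000_000)  # tiny input, huge output

ok = bounded_decompress(small, 1_000_000)
try:
    bounded_decompress(bomb, 1_000_000)
    bomb_rejected = False
except ValueError:
    bomb_rejected = True
```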

Stefan


