Safer data serialization with marshal module · Issue #113626 · python/cpython (original) (raw)
Feature or enhancement
The main purpose of the marshal
module -- serialization of precompiled module code objects. This requires support of code objects, strings for names, and other primitive Python types and simple collection types referred by the code object.
It allows to use it as more data generic serialization tool -- more limited than pickle
, but less limited than JSON. The marshal
module supports different versions of the format and backward compatible with all earlier versions. But only if the data does not contain code objects. The format of the code objects changed with every Python version, and this is not reflected in marshal format version. Loading marshal data created in different Python version has undefined behavior if the data contains a code object.
I propose to add a keyword-only parameter allow_code
with default value True in marshal
functions. Specifying allow_code=False
forbid saving and loading code objects. It allows to be safer when you load external data and to guarantee that the output can be safely loaded in other Python.