[Python-Dev] Mysterious Python pyc file corruption problems

Brett Cannon brett at python.org
Thu May 16 23:30:26 CEST 2013


On Thu, May 16, 2013 at 5:19 PM, Guido van Rossum <guido at python.org> wrote:

This reminds me of the following bug, which can happen when two processes are both writing the .pyc file and a third is reading it. First some background.

When writing a .pyc file, we use the following strategy:

- open the file for writing
- write a dummy header (four null bytes)
- write the .py file's mtime
- write the marshalled code object
- replace the dummy header with the correct magic word

Just so people know, this is how we used to do it. In importlib we write the entire file to a temp file and then do an atomic rename.
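For anyone who wants to see the shape of the temp-file-plus-rename approach, here is a minimal sketch. It is not importlib's actual code: the function name `write_atomic` and the temp-file naming are made up for illustration, and it assumes Python 3.3+ for `os.replace`.

```python
import os
import tempfile


def write_atomic(path, data):
    """Write data so readers see either the old file or the complete new one.

    Hypothetical helper for illustration; not importlib's implementation.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the rename stays on
    # one filesystem, which is what makes it atomic on POSIX.
    fd, tmp_path = tempfile.mkstemp(
        dir=directory, prefix=os.path.basename(path) + "."
    )
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        # The destination is never truncated in place; it is swapped whole.
        os.replace(tmp_path, path)
    except BaseException:
        # Do not leave a stray temp file behind if anything failed.
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        raise
```

A process that already has the old file open keeps reading the old contents, since the rename only changes which file the name points to.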

Even py_compile.py (used by compileall.py) uses this strategy.

py_compile as of Python 3.4 now just uses importlib directly, so it matches its semantics.

-Brett

When reading a .pyc file, we ignore it when the magic word isn't there (or when the mtime doesn't match that of the .py file exactly), and then we will write it back like described above.

Now consider the following scenario. It involves three processes.

- Two unrelated processes both start and want to import the same module.
- They both see the .pyc file is missing/corrupt and decide to write it.
- The first process finishes writing the file, writing the correct header.
- Now a third process wants to import the module, sees the valid header, and starts reading the file.
- However, while this is going on, the second process gets ready to write the file.
- The second process truncates the file, writes the dummy header, and then stalls.
- At this point the third process (which thought it was reading a valid file) sees an unexpected EOF because the file has been truncated.

Now, this would explain the EOFError, but not necessarily the ValueError with "unknown type code". However, it looks like marshal doesn't always check for EOF immediately (sometimes it calls getc() without checking the result, and sometimes it doesn't check the error state after calling rstring()), so I think all the errors are actually explainable from this scenario.

--
--Guido van Rossum (python.org/~guido)
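The three-process scenario can be replayed deterministically in a single process by interleaving file handles by hand. This is only a rough sketch under several assumptions: the magic word `DEMO` and the helper names are invented, the header layout is simplified to magic plus a 4-byte mtime, and the reader is opened unbuffered so that a small demo file is not read ahead in full (which would hide the truncation).

```python
import marshal
import os
import tempfile

MAGIC = b"DEMO"        # hypothetical magic word; CPython's real value differs
DUMMY = b"\0\0\0\0"    # the four null bytes written as a placeholder


def old_style_write(path, mtime, code):
    """Old strategy: dummy header, mtime, marshalled code, then the magic."""
    with open(path, "wb") as f:          # opening with "wb" truncates
        f.write(DUMMY)
        f.write(mtime.to_bytes(4, "little"))
        f.write(marshal.dumps(code))
        f.seek(0)
        f.write(MAGIC)                   # patch in the real header last


def simulate_race(path):
    """Return the exception the reader hits, or None if the load succeeds."""
    code = compile("x = 1", "<demo>", "exec")
    old_style_write(path, 12345, code)   # process 1 finishes a valid file

    # Process 3 opens the file and validates the header.
    with open(path, "rb", buffering=0) as reader:
        assert reader.read(4) == MAGIC   # header looks valid
        reader.read(4)                   # skip the mtime field

        # Process 2 truncates the file, writes the dummy header, and stalls.
        with open(path, "wb") as writer:
            writer.write(DUMMY)
            writer.flush()

            # Process 3 resumes reading a file that has shrunk under it.
            try:
                marshal.loads(reader.read())
            except (EOFError, ValueError, TypeError) as exc:
                return exc
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(repr(simulate_race(os.path.join(tmp, "demo.pyc"))))
```

In this simplified replay the reader gets no bytes at all after the header, so the failure is an EOF; the "unknown type code" variant would need the reader to pick up partially rewritten data, which this sketch does not attempt to reproduce.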

