[Python-Dev] Mysterious Python pyc file corruption problems

Brett Cannon brett at python.org
Wed May 15 23:34:02 CEST 2013


On Wed, May 15, 2013 at 4:58 PM, Barry Warsaw <barry at python.org> wrote:

I am looking into a particularly vexing Python problem on Ubuntu that manifests in several different ways. I think the problem is the same one described in http://bugs.python.org/issue13146 and I sent a message on the subject to the ubuntu-devel list: https://lists.ubuntu.com/archives/ubuntu-devel/2013-May/037129.html

I don't know what's causing the problem and have no way to reproduce it, but all the clues point to corrupt pyc files in Python versions before 3.3.

The common way this manifests is a traceback on an import statement. The actual error can be a "ValueError: bad marshal data (unknown type code)" such as in http://pad.lv/1010077 or an "EOFError: EOF read where not expected" as in http://pad.lv/1060842. We have many more instances of both of these. Since both error messages come from marshal.c when trying to read the pyc for a module being imported, I suspect that something is causing the pyc files to get partially overwritten or corrupted.

The workaround is always to essentially blow away the .pyc file and re-create it. (Various different techniques can be used, but they all boil down to the same thing.)

Another commonality is that this bug -- so far -- has not been observed in any Python 3.3 code, only 3.2 and earlier, including 2.7 and 2.6. This strengthens my hypothesis, since importlib in Python 3.3 includes an atomic rename of the .pyc file, whereas older Pythons only do an exclusive open on the pyc files and do not do an atomic rename AFAICT.
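For illustration only, here is a minimal sketch of how one might check whether a pyc file can still be unmarshalled, so that a damaged file can be deleted and regenerated. The function name pyc_is_loadable is hypothetical, and the sketch assumes it runs under Python 3.7 or later, where the pyc header is 16 bytes; the older versions discussed above use a shorter header, so the offset would need adjusting there.

```python
import importlib.util
import marshal

# Assumption: Python 3.7+ pyc layout (4-byte magic, 4-byte flags,
# 8 bytes of mtime/size or source hash). Older Pythons use a shorter header.
PYC_HEADER_SIZE = 16


def pyc_is_loadable(pyc_path):
    """Return True if the pyc at pyc_path has the running interpreter's
    magic number and its body unmarshals without error."""
    with open(pyc_path, "rb") as f:
        data = f.read()
    if len(data) < PYC_HEADER_SIZE:
        return False
    if data[:4] != importlib.util.MAGIC_NUMBER:
        # Written by a different Python version, or the header is damaged.
        return False
    try:
        marshal.loads(data[PYC_HEADER_SIZE:])
    except (EOFError, ValueError, TypeError):
        # The same class of errors the import tracebacks report.
        return False
    return True
```

A file for which this returns False could be removed so that the next import recompiles it from source.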

Just an FYI, the renaming has caught at least one person off-guard: http://bugs.python.org/issue17222, so you might have to be careful about considering a backport.
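To make the pattern under discussion concrete, here is a rough sketch of a write-then-rename, assuming Python 3.3+ for os.replace. It is not the actual importlib code; the helper name write_atomic is made up for this example.

```python
import os
import tempfile


def write_atomic(path, data):
    """Write bytes to a temporary file in the same directory as path,
    then rename it over path so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(
        dir=directory, prefix=os.path.basename(path) + "."
    )
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        # The temporary file lives in the same directory as path, so this
        # rename stays on one filesystem and is atomic on POSIX.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        raise
```

One side effect worth keeping in mind: if path is a symlink or a special file, the rename replaces that entry itself with a regular file rather than writing through it.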

-Brett


