Issue 13863: import.c sometimes generates incorrect timestamps on Windows + NTFS (original) (raw)

Created on 2012-01-25 19:28 by mark.dickinson, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (15)

msg151969 - (view)

Author: Mark Dickinson (mark.dickinson) * (Python committer)

Date: 2012-01-25 19:28

(Marking this as 'Interpreter Core' because the issue really seems to come from Python/import.c rather than from the py_compile module.)

On a Windows 7 VM (64-bit, NTFS) running Python 2.7 (also reproduced on a non-VM Windows 7 installation), I'm seeing the following surprising behaviour:

C:\Python27\Lib>dir decimal.py* Volume in drive C is Local Volume Serial Number is 5083-D43D

Directory of C:\Python27\Lib

06/10/2011 08:46 PM 219,383 decimal.py 01/25/2012 07:05 PM 165,322 decimal.pyc 01/25/2012 04:54 PM 169,386 decimal.pyo 3 File(s) 554,091 bytes 0 Dir(s) 966,266,880 bytes free

C:\Python27\Lib>python Enthought Python Distribution -- www.enthought.com Version: 7.2-2 (64-bit)

Python 2.7.2 |EPD 7.2-2 (64-bit)| (default, Sep 14 2011, 11:25:00) [MSC v.1500 64 bit (AMD64)] on win32 Type "packages", "demo" or "enthought" for more information.

import py_compile; py_compile.compile('decimal.py') import os; os.stat('decimal.pyc').st_mtime 1327518430.306176 import decimal os.stat('decimal.pyc').st_mtime 1327518443.9094632

Notice that in spite of calling py_compile.compile on decimal.py, the .pyc file is still regenerated on import. Relevant details: note that the decimal.py timestamp is in the summer, and that it's currently winter time. Also, my Windows 7 timezone setting is UTC, with 'automatically adjust for DST' set to true.

This bit me today when working with a 'post-install' script for an MSI installer for a large Python application, where the post-install script did a compileall.compile_path to make sure that all the .pyc files were up to date, and thereby avoid a long (> 90 second) application first startup time. It turned out that on application startup some of the .pyc files got regenerated while others didn't, and it took me far too long to notice that it was the .py files with summer timestamps that lead to .pyc regeneration, and that those with winter timestamps were okay.

When I set my timezone to plain UTC with no DST adjustments, the above issue disappears. Also, when I reset my timezone to that of Saudi Arabia (no DST), the issue also disappears.

The cause of the issue seems to be that:

(1) import.c uses an 'fstat' call to get mtime for a .py file. (2) on my Windows installation, the result of fstat on a file varies (a) with time of year, and (b) with computer DST and timezone settings. (No such problems on OS X.) (3) in contrast, py_compile uses os.stat(...).st_mtime, which appears to be invariant under time changes.

The difference between the 'fstat' result and the os.stat result can be seen directly:

Python 2.7.2 |EPD 7.2-2 (64-bit)| (default, Sep 14 2011, 11:25:00) [MSC v.1500 64 bit (AMD64)] on win32 Type "packages", "demo" or "enthought" for more information.

import decimal # make sure .pyc file is up to date import os; os.stat('decimal.py').st_mtime # actual mtime of .py file 1307738784.0 import struct; struct.unpack('LL', open('decimal.pyc', 'rb').read(8))[1] # stored mtime from .pyc; uses fstat 1307735184

This presumably also means that in this part of the world, .pyc files will be regenerated on import after any DST change, since the reported 'fstat' result used by import.c will no longer match the stored mtime timestamp in the .pyc file.

msg151972 - (view)

Author: Mark Dickinson (mark.dickinson) * (Python committer)

Date: 2012-01-25 20:07

Issue reproduced on stock Python 2.7 and Python 3.2 from www.python.org. I don't have a working 3.3 install on Windows, but looking at the source, it seems likely that it's an issue there, too.

msg152001 - (view)

Author: Mark Dickinson (mark.dickinson) * (Python committer)

Date: 2012-01-26 10:00

Also seen on Windows Vista; seems to be a general Windows + NTFS problem.

Changing title to make it clearer that this is a core language issue.

It seems as though the correct fix would be to use something like GetFileInformationByHandle in place of the fstat calls in import.c.

msg152009 - (view)

Author: Antoine Pitrou (pitrou) * (Python committer)

Date: 2012-01-26 14:15

Hmm, interesting. This is exactly what happened recently when debugging pyc timestamp issues under Windows: http://www.python.org/dev/buildbot/all/builders/x86%20Windows7%202.7/builds/1204/steps/test/logs/stdio

Some decoding of the above crash:

As a sidenote, we don't have any tests that the pyc file is re-used when it is fresh enough. Perhaps by running an interpreter in a subprocess with "-v" we could examine the verbose messages printed out in import.c.

It seems as though the correct fix would be to use something like GetFileInformationByHandle in place of the fstat calls in import.c.

We must probably also replace the stat() call (through _Py_stat) with GetFileAttributesEx, or make _Py_stat re-use GetFileAttributesEx.

msg153878 - (view)

Author: Martin v. Löwis (loewis) * (Python committer)

Date: 2012-02-21 16:44

For some reason that I'm unable to follow right now, fstat in the crt does this dance to obtain st_mtime:

Now, SystemTime is broken-down local time, and loctotime32 converts that to "UTC" using an explicit formula taking the current time zone into account. This should work correctly, though, as FileTimeToLocalFileTime also takes into account the current time zone.

Anybody analysing this further might want to write a C program that passes explicit time values into this procedure (see Vc/crt/src/fstat.c and dtoxtime.c for the exact source code), to find out what step exactly goes wrong.

msg165313 - (view)

Author: Antoine Pitrou (pitrou) * (Python committer)

Date: 2012-07-12 15:42

It shouldn't affect 3.3 anymore (importlib using the os module rather than direct C calls to msvcrt). Could someone check?

msg171482 - (view)

Author: Peter Russell (qwertyface)

Date: 2012-09-28 14:49

I can confirm that the current equivalent to Mark's original test case works as expected on default.

I recommend closing this issue.

msg175487 - (view)

Author: Eric Snow (eric.snow) * (Python committer)

Date: 2012-11-13 06:31

actually still valid for previous to 3.3

msg177615 - (view)

Author: Mark Dickinson (mark.dickinson) * (Python committer)

Date: 2012-12-16 17:35

Here's a possible fix for Python 2.7. Reviews from Windows experts would be very much appreciated; I don't really know what I'm doing here.

I'm still trying to work out how to test this effectively.

msg177622 - (view)

Author: Mark Dickinson (mark.dickinson) * (Python committer)

Date: 2012-12-16 19:21

Updated patch, with a test.

msg177632 - (view)

Author: Mark Dickinson (mark.dickinson) * (Python committer)

Date: 2012-12-16 21:51

Updated patch to address comments from Serhiy on Rietveld:

Also improved an error check in the C code: compare with INVALID_HANDLE_VALUE instead of comparing with -1 directly.

msg177893 - (view)

Author: Mark Dickinson (mark.dickinson) * (Python committer)

Date: 2012-12-21 20:10

Whoops. int.from_bytes doesn't exist in Python 2.7.

msg178014 - (view)

Author: Brian Curtin (brian.curtin) * (Python committer)

Date: 2012-12-23 20:01

Latest patch looks ok to me and the tests pass.

msg178071 - (view)

Author: Roundup Robot (python-dev) (Python triager)

Date: 2012-12-24 16:33

New changeset 002d5f3b240d by Mark Dickinson in branch '2.7': Issue #13863: fix incorrect .pyc timestamps on Windows / NTFS (apparently due to buggy fstat) http://hg.python.org/cpython/rev/002d5f3b240d

msg178072 - (view)

Author: Mark Dickinson (mark.dickinson) * (Python committer)

Date: 2012-12-24 16:35

Fixed for 2.7. I don't think this is worth fixing to 3.2, given that it's not an issue for 3.3.

Thanks Brian and Serhiy for reviewing!

History

Date

User

Action

Args

2022-04-11 14:57:26

admin

set

github: 58071

2012-12-24 16:35:40

mark.dickinson

set

status: open -> closed
resolution: fixed
messages: +

2012-12-24 16:33:48

python-dev

set

nosy: + python-dev
messages: +

2012-12-23 20:01:16

brian.curtin

set

messages: +

2012-12-21 20:10:52

mark.dickinson

set

files: + pyc_mtime4.patch

messages: +

2012-12-20 09🔞06

mark.dickinson

set

assignee: mark.dickinson

2012-12-16 21:51:06

mark.dickinson

set

files: + pyc_mtime3.patch

messages: +

2012-12-16 19:21:42

mark.dickinson

set

files: + pyc_mtime2.patch

messages: +

2012-12-16 17:35:30

mark.dickinson

set

files: + pyc_mtime.patch
keywords: + patch

2012-12-16 17:35:16

mark.dickinson

set

messages: +

2012-11-13 06:31:51

eric.snow

set

status: closed -> open

nosy: + eric.snow
messages: +

resolution: works for me -> (no value)

2012-11-13 06:30:56

eric.snow

set

status: open -> closed
resolution: works for me

2012-09-28 14:49:41

qwertyface

set

nosy: + qwertyface
messages: +

2012-07-12 15:42:12

pitrou

set

nosy: + mackeith

messages: +
versions: - Python 3.3

2012-07-12 15:40:55

pitrou

link

issue15333 superseder

2012-02-21 16:44:40

loewis

set

nosy: + loewis
messages: +

2012-01-26 14:17:03

pitrou

set

nosy: + vstinner

2012-01-26 14:15:43

pitrou

set

priority: normal -> high

nosy: + tim.golden, brian.curtin, pitrou
messages: +

components: + Windows
stage: needs patch

2012-01-26 10:00:59

mark.dickinson

set

messages: +
title: Using py_compile does not prevent recompilation due to 'bad mtime'. -> import.c sometimes generates incorrect timestamps on Windows + NTFS

2012-01-25 20:07:57

mark.dickinson

set

messages: +
versions: + Python 3.2, Python 3.3

2012-01-25 19:28:01

mark.dickinson

create