Issue 5128: compileall: consider ctime (original) (raw)

Issue5128

Created on 2009-02-02 14:19 by gagern, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
compileall-ctime.patch gagern,2009-02-02 14:21 Use max(mtime, ctime) of source
compileall-timestamp1.patch gagern,2009-02-02 20:22 Use timestamp stored in compiled file
compileall-timestamp2.patch gagern,2009-02-02 20:42 Use timestamp and magic number
compileall-timestamp3.patch gagern,2009-02-04 09:33 Now with with and test
Messages (15)
msg80939 - (view) Author: Martin von Gagern (gagern) Date: 2009-02-02 14:19
When trying to decide whether or not a given file needs to be recompiled, the inode creation time should be taken into account along with the file modification time. Scenario: Suppose you have three times, A < B < C < D. At time A, you package version 1 of foo.py. At time B, you package version 2. At time C you install version 1 onto some system, and byte-compile it for all users. Therefore foo.py has mtime A and ctime C, and foo.pyc has both mtime C. At time D you delete foo.py from version 1 and install version 2. Then you byte-compile it without force. At that time, foo.py has mtime B (because that was when it was packaged) but ctime D (because that was when the file was created). foo.pyc has mtime C (because that was when it was last modified). The current comparison compares only mtimes, sees C > B, and skips foo.py. With this patch in place, it will check for C > max(B, D), which is not the case, and thus recompile foo.c. In other words, max(st_mtime, st_ctime) can be interpreted as the last time a file at a given path was changed, either through modification of its contents (mtime) or deletion followed by recreation (ctime). Installation systems that overwrite files without deleting them before still are in the same trouble as before, but all others will benefit from this. See also: http://bugs.gentoo.org/256953
msg80978 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2009-02-02 18:30
The problem is not that ctime should be taken into account, but that the .pyc file should be read for its timestamp and that should be used. Otherwise you are still deviating from what Python uses internally to decide whether bytecode should be regenerated.
msg80991 - (view) Author: Martin von Gagern (gagern) Date: 2009-02-02 20:22
Like this? Should the magic number be checked as well, and if so, against what value? I couldn't find that constant in any structure accessible from python, and jumping through hoops seems too much, as updating the python version should probably result in all system-wide modules being updated as well, at least on a well-maintained system.
msg80992 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2009-02-02 20:26
On Mon, Feb 2, 2009 at 12:22, Martin von Gagern <report@bugs.python.org> wrote: > > Martin von Gagern <Martin.vGagern@gmx.net> added the comment: > > Like this? Don't have the time right now to do a code review right now, but hopefully I can get to this soon. > Should the magic number be checked as well, If one is already reading the file you might as well. > and if so, > against what value? imp.get_magic().
msg80993 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2009-02-02 20:29
If you want an easy way to see how bytecode is checked, look at importlib._bootstrap in Python 3.1: http://svn.python.org/view/python/branches/py3k/Lib/importlib/_bootstrap.py?view=markup . Specifically, look at the get_code() method for _PyFileLoader.
msg80996 - (view) Author: Martin von Gagern (gagern) Date: 2009-02-02 20:42
Next iteration. With magic number, and now also closing the file again. I changed from unpack and number comparison to pack and string comparison, makes things a bit easier, as there is only one comparison, and as underflow of the packed data isn't an issue any more. Would you prefer the use of private marshal functions, as bootstrap does, over the struct packing as I do here? Looking forward to your review.
msg81003 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2009-02-02 21:51
On Mon, Feb 2, 2009 at 12:42, Martin von Gagern <report@bugs.python.org> wrote: > > Martin von Gagern <Martin.vGagern@gmx.net> added the comment: > > Next iteration. With magic number, and now also closing the file again. > I changed from unpack and number comparison to pack and string > comparison, makes things a bit easier, as there is only one comparison, > and as underflow of the packed data isn't an issue any more. > > Would you prefer the use of private marshal functions, as bootstrap > does, over the struct packing as I do here? > No since importlib only does it that way for bootstrapping considerations (might end up at the implementation of import itself so it has special requirements). > Looking forward to your review. I have not looked, Martin, but are proper tests in place already in the standard library? If not I would appreciate tests.
msg81004 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2009-02-02 21:52
On Mon, Feb 2, 2009 at 13:50, Brett Cannon <brett@python.org> wrote: > On Mon, Feb 2, 2009 at 12:42, Martin von Gagern <report@bugs.python.org> wrote: >> >> Martin von Gagern <Martin.vGagern@gmx.net> added the comment: >> >> Next iteration. With magic number, and now also closing the file again. >> I changed from unpack and number comparison to pack and string >> comparison, makes things a bit easier, as there is only one comparison, >> and as underflow of the packed data isn't an issue any more. >> >> Would you prefer the use of private marshal functions, as bootstrap >> does, over the struct packing as I do here? >> > > No since importlib only does it that way for bootstrapping > considerations (might end up at the implementation of import itself so > it has special requirements). > >> Looking forward to your review. > > I have not looked, Martin, but are proper tests in place already in > the standard library? If not I would appreciate tests. > Although, now that I said that, I have had testing issues myself when it comes to mtime on bytecode as Python tends to execute faster than the granularity the file system supports.
msg81127 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2009-02-04 03:54
Patch is really close. Can you use a context manager for the file management? That way the file is guaranteed to be closed without issue.
msg81132 - (view) Author: Martin von Gagern (gagern) Date: 2009-02-04 09:33
Not being a regular Python programmer myself, I've never even heard of a context manager before, but this latest patch should fit the bill. As for tests, I believe that a few two-second sleeps (as FAT has only two second resolution iirc) should avoid the kind of problem you describe. This latest patch includes unit tests, where the test_backdate fails for current implementation. The others are there to express even more basic assumptions, and can be dropped if you worry about those seconds spent running obvious tests. The tests are based on test_dircache, from which I copied and adapted the infrastructure for dealing with a temporary directory.
msg81392 - (view) Author: Martin von Gagern (gagern) Date: 2009-02-08 17:15
Any progress with the review? By the way, what branch are we aiming for? I'm using 2.5 here, so I reported this issue against that version, and wrote the patch against that branch. I guess it should work for trunk as well, but the imports of with_statement from __future__ should probably be omitted there.
msg81415 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2009-02-08 20:12
On Sun, Feb 8, 2009 at 09:15, Martin von Gagern <report@bugs.python.org> wrote: > > Martin von Gagern <Martin.vGagern@gmx.net> added the comment: > > Any progress with the review? > I am planning to get to it on Tuesday. > By the way, what branch are we aiming for? 2.7/3.1. It's a backwards-incompatible change so I am not planning on backporting it.
msg81521 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2009-02-10 02:11
Committed in 69481 and 69482 for trunk and py3k, respectively. Had to rewrite the test code but the compileall patch went in fine. Thanks, Martin!
msg173387 - (view) Author: Djoume Salvetti (djoume) Date: 2012-10-20 12:29
This patch introduces a slight change in behaviour. Now compilation will only happen if there is a difference in the number of seconds in the timestamp of files, before this patch any difference in mtime will trigger a rebuild. This is because the timestamp in the pyc file is an integer number of seconds, while mtime from stat is a float time.
msg173450 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2012-10-21 14:35
There is no patch, Djoume, but honestly that's fine since if you want to submit a change to something it should go in a new issue. But honestly compileall needs to be rewritten in Python 3.4 to use importlib and have it control when source code should be rebuilt since the .pyc format changed in Python 3.3 to include source file size.
History
Date User Action Args
2022-04-11 14:56:45 admin set github: 49378
2012-11-13 07:01:24 eric.snow set nosy: + eric.snow
2012-10-21 14:35:05 brett.cannon set messages: +
2012-10-20 12:29:27 djoume set nosy: + djoumemessages: +
2009-02-10 02:11:32 brett.cannon set status: open -> closedresolution: fixedmessages: + stage: patch review -> resolved
2009-02-09 23:44:53 ingmar set nosy: + ingmar
2009-02-08 20:12:20 brett.cannon set messages: +
2009-02-08 17:15:58 gagern set messages: +
2009-02-04 09:33:23 gagern set files: + compileall-timestamp3.patchmessages: +
2009-02-04 03:54:49 brett.cannon set messages: +
2009-02-02 21:52:01 brett.cannon set messages: +
2009-02-02 21:51:01 brett.cannon set messages: +
2009-02-02 20:42:35 gagern set files: + compileall-timestamp2.patchmessages: +
2009-02-02 20:29:09 brett.cannon set messages: +
2009-02-02 20:27:13 brett.cannon set assignee: brett.cannonstage: patch review
2009-02-02 20:26:33 brett.cannon set messages: +
2009-02-02 20:22:49 gagern set files: + compileall-timestamp1.patchmessages: +
2009-02-02 18:30:51 brett.cannon set nosy: + brett.cannonmessages: +
2009-02-02 14:21:13 gagern set files: + compileall-ctime.patchkeywords: + patch
2009-02-02 14:19:34 gagern create