touch some files when we use them by Eh2406 · Pull Request #6477 · rust-lang/cargo
This is a small change to improve the ability for a third party subcommand to clean up a target folder. I consider this part of the push to experiment with out of tree GC, as discussed in #6229.
how it works?
This updates the modification time of a file in each fingerprint folder, and the modification time of the intermediate outputs, every time cargo checks that they are up to date. This allows a third party subcommand to look at the modification time of the timestamp file to determine the last time a cargo invocation required that file. This is far more reliable than the current practice of looking at the accessed time. accessed time is not available or is disabled on many operating systems, and is routinely set by arbitrary other programs.
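As a rough illustration of the idea (not the code in this PR), the two halves could look like the sketch below. The helper names `touch`, `age` and `is_stale` are made up for this example, and it assumes a toolchain where `File::set_modified` is available in std (Rust 1.75 or later).

```rust
use std::fs::{self, File};
use std::io;
use std::path::Path;
use std::time::{Duration, SystemTime};

/// Writer side (hypothetical helper): set a file's modification time to `when`.
/// This stands in for what cargo would do whenever it checks a unit is up to date.
fn touch(path: &Path, when: SystemTime) -> io::Result<()> {
    let file = File::options().write(true).open(path)?;
    file.set_modified(when)
}

/// Reader side (hypothetical helper): how long before `now` was the file last touched?
fn age(path: &Path, now: SystemTime) -> io::Result<Duration> {
    let modified = fs::metadata(path)?.modified()?;
    // A modification time in the future is treated as an age of zero.
    Ok(now.duration_since(modified).unwrap_or(Duration::ZERO))
}

/// A third party tool could treat a file as unused once its age exceeds `max_age`.
fn is_stale(path: &Path, now: SystemTime, max_age: Duration) -> io::Result<bool> {
    Ok(age(path, now)? > max_age)
}
```

Only the modification time is consulted here, so the result does not depend on whether the filesystem records accessed time.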
is this enough to be useful?
The current implementation of cargo sweep on master will automatically use this data with no change to the code. With this PR, it will work even on systems that do not update accessed time.
This also allows a crude script to clean some of the largest subfolders based on each file's modification time.
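A crude script of that kind might look roughly like the following sketch. The function names are hypothetical, the directory to scan is passed in by the caller rather than hard-coded, and only regular files directly inside it are considered (a directory's own modification time does not change when a file inside it is touched).

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::SystemTime;

/// Collect the regular files directly inside `dir` whose modification
/// time is older than `cutoff`. Subdirectories are skipped.
fn stale_files(dir: &Path, cutoff: SystemTime) -> io::Result<Vec<PathBuf>> {
    let mut stale = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        if !entry.file_type()?.is_file() {
            continue;
        }
        if entry.metadata()?.modified()? < cutoff {
            stale.push(entry.path());
        }
    }
    // Sort so the result does not depend on directory iteration order.
    stale.sort();
    Ok(stale)
}

/// Delete every file in `paths`, stopping at the first error.
fn remove_files(paths: &[PathBuf]) -> io::Result<()> {
    for path in paths {
        fs::remove_file(path)?;
    }
    Ok(())
}
```

Separating the listing step from the deleting step makes it easy to print what would be removed before actually removing anything.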
is this worth adding, or should we just build clean --outdated into cargo?
I would love to see a clean --outdated in cargo! However, I think there is a lot of design work before we can make something good enough to deserve the cargo team's stamp of approval, especially as an in-tree version will have to work with many use cases, some of which are yet to be designed (like distributed builds). Even just including cargo-sweep's existing functionality opens a full bike shop about what arguments to take, and in what form (cargo-sweep takes a days argument, but maybe we should have a minutes argument or an ISO standard time or ...). This PR, or an equivalent, allows out of tree experimentation with all sorts of different interfaces, and is basically required for any LRU-based system. (For example, Crater wants a GC that cleans files in an LRU manner to keep a target folder below a target size. This use case is not widely enough needed to be worth adding to cargo, but it is one supported by this PR.)
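To make the LRU case concrete, here is a minimal sketch of how an out of tree tool could choose what to delete once it has collected modification times and sizes. The `Artifact` type and `select_for_eviction` function are invented for this example and do not describe what Crater or cargo-sweep actually do.

```rust
use std::path::PathBuf;
use std::time::SystemTime;

/// Hypothetical record for one file in the target folder.
#[derive(Debug, Clone)]
struct Artifact {
    path: PathBuf,
    /// Modification time, which with this PR approximates "last used by cargo".
    last_used: SystemTime,
    /// Size in bytes.
    size: u64,
}

/// Pick the least recently used artifacts to delete until the remaining
/// total size is at most `budget` bytes. Returns paths in eviction order.
fn select_for_eviction(mut artifacts: Vec<Artifact>, budget: u64) -> Vec<PathBuf> {
    let mut total: u64 = artifacts.iter().map(|a| a.size).sum();
    // Oldest `last_used` first.
    artifacts.sort_by_key(|a| a.last_used);
    let mut evict = Vec::new();
    for artifact in artifacts {
        if total <= budget {
            break;
        }
        total -= artifact.size;
        evict.push(artifact.path);
    }
    evict
}
```

The selection is a pure function over already collected metadata, so the policy can be tested without touching the filesystem.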
what are the downsides?
- There are legitimate performance concerns about writing so many small files during a NOP build.
- There are legitimate concerns about unnecessary writes on read-only filesystems.
- If we add this, and it starts seeing widespread use, we may be de facto stabilizing the folder structure we use. (This is probably true of any system that allows out of tree experimentation.)
- This may not be an efficient way to store the data. (It does have the advantage of not needing different cargos to manipulate the same file. But if you have a better idea please make a suggestion.)