Cache usage meta tracking issue · Issue #7150 · rust-lang/cargo
This issue is to help provide an overview of the different issues around Cargo's excessive disk usage, and tangentially, reducing compile time by reusing artifacts in a shared cache.
Cleaning outdated artifacts
Cargo's target directory can grow substantially over time. Cargo has only limited capabilities for cleaning it with cargo clean. Also, in general, cargo clean has a fair number of bugs and is generally underwhelming.
Various issues and links of interest:
- #5026: cargo ./target fills with outdated artifacts as toolchains are updated/changed
- #5885: How to effectively clean target folder for CI caching
- #6229: Have an option to make Cargo attempt to clean up after itself.
- #6435: Remove artifacts for deps removed from Cargo.lock (issue title: cargo update removes uninstalled deps from target/, when detected they have been removed from Cargo.toml).
- cargo-sweep: A tool to prune unused files.
- The -Z mtime-on-use flag is an experiment to have Cargo update the mtime of used files to make it easier for tools like cargo-sweep to detect which files are stale.
I think a way forward here is to experiment with and investigate different ways of tracking artifacts and last-use timestamps. mtime-on-use has an issue with cached files in Docker. The filename hash is opaque and doesn't provide any insight into the metadata which would inform whether or not an artifact could be removed.
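To make the mtime-based approach concrete, here is a minimal sketch of how a pruning tool might pick out stale files, assuming mtimes reflect last use (which is what -Z mtime-on-use is meant to provide). The helper name find_stale_files and its parameters are hypothetical; this is not how Cargo or cargo-sweep is actually implemented.

```python
import os
import time
from pathlib import Path


def find_stale_files(target_dir, max_age_days, now=None):
    """Return files under target_dir whose mtime is older than max_age_days.

    Hypothetical helper for illustration only; not part of Cargo or cargo-sweep.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    stale = []
    for root, _dirs, files in os.walk(target_dir):
        for name in files:
            path = Path(root) / name
            # lstat judges a symlink by its own mtime, not its target's.
            if path.lstat().st_mtime < cutoff:
                stale.append(path)
    return sorted(stale)
```

Any approach like this is only as good as the mtimes it reads, which is why it is fragile wherever the file system or a cache layer does not preserve them.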
Cargo currently tracks a variety of things in different ways. It has a .json fingerprint file which is generally unused (only for debug logging). It also has an invoked.timestamp file used for some change tracking. And mtime information is used in a few different ways. It might be interesting to experiment with a different way to coordinate all this information. Perhaps a single, unified file tracking all artifacts, or changing the way the per-artifact .json file works. The key point is that it must be fast and reliable, and should work well in Docker.
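As one illustration of the "single, unified file" idea, the sketch below keeps last-use timestamps in one JSON manifest instead of relying on file system mtimes. Everything here is an assumption made for illustration: the manifest layout, the function names, and the artifact names are hypothetical and do not correspond to any existing Cargo format. It also does no locking, which a real implementation would need for concurrent builds.

```python
import json
import time
from pathlib import Path


def load_manifest(manifest_path):
    """Load the (hypothetical) manifest, or return an empty one if it is missing."""
    path = Path(manifest_path)
    if not path.exists():
        return {}
    return json.loads(path.read_text())


def record_use(manifest_path, artifact, now=None):
    """Record that `artifact` was used at time `now` (seconds since the epoch)."""
    manifest = load_manifest(manifest_path)
    manifest[artifact] = {"last_use": time.time() if now is None else now}
    # Write to a temporary file and rename it into place, so a crash
    # cannot leave a half-written manifest behind.
    tmp = Path(str(manifest_path) + ".tmp")
    tmp.write_text(json.dumps(manifest, indent=2, sort_keys=True))
    tmp.replace(manifest_path)


def unused_since(manifest_path, cutoff):
    """Return artifact names whose last recorded use is older than `cutoff`."""
    manifest = load_manifest(manifest_path)
    return sorted(
        name for name, entry in manifest.items() if entry["last_use"] < cutoff
    )
```

Because the timestamps live in the file's contents rather than in file metadata, they survive copies that reset mtimes. The cost is an extra read and write per recorded use, which matters given the requirement that tracking be fast.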
Cleaning cargo's home
Cargo's home directory, ~/.cargo, grows without bound. There is currently no built-in way to shrink it.
The cargo-cache package is the foremost way to manage it currently (besides rm -rf). Ideally some of this would be a built-in capability of Cargo.
The main issue tracking this is #3289: cargo clean ~/.cargo.
There has not been much discussion about this. Ideally cargo would have this capability built in, perhaps with some of the easier/safer tasks automated on a periodic basis.
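As a starting point for deciding what is worth pruning, a small script can report how much space each top-level entry of Cargo's home takes. This is only a sketch: the helper names cargo_home and size_by_subdir are hypothetical, and it measures sizes without deleting anything.

```python
import os
from pathlib import Path


def cargo_home():
    """Return $CARGO_HOME if set, otherwise the default ~/.cargo."""
    return Path(os.environ.get("CARGO_HOME") or Path.home() / ".cargo")


def size_by_subdir(home):
    """Map each top-level entry of `home` to its total size in bytes."""
    sizes = {}
    for entry in Path(home).iterdir():
        if entry.is_dir() and not entry.is_symlink():
            total = 0
            for root, _dirs, files in os.walk(entry):
                for name in files:
                    # lstat so that symlinks are not followed and double-counted.
                    total += os.lstat(os.path.join(root, name)).st_size
        else:
            total = entry.lstat().st_size
        sizes[entry.name] = total
    return sizes
```

For example, `sorted(size_by_subdir(cargo_home()).items(), key=lambda kv: -kv[1])` lists the largest entries first.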
Reusing shared dependencies
sccache is the primary way to share artifacts across projects. It is also possible to share a target directory between projects by setting the CARGO_TARGET_DIR environment variable.
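For the CARGO_TARGET_DIR route, a build wrapper only needs to set that one variable before invoking Cargo. The sketch below shows this; the function names and the example paths are hypothetical, and it assumes cargo is on PATH.

```python
import os
import subprocess


def shared_target_env(shared_dir, base_env=None):
    """Return a copy of the environment with CARGO_TARGET_DIR set to shared_dir."""
    env = dict(os.environ if base_env is None else base_env)
    env["CARGO_TARGET_DIR"] = os.path.abspath(os.path.expanduser(shared_dir))
    return env


def build_with_shared_target(project_dir, shared_dir):
    """Run `cargo build` in project_dir, writing artifacts to shared_dir."""
    return subprocess.run(
        ["cargo", "build"],
        cwd=project_dir,
        env=shared_target_env(shared_dir),
        check=True,  # raise CalledProcessError if the build fails
    )
```

Calling `build_with_shared_target("project-a", "~/shared-target")` and then the same for a second project would place both projects' artifacts in one directory.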
Issues:
- #4301: Suggestion: re-use built dependencies across directories
- #4436: Cache compilations of everything from crates.io
- #5931: Per-user compiled artifact cache
Since this has the potential to use a substantial amount of disk space, it would be desirable to have better support for pruning as listed above.
There are a fairly large number of tools which dig into the target directory. They would all be broken by this change, so we would need to figure out a strategy for migration before doing this. I began this in #6668, but I have not finished. Ideally #6668 and #6577 would be finished before making this change.