Refactor InstallRequirement · Issue #5051 · pypa/pip (original) (raw)
I've had this cooking in my head for a bit now. These aren't critical path for some feature but this definitely something I want to do to make pip more approachable for new contributors and make it easier to reason about the installation flow. I've hinted this before in #4713 (comment).
The idea is that, eventually, the resolver should consume Requirements, get Candidates during resolution and spew out Distributions. The Distributions would then be operated upon during installation/uninstallation etc, downloading/caching would deal with Candidates, listing/freezing/checking would deal with Distributions. This 3-class model works cleanly for each of these use cases.
Currently, all three "jobs" are served by InstallRequirement
(and InstallationCandidate :P), and it's sometimes hard to keep track of what's happening since there's a lot of stuff that's happening in that one class.
This would result in:
- sanity while actually keeping track of the installation flow
- make it easier later on to improve unit test coverage - scope for improving test speed
- better decoupling of UI elements from actual code -- easier Redesigning pip's output (for install, wheel and download commands) #4649
- ease when modifying behaviours for certain kinds of Distributions -- easier to implement PEP 517/518
I'm curious what @pypa/pip-committers and others think about this since, all in all, this work would result in changes in about a quarter of pip._internal, in terms of LOC.
What follows is basically the mind-dump for this (from about 4 months back, I think):
- Refactor Refactor Refactor! {confirm it looks fine, figure out path}
- Do away with all the respect for InstallRequirement and start chopping it.
- Rename
InstallationCandidate
toCandidateDistribution
- Move all "link" related information into
CandidateDistribution
- Create
pip._internal.models.requirement.Requirement
- Holds information that is given by the user
* Essentially, a helpful wrapper aroundpackaging.requirements.Requirement
pip._internal.models.candidate.CandidateDistribution
- Holds information fetched from the index
* package-version visible to the resolver
* url-hash visible to download logic
* TODO: check how is this different from currentInstallationCandidate
* TODO: Figure out how VCS looks with this?pip._internal.models.distribution.Distribution
- Describing the build, install, uninstall and other related behaviours for different kinds of distributions.
* might wrap apkg_resources.Distribution
?
*LegacySourceDistribution
(non PEP 518/517)
*ModernSourceDistribution
(PEP 518/517)
*WheelDistribution
* Maybe do this early?- An intermediator to hold the logic of conversion from
CandidateDistribution
to aDistribution
object
* need to "build" to get dependencies andDistribution
should hold that behaviour, notCandidateDistribution
.
The reason I want to use the pip._internal.models is to represent that these are essentially objects representing data stored elsewhere that would be operated upon and isn't that what models are, in some sense? :P
PS: This is after I am done with zazo and then bringing it in. zazo is the project where I'm making the resolver.