Foundations for non-linear solver and polymorphic application by ilevkivskyi · Pull Request #15287 · python/mypy (original) (raw)

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

ilevkivskyi added a commit that referenced this pull request

Aug 3, 2023

This is a first follow-up for #15287 (I like how my PR titles sound like research paper titles, LOL)

This PR completes the new type inference foundations by switching to a complete and well founded algorithm [1] for transitive closure (that replaces more ad hoc initial algorithm that covered 80% of cases and was good for experimenting with new inference scheme). In particular the algorithm in this PR covers two important edge cases (see tests). Some comments:

I don't intend to switch the default for --new-type-inference, I just want to see the effect of the switch on mypy_primer, I will switch back to false before merging
This flag is still not ready to be publicly announced, I am going to make another 2-3 PRs from the list in #15287 before making this public.
I am not adding yet the unit tests as discussed in previous PR. This PR is already quite big, and the next one (support for upper bounds and values) should be much smaller. I am going to add unit tests only for transitive_closure() which is the core of new logic.
While working on this I fixed couple bugs exposed in TypeVarTuple support: one is rare technical corner case, another one is serious, template and actual where swapped during constraint inference, effectively causing outer/return context to be completely ignored for instances.
It is better to review the PR with "ignore whitespace" option turned on (there is big chunk in solve.py that is just change of indentation).
There is one questionable design choice I am making in this PR, I am adding extra_tvars as an attribute of Constraint class, while it logically should not be attributed to any individual constraint, but rather to the full list of constrains. However, doing this properly would require changing the return type of infer_constrains() and all related functions, which would be a really big refactoring.

[1] Definition 7.1 in https://inria.hal.science/inria-00073205/document

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

ilevkivskyi added a commit that referenced this pull request

Aug 9, 2023

This is a third PR in series following #15287 and #15754. This one is quite simple: I just add basic support for polymorphic inference involving type variables with upper bounds and values. A complete support would be quite complicated, and it will be a corner case to already rare situation. Finally, it is written in a way that is easy to tune in the future.

I also use this PR to add some unit tests for all three PRs so far, other two PRs only added integration tests (and I clean up existing unit tests as well).

ilevkivskyi added a commit that referenced this pull request

Aug 15, 2023

…as (#15837)

This is a third follow-up for #15287 (likely there will be just one more PR, for TypeVarTuples, and few less important items I mentioned in the original PR I will leave for more distant future).

After all this PR turned out to be larger than I wanted. The problem is that Concatenate support for ParamSpec was quite broken, and this caused many of my tests fail. So I decided to include some major cleanup in this PR (I tried splitting it into a separate PR but it turned out to be tricky). After all, if one ignores added tests, it is almost net zero line count.

The main problems that I encountered are:

First, valid substitutions for a ParamSpecType were: another ParamSpecType, Parameters, and CallableType (and also AnyType and UninhabitedType but those seem to be handled trivially). Having CallableType in this list caused various missed cases, bogus get_proper_type()s, and was generally counter-intuitive.
Second (and probably bigger) issue is that it is possible to represent Concatenate in two different forms: as a prefix for ParamSpecType (used mostly for instances), and as separate argument types (used mostly for callables). The problem is that some parts of the code were implicitly relying on it being in one or the other form, while some other code uncontrollably switched between the two.

I propose to fix this by introducing some simplifications and rules (some of which I enforce by asserts):

Only valid non-trivial substitutions (and consequently upper/lower bound in constraints) for ParamSpecType are ParamSpecType and Parameters.
When ParamSpecType appears in a callable it must have an empty prefix.
Parameters cannot contain other Parameters (and ideally also ParamSpecTypes) among argument types.
For inference we bring Concatenate to common representation (because both callables and instances may appear in the same expression). Using the ParamSpecType representation with prefix looks significantly simpler (especially in solver).

Apart from this actual implementation of polymorphic inference is simple/straightforward, I just handle the additional ParamSpecType cases (in addition to TypeVarType) for inference, for solver, and for application. I also enabled polymorphic inference for lambda expressions, since they are handled by similar code paths.

Some minor comments:

I fixed couple minor bugs uncovered by this PR (see e.g. test case for accidental TypeVar id clash).
I switch few tests to --new-type-inference because there error messages are slightly different, and so it is easier for me to test global flip to True locally.
I may tweak some of the "ground rules" if mypy_primer output will be particularly bad.

Co-authored-by: Ivan Levkivskyi ilevkivskyi@hopper.com

ilevkivskyi added a commit that referenced this pull request

Aug 18, 2023

This is the fifth PR in the series started by #15287, and a last one for the foreseeable future. This completes polymorphic inference sufficiently for extensive experimentation, and enabling polymorphic fallback by default.

Remaining items for which I am going to open follow-up issues:

Enable --new-type-inference by default (should be done before everything else in this list).
Use polymorphic inference during unification.
Use polymorphic inference as primary an only mechanism, rather than a fallback if basic inference fails in some way.
Move apply_poly() logic from checkexpr.py to applytype.py (this one depends on everything above).
Experiment with backtracking in the new solver.
Experiment with universal quantification for types other that Callable (btw we already have a hacky support for capturing a generic function in an instance with ParamSpec).

Now some comments on the PR proper. First of all I decided to do some clean-up of TypeVarTuple support, but added only strictly necessary parts of the cleanup here. Everything else will be in follow up PR(s). The polymorphic inference/solver/application is practically trivial here, so here is my view on how I see large-scale structure of TypeVarTuple implementation:

There should be no special-casing in applytype.py, so I deleted everything from there (as I did for ParamSpec) and complemented visit_callable_type() in expandtype.py. Basically, applytype.py should have three simple steps: validate substitutions (upper bounds, values, argument kinds etc.); call expand_type(); update callable type variables (currently we just reduce the number, but in future we may also add variables there, see TODO that I added).
The only valid positions for a variadic item (a.k.a. UnpackType) are inside Instances, TupleTypes, and CallableTypes. I like how there is an agreement that for callables there should never be a prefix, and instead prefix should be represented with regular positional arguments. I think that ideally we should enforce this with an assert in CallableType constructor (similar to how I did this for ParamSpec).
Completing expand_type() should be a priority (since it describes basic semantics of TypeVarLikeTypes). I think I made good progress in this direction. IIUC the only valid substitution for *Ts are TupleType.items, *tuple[X, ...], Any, and <nothing>, so it was not hard.
I propose to only allow TupleType (mostly for semanal.py, see item below), plain TypeVarTupleType, and a homogeneous tuple instances inside UnpackType. Supporting unions of those is not specified by the PEP and support will likely be quite tricky to implement. Also I propose to even eagerly expand type aliases to tuples (since there is no point in supporting recursive types like A = Tuple[int, *A]).
I propose to forcefully flatten nested TupleTypes, there should be no things like Tuple[X1, *Tuple[X2, *Ts, Y2], Y1] etc after semantic analysis. (Similarly to how we always flatten Parameters for ParamSpec, and how we flatten nested unions in UnionType constructor). Currently we do the flattening/normalization of tuples in expand_type() etc.
I suspect build_constraints_for_unpack() may be broken, at least when it was used for tuples and callables it did something wrong in few cases I tested (and there are other symptoms I mentioned in a TODO). I therefore re-implemented logic for callables/tuples using a separate dedicated helper. I will investigate more later.

As I mentioned above I only implemented strictly minimal amount of the above plan to make my tests pass, but still wanted to write this out to see if there are any objections (or maybe I don't understand something). If there are no objections to this plan, I will continue it in separate PR(s). Btw, I like how with this plan we will have clear logical parallels between TypeVarTuple implementation and (recently updated) ParamSpec implementation.

Co-authored-by: Ivan Levkivskyi ilevkivskyi@hopper.com

This was referenced

Aug 18, 2023