[ty] add support for nonlocal statements by oconnor663 · Pull Request #19112 · astral-sh/ruff
Here I think Literal[2] would be unsound. It's clearly wrong in this particular case, since g is never called, so x would still be 1 here. I don't think we want to get into detecting whether/when g is called, at least not for now, but we should aim to give a result that accurately encompasses the possibilities.
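To make the concern concrete, here is a hypothetical reconstruction of the kind of code under discussion (the actual test case isn't reproduced in this comment, so the function body and the call_g flag are assumptions added for illustration). At runtime, the value of x in the outer scope depends entirely on whether g ever runs:

```python
def f(call_g: bool) -> int:
    x = 1

    def g() -> None:
        # Rebinds the `x` that lives in f's scope.
        nonlocal x
        x += 1

    # call_g is a hypothetical switch, added only to show both outcomes.
    if call_g:
        g()

    # If g was never called, x is still 1 here, so inferring
    # Literal[2] for x at this point would be wrong.
    return x
```

With g uncalled, f(False) returns 1; only f(True) produces 2.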
One option here is that we actually try to walk all nested scopes and find their nonlocal assignments to x, and union those into the possible type of x in this scope. This will often be cyclic, in that the type assigned to x in the inner scope may depend on its type in the outer scope, and so if we also introduce a dependency in the other direction, we have a cycle. In fact, this very example would be cyclic, and iterating the cycle would build an ever-growing union of integer literals 1, 2, 3, ... until we hit the maximum size for such a union and fall back to int. (This would accurately reflect that we aren't trying to determine the maximum number of times g can be called, so we have to account for any number of possible calls.)
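As a rough illustration of that cycle, here is a toy model of the iteration. It is not how ty implements cycle handling: the function name, the step callback, and the cap of 8 literals are all assumptions made up for this sketch.

```python
from typing import Callable


def infer_outer_type(
    initial: int,
    step: Callable[[int], int],
    max_literals: int = 8,  # hypothetical cap, not ty's real limit
) -> str:
    """Toy fixpoint iteration over a set of possible integer literals.

    `step` models the inner scope's nonlocal assignment: given one
    possible value of x, it returns the value the inner scope assigns.
    """
    literals = {initial}
    while True:
        # Union in everything the inner scope could assign, given
        # every value x might currently hold.
        widened = literals | {step(v) for v in literals}
        if widened == literals:
            # Fixpoint reached: the union stopped growing.
            return " | ".join(f"Literal[{v}]" for v in sorted(literals))
        if len(widened) > max_literals:
            # The union grew past the cap: fall back to int.
            return "int"
        literals = widened
```

Modelling an inner scope that does x += 1, infer_outer_type(1, lambda v: v + 1) never converges and ends up at int, whereas an inner scope that always assigns 2 converges immediately to Literal[1] | Literal[2].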
I think this is a level of accuracy that other type checkers don't attempt, and one I'd rather not get into, certainly not in this PR. Pyright seems content to totally ignore the possibility of a nonlocal write from an inner scope, even in cases where it definitely occurs.
Another approach is that we just enforce that the nonlocal writes in the inner scope(s) respect the declared type (if any) of the name in the outer scope, and in the outer scope we refuse to narrow from that declared type, because we don't know where a call to the inner scope might occur and invalidate our narrowing. So in undeclared cases, like this one, the declared type would be Unknown. Ideally I think what we'd do is union the declared and inferred types, which would result in Unknown | Literal[1] here. This is saying "we know x is nonlocal in a nested scope, and we aren't trying to analyze what might be assigned to it there, therefore we don't know the upper bound on its possible type, but we know the lower bound is Literal[1], since we've seen that assigned in this scope."
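A small sketch of why the outer scope can't safely narrow (the names outer and bump are hypothetical, and the types in the comments describe the proposal above rather than current behavior):

```python
def outer() -> tuple[int, int]:
    x = 1  # undeclared: declared type would be Unknown, inferred Literal[1]

    def bump() -> None:
        nonlocal x
        x = x + 10

    before = x  # 1 at runtime
    bump()      # a call like this can occur anywhere in the outer scope
    after = x   # no longer 1, so narrowing x to Literal[1] would be unsound

    # Under the proposal, x in this scope would be Unknown | Literal[1]
    # both before and after the call.
    return before, after
```

Because the analysis doesn't track where bump is called, the same un-narrowed type has to hold at every use of x in the outer scope.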
In a case where x is declared in the outer scope (e.g. x: int = 1, which we'd also want to add a test for), inner scopes with nonlocal x would require that only things assignable to int can be assigned to x, and the outer scope would take int | Literal[1] as the type of x, which simplifies to int.
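The declared case might look something like this in a test (a sketch only; the function names are made up, and the comments describe the diagnostics the proposal would aim for, not output copied from ty):

```python
def outer() -> int:
    x: int = 1  # declared type int; outer scope sees int | Literal[1], i.e. int

    def ok() -> None:
        nonlocal x
        x = 42  # fine: Literal[42] is assignable to int

    def bad() -> None:
        nonlocal x
        # Under the proposal this assignment would be rejected,
        # since str is not assignable to the declared type int.
        x = "oops"

    ok()  # bad() is deliberately never called in this sketch
    return x
```

Here the outer scope never needs to know what ok or bad assign; it only relies on the declared type being respected.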
I would be inclined to go for the latter approach in this PR, but open to other suggestions/ideas.
Let me know if this doesn't make sense.