Indexing TypedDict with a non-literal variable

April 19, 2025, 10:49pm 1

The typing spec is less clear than it could be when it comes to indexing a TypedDict with a variable whose type is not a literal string. Here’s what it says:

A key that is not a literal should generally be rejected, since its value is unknown during type checking, and thus can cause some of the above violations.

The use of the word “generally” seems to imply that there are cases where this rule does not need to be enforced, but it doesn’t explain what exceptions exist. Does this allow individual type checkers to decide if and when to enforce this rule?

Mypy (which was the reference implementation for PEP 589) is stricter in this regard than pyright. Pyright’s current behavior was implemented based on feedback from both pyright and mypy users. There are a number of common idioms in Python that are safe and lead to false positive errors if this rule is enforced. Here are a few examples:

def func1(d: TD):
    for x in d:
        print(f"{x}: {d[x]}") # Mypy error: TypedDict key must be a string literal

def func2(d: TD, s: str):
    if s in d:
        print(d[s]) # Mypy error: TypedDict key must be a string literal

This enforcement extends to methods other than __getitem__, __setitem__, and __delitem__. It also affects the synthesized methods pop and setdefault.
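
As a rough illustration of that last point, the sketch below passes a plain `str` key to `pop` and `setdefault`. The `Options` class and both helper functions are hypothetical names made up for this example. The calls succeed at runtime because a TypedDict instance is an ordinary `dict`; a checker applying the strict interpretation would be expected to reject them, and what any particular checker actually reports is not asserted here.

```python
from typing import TypedDict


class Options(TypedDict, total=False):  # hypothetical example class
    name: str
    verbose: bool


def remove_key(opts: Options, key: str) -> object:
    # `key` is a plain str, not a literal, so a strict checker cannot tell
    # which item (if any) is being removed and may flag this call.
    return opts.pop(key, None)


def ensure_key(opts: Options, key: str, default: object) -> object:
    # Same situation for setdefault: the key is only known at runtime.
    return opts.setdefault(key, default)


opts: Options = {"name": "demo"}
removed = remove_key(opts, "name")
added = ensure_key(opts, "verbose", True)
print(removed, added, opts)
```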

The behavioral difference between mypy and pyright hasn’t been an issue before now, but a recent PR proposes to add test cases to the conformance test suite that interpret the current language in the spec in a strict manner. If this interpretation is codified in the conformance tests, I’ll feel compelled to make pyright adopt this stricter interpretation. I can say from experience that this change will be unpopular among pyright users.

Before going down that road, I wanted to get input from the typing community and see if there is consensus on this topic — and to see if we can crisp up the language in the spec to eliminate ambiguity.

I think there are two options:

1. Mandate Strictness: Modify the spec to simply remove the word “generally”. (Can anyone think of a case where it should be exempted other than Any?) The wording should also probably be more precise than “not a literal” since LiteralString is a literal but should not be exempt from this stricter interpretation of the rule.
2. Optional Enforcement: Modify the spec to allow compliant type checkers to choose whether to enforce this rule by replacing “should generally” with “may”. This would allow both mypy and pyright to retain their current behaviors.

My personal preference is option 2 because I think pyright’s current behavior strikes a good balance between identifying common bugs and allowing common use cases without incurring false positive errors. However, if there is consensus around option 1, then I will honor that decision and modify pyright accordingly.

tusharc (tushar c.) April 19, 2025, 11:31pm 2

As a user, I find pyright’s behavior to be extremely useful. It is pragmatic and lets me be more expressive in code. I support option 2 because it lets type checkers be more useful without compromising safety.

I think there’s a line between strictness and optional enforcement that could reach something closer to what pyright provides here, without defining it as “leave this to diverge”.

The way to do this would be to say that the type of the value depends on how much knowledge a type checker has. This corresponds pretty closely to the approach being used for other in-progress work, while reducing false positives:

If it’s a Literal that’s known to be a key, and the specific literal is known, it’s the corresponding value.
If the literal is known to be a key, and the specific literal is not known, it’s the union of possible values.
If it’s possibly a string of unknown value, it’s the union of possible values. (Key errors are not modeled in the type system more precisely than a possible Never, which evaporates in a Union.)
If it’s not a string, or it’s a string literal known not to be in a total typed dict, it’s a true positive usage error.
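
The four rules above can be modeled at runtime with a small sketch. This is only an illustration of the proposal, not how any type checker implements it; `Movie` and `value_type_for_key` are hypothetical names, the TypedDict is assumed to be total, and the rules are numbered in the order they are listed.

```python
from typing import Literal, TypedDict, Union, get_args, get_origin, get_type_hints


class Movie(TypedDict):  # hypothetical example class
    title: str
    year: int


def value_type_for_key(td: type, key_type: object) -> object:
    """Model the four rules for a total TypedDict (illustration only)."""
    hints = get_type_hints(td)
    if get_origin(key_type) is Literal:
        names = get_args(key_type)
        missing = [name for name in names if name not in hints]
        if missing:
            # Rule 4: a literal known not to be a key is a usage error.
            raise TypeError(f"not a key of {td.__name__}: {missing}")
        # Rule 1 (one known literal) and rule 2 (several candidate literals):
        # the union of the corresponding value types.
        return Union[tuple(hints[name] for name in names)]
    if key_type is str:
        # Rule 3: a string of unknown value yields the union of all values.
        return Union[tuple(hints.values())]
    # Rule 4: a key that is not a string is a usage error.
    raise TypeError(f"invalid key type: {key_type!r}")


print(value_type_for_key(Movie, Literal["title"]))
print(value_type_for_key(Movie, Literal["title", "year"]))
print(value_type_for_key(Movie, str))
```

Under this model a `for x in d` loop would see `d[x]` as the union of all value types rather than as an error.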

hauntsaninja (Shantanu Jain) April 20, 2025, 2:56am 4

Option 2 sounds right to me

Andrej730 (Andrej) April 20, 2025, 6:21am 5

I think we should probably go for option 2, as it’s more realistic and follows the acknowledgement in the following paragraph that tracking the existence of keys is hard. It’s probably even harder if the key is a str and not a literal.

Type checkers may allow reading an item using d['x'] even if the key 'x' is not required, instead of requiring the use of d.get('x') or an explicit 'x' in d check. The rationale is that tracking the existence of keys is difficult to implement in full generality, and that disallowing this could require many changes to existing code.

Though I’d love to have a strict option, as it may help to prevent some code errors and push users who agree with this to use more literal types.
But I see how this can be annoying, as there’s no direct way to statically check whether some TypedDict has a key x besides manually creating a container with the available keys and maintaining it in parallel with the typed dict class.

# pyright: strict
from typing import TypedDict


class TD(TypedDict):
    start_date: str
    period: str
    end_date: str

# Sadly, user would have to maintain it manually.
TD_KEYS = frozenset(("start_date", "period", "end_date"))

def test(td: TD, key: str):
    if key not in TD.__annotations__:
        reveal_type(key)  # str
    if key not in TD_KEYS:
        raise Exception("Invalid key")
    # Good to go.
    reveal_type(key)  # Literal['start_date', 'period', 'end_date']
    print(td[key])

antonagestam (Anton Agestam) April 20, 2025, 9:46am 6

Does option two allow some type unsafety with regard to extra keys, possibly introduced by a subtype? I have no opinion on whether it’s justified, but it might be valuable to discuss. Consider the linked example: will option two remove the current true positive?

Andrej730 (Andrej) April 20, 2025, 11:56am 7

Option 2 allows each type checker to decide the level of safety for that case individually. So option 2 does not necessarily remove the true positive; it just makes it non-mandatory for type checkers.

But yes, if a type checker decides not to reject str, then it does remove the true positive for this case. See pyright-play:

Just to expand on this: with the introduction of PEP 728 (TypedDict with Typed Extra Items), if type checkers adopt it, it will be much easier to narrow str to individual literals for a closed typed dict, and rejecting non-literal keys may seem more appealing.

jorenham (Joren Hammudoglu) April 25, 2025, 4:15pm 8

I don’t think that option 1 would resolve this false positive, because key is already inferred as a Literal["b"], and o[key] is inferred as Never (i.e. assignable to anything):

from typing import TypedDict


class A(TypedDict):
    a: int

class B(A):
    b: str


def fn(o: A) -> int:
    key = "b"
    if key in o:
        reveal_type(key)  # Type of "key" is "Literal['b']"
        reveal_type(o[key])  # Type of "o[key]" is "Never"
        return o[key]
    return o["a"]

This isn’t so much a false positive as just the result of lying to the typechecker. This is bad type information for the behavior you’re expressing here. If the interior of the function relies on knowledge that it may receive something other than A, the annotation for what it expects should reflect that.

Code sample in pyright playground

from typing import TypedDict


class A(TypedDict):
    a: int

class B(A):
    b: str


def fn(o: A | B) -> int:
    key = "b"
    if key in o:
        reveal_type(key)  # Type of "key" is "Literal['b']"
        reveal_type(o[key])  # Type of "o[key]" is "str"
        return o[key]  # proper error for mismatched return.
    return o["a"]


fn(B(
    a=1,
    b="no longer oops",
))

Your constructed case is basically nonsense, because A doesn’t define the extra key, but you’re relying on what happens when you access it after confirming it is present.

If the intent is instead to not allow extra keys, you shouldn’t be accessing stuff without a type known, and should close A:

Code sample in pyright playground

from typing import TypedDict


class A(TypedDict, closed=True):
    a: int

class B(A):  # proper error here
    b: str


def fn(o: A) -> int:
    key = "b"
    if key in o:
        reveal_type(key)  # Type of "key" is "Literal['b']"
        reveal_type(o[key])  # Type of "o[key]" is "Never"
        return o[key]
    return o["a"]


fn(B(  # and proper error here
    a=1,
    b="no longer oops",
))

But I can’t exactly say this is your intent because you’re intentionally accessing something you’ve declared no knowledge of to the type system.

Liz (Elizabeth King) April 25, 2025, 6:14pm 10

I’d go with option 1 here. There’s no reason why someone who has specified they are using a typed dict should be indexing with non-literals.

If you go with number 2, pyright’s behavior should still be amended. The above discussion missed the actual issue here: if pyright is going to allow accessing a value through a key that has no associated type declared but is known to exist, it should not infer a type of Never. There’s clearly a runtime value, and the most that can be known about it is object, since this isn’t a gradual type.
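
The runtime side of this point can be seen in a short sketch. `read_extra` is a hypothetical name, and the return annotation is deliberately `object`, which is the most that can be said statically about the extra item.

```python
from typing import TypedDict


class A(TypedDict):
    a: int


class B(A):
    b: str


def read_extra(o: A) -> object:
    key = "b"
    if key in o:
        # At runtime this branch is reachable and yields a real value,
        # so the expression cannot sensibly have type Never.
        return o[key]
    return o["a"]


value = read_extra(B(a=1, b="oops"))
print(type(value).__name__, value)  # prints: str oops
```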

jorenham (Joren Hammudoglu) April 25, 2025, 8:08pm 11

To avoid any confusion: who are you addressing here?

@jorenham I replied directly to you using the reply button, and modified your examples. I would have thought it was clear I was responding to you. Discourse removed the reply reference because it was a direct reply to the previous post. I also can’t make it any clearer by quoting your post without only partially quoting it, as Discourse strips a quote of the most recent post unless you avoid quoting the whole thing, even if the whole thing is what is being responded to.

@Liz pointed out that the root issue is you’re relying on pyright doing something demonstrably incorrect in your example, but either way, the example is nonsensical, and something is clearly wrong that pyright allows this.

erictraut (Eric Traut) April 26, 2025, 7:32pm 13

@Liz, I agree with your analysis. The code in Joren’s example demonstrates a bug in pyright’s type narrowing logic. I’ve created this bug report to track the issue.

jorenham (Joren Hammudoglu) April 26, 2025, 11:07pm 14

Ah ok, that explains the confusion then. My post was a reaction to Indexing TypedDict with a non-literal variable - #7 by Andrej730. The example was just a slight modification of the example from that post (see the playground link). So I was basically trying to show the same thing as you also noted, @mikeshardmind.