PEP 483 – The Theory of Type Hints | peps.python.org (original) (raw)

Author:

Guido van Rossum , Ivan Levkivskyi

Discussions-To:

Python-Ideas list

Status:

Final

Type:

Informational

Topic:

Typing

Created:

19-Dec-2014

Post-History:


Table of Contents

Abstract

This PEP lays out the theory referenced by PEP 484.

Introduction

This document lays out the theory of the new type hinting proposal for Python 3.5. It’s not quite a full proposal or specification because there are many details that need to be worked out, but it lays out the theory without which it is hard to discuss more detailed specifications. We start by recalling basic concepts of type theory; then we explain gradual typing; then we state some general rules and define the new special types (such as Union) that can be used in annotations; and finally we define the approach to generic types and pragmatic aspects of type hinting.

Notational conventions

Background

There are many definitions of the concept of type in the literature. Here we assume that type is a set of values and a set of functions that one can apply to these values.

There are several ways to define a particular type:

It is important for the user to be able to define types in a form that can be understood by type checkers. The goal of this PEP is to propose such a systematic way of defining types for type annotations of variables and functions using PEP 3107 syntax. These annotations can be used to avoid many kind of bugs, for documentation purposes, or maybe even to increase speed of program execution. Here we only focus on avoiding bugs by using a static type checker.

Subtype relationships

A crucial notion for static type checker is the subtype relationship. It arises from the question: If first_var has type first_type, andsecond_var has type second_type, is it safe to assignfirst_var = second_var?

A strong criterion for when it should be safe is:

The relation defined thus is called a subtype relation.

By this definition:

An intuitive example: Every Dog is an Animal, also Doghas more functions, for example it can bark, therefore Dogis a subtype of Animal. Conversely, Animal is not a subtype of Dog.

A more formal example: Integers are subtype of real numbers. Indeed, every integer is of course also a real number, and integers support more operations, such as, e.g., bitwise shifts << and >>:

lucky_number = 3.14 # type: float lucky_number = 42 # Safe lucky_number * 2 # This works lucky_number << 5 # Fails

unlucky_number = 13 # type: int unlucky_number << 5 # This works unlucky_number = 2.72 # Unsafe

Let us also consider a tricky example: If List[int] denotes the type formed by all lists containing only integer numbers, then it is not a subtype of List[float], formed by all lists that contain only real numbers. The first condition of subtyping holds, but appending a real number only works with List[float] so that the second condition fails:

def append_pi(lst: List[float]) -> None: lst += [3.14]

my_list = [1, 3, 5] # type: List[int]

append_pi(my_list) # Naively, this should be safe...

my_list[-1] << 5 # ... but this fails

There are two widespread approaches to declare subtype information to type checker.

In nominal subtyping, the type tree is based on the class tree, i.e., UserID is considered a subtype of int. This approach should be used under control of the type checker, because in Python one can override attributes in an incompatible way:

class Base: answer = '42' # type: str

class Derived(Base): answer = 5 # should be marked as error by type checker

In structural subtyping the subtype relation is deduced from the declared methods, i.e., UserID and int would be considered the same type. While this may occasionally cause confusion, structural subtyping is considered more flexible. We strive to provide support for both approaches, so that structural information can be used in addition to nominal subtyping.

Summary of gradual typing

Gradual typing allows one to annotate only part of a program, thus leverage desirable aspects of both dynamic and static typing.

We define a new relationship, is-consistent-with, which is similar to is-subtype-of, except it is not transitive when the new type Any is involved. (Neither relationship is symmetric.) Assigning a_valueto a_variable is OK if the type of a_value is consistent with the type of a_variable. (Compare this to “… if the type of a_valueis a subtype of the type of a_variable”, which states one of the fundamentals of OO programming.) The is-consistent-with relationship is defined by three rules:

That’s all! See Jeremy Siek’s blog post What is Gradual Typingfor a longer explanation and motivation. Any can be considered a type that has all values and all methods. Combined with the definition of subtyping above, this places Any partially at the top (it has all values) and bottom (it has all methods) of the type hierarchy. Contrast this toobject – it is not consistent with most types (e.g. you can’t use an object() instance where anint is expected). IOW both Any and object mean “any type is allowed” when used to annotate an argument, but only Anycan be passed no matter what type is expected (in essence, Anydeclares a fallback to dynamic typing and shuts up complaints from the static checker).

Here’s an example showing how these rules work out in practice:

Say we have an Employee class, and a subclass Manager:

class Employee: ... class Manager(Employee): ...

Let’s say variable worker is declared with type Employee:

worker = Employee() # type: Employee

Now it’s okay to assign a Manager instance to worker (rule 1):

It’s not okay to assign an Employee instance to a variable declared with type Manager:

boss = Manager() # type: Manager boss = Employee() # Fails static check

However, suppose we have a variable whose type is Any:

something = some_func() # type: Any

Now it’s okay to assign something to worker (rule 2):

Of course it’s also okay to assign worker to something (rule 3), but we didn’t need the concept of consistency for that:

Types vs. Classes

In Python, classes are object factories defined by the class statement, and returned by the type(obj) built-in function. Class is a dynamic, runtime concept.

Type concept is described above, types appear in variable and function type annotations, can be constructed from building blocks described below, and are used by static type checkers.

Every class is a type as discussed above. But it is tricky and error prone to implement a class that exactly represents semantics of a given type, and it is not a goal of PEP 484.The static types described in PEP 484 should not be confused with the runtime classes. Examples:

Typing interface is implemented with classes, i.e., at runtime it is possible to evaluate, e.g., Generic[T].__bases__. But to emphasize the distinction between classes and types the following general rules apply:

Fundamental building blocks

We might add:

Generic types

The fundamental building blocks defined above allow to construct new types in a generic manner. For example, Tuple can take a concrete type floatand make a concrete type Vector = Tuple[float, ...], or it can take another type UserID and make another concrete typeRegistry = Tuple[UserID, ...]. Such semantics is known as generic type constructor, it is similar to semantics of functions, but a function takes a value and returns a value, while generic type constructor takes a type and “returns” a type.

It is common when a particular class or a function behaves in such a type generic manner. Consider two examples:

To allow type annotations in situations from the first example, built-in containers and container abstract base classes are extended with type parameters, so that they behave as generic type constructors. Classes, that behave as generic type constructors are called generic types. Example:

from typing import Iterable

class Task: ...

def work(todo_list: Iterable[Task]) -> None: ...

Here Iterable is a generic type that takes a concrete type Taskand returns a concrete type Iterable[Task].

Functions that behave in the type generic manner (as in second example) are called generic functions. Type annotations of generic functions are allowed by type variables. Their semantics with respect to generic types is somewhat similar to semantics of parameters in functions. But one does not assign concrete types to type variables, it is the task of a static type checker to find their possible values and warn the user if it cannot find. Example:

def take_first(seq: Sequence[T]) -> T: # a generic function return seq[0]

accumulator = 0 # type: int

accumulator += take_first([1, 2, 3]) # Safe, T deduced to be int accumulator += take_first((2.7, 3.5)) # Unsafe

Type variables are used extensively in type annotations, also internal machinery of the type inference in type checkers is typically build on type variables. Therefore, let us consider them in detail.

Type variables

X = TypeVar('X') declares a unique type variable. The name must match the variable name. By default, a type variable ranges over all possible types. Example:

def do_nothing(one_arg: T, other_arg: T) -> None: pass

do_nothing(1, 2) # OK, T is int do_nothing('abc', UserID(42)) # also OK, T is object

Y = TypeVar('Y', t1, t2, ...). Ditto, constrained to t1, etc. Behaves similar to Union[t1, t2, ...]. A constrained type variable ranges only over constrains t1, etc. exactly; subclasses of the constrains are replaced by the most-derived base class among t1, etc. Examples:

Defining and using generic types

Users can declare their classes as generic types using the special building block Generic. The definitionclass MyGeneric(Generic[X, Y, ...]): ... defines a generic typeMyGeneric over type variables X, etc. MyGeneric itself becomes parameterizable, e.g. MyGeneric[int, str, ...] is a specific type with substitutions X -> int, etc. Example:

class CustomQueue(Generic[T]):

def put(self, task: T) -> None:
    ...
def get(self) -> T:
    ...

def communicate(queue: CustomQueue[str]) -> Optional[str]: ...

Classes that derive from generic types become generic. A class can subclass multiple generic types. However, classes derived from specific types returned by generics are not generic. Examples:

class TodoList(Iterable[T], Container[T]): def check(self, item: T) -> None: ...

def check_all(todo: TodoList[T]) -> None: # TodoList is generic ...

class URLList(Iterable[bytes]): def scrape_all(self) -> None: ...

def search(urls: URLList) -> Optional[bytes] # URLList is not generic ...

Subclassing a generic type imposes the subtype relation on the corresponding specific types, so that TodoList[t1] is a subtype of Iterable[t1]in the above example.

Generic types can be specialized (indexed) in several steps. Every type variable could be substituted by a specific type or by another generic type. If Generic appears in the base class list, then it should contain all type variables, and the order of type parameters is determined by the order in which they appear in Generic. Examples:

Table = Dict[int, T] # Table is generic Messages = Table[bytes] # Same as Dict[int, bytes]

class BaseGeneric(Generic[T, S]): ...

class DerivedGeneric(BaseGeneric[int, T]): # DerivedGeneric has one parameter ...

SpecificType = DerivedGeneric[int] # OK

class MyDictView(Generic[S, T, U], Iterable[Tuple[U, T]]): ...

Example = MyDictView[list, int, str] # S -> list, T -> int, U -> str

If a generic type appears in a type annotation with a type variable omitted, it is assumed to be Any. Such form could be used as a fallback to dynamic typing and is allowed for use with issubclassand isinstance. All type information in instances is erased at runtime. Examples:

def count(seq: Sequence) -> int: # Same as Sequence[Any] ...

class FrameworkBase(Generic[S, T]): ...

class UserClass: ...

issubclass(UserClass, FrameworkBase) # This is OK

class Node(Generic[T]): ...

IntNode = Node[int] my_node = IntNode() # at runtime my_node.class is Node # inferred static type of my_node is Node[int]

Covariance and Contravariance

If t2 is a subtype of t1, then a generic type constructor GenType is called:

To better understand this definition, let us make an analogy with ordinary functions. Assume that we have:

def cov(x: float) -> float: return 2*x

def contra(x: float) -> float: return -x

def inv(x: float) -> float: return x*x

If x1 < x2, then always cov(x1) < cov(x2), andcontra(x2) < contra(x1), while nothing could be said about inv. Replacing < with is-subtype-of, and functions with generic type constructor we get examples of covariant, contravariant, and invariant behavior. Let us now consider practical examples:

One of the best examples to illustrate (somewhat counterintuitive) contravariant behavior is the callable type. It is covariant in the return type, but contravariant in the arguments. For two callable types that differ only in the return type, the subtype relationship for the callable types follows that of the return types. Examples:

While for two callable types that differ only in the type of one argument, the subtype relationship for the callable types goes in the opposite direction as for the argument types. Examples:

Yes, you read that right. Indeed, if a function that can calculate the salary for a manager is expected:

def calculate_all(lst: List[Manager], salary: Callable[[Manager], Decimal]): ...

then Callable[[Employee], Decimal] that can calculate a salary for any employee is also acceptable.

The example with Callable shows how to make more precise type annotations for functions: choose the most general type for every argument, and the most specific type for the return value.

It is possible to declare the variance for user defined generic types by using special keywords covariant and contravariant in the definition of type variables used as parameters. Types are invariant by default. Examples:

T = TypeVar('T') T_co = TypeVar('T_co', covariant=True) T_contra = TypeVar('T_contra', contravariant=True)

class LinkedList(Generic[T]): # invariant by default ... def append(self, element: T) -> None: ...

class Box(Generic[T_co]): # this type is declared covariant def init(self, content: T_co) -> None: self._content = content def get_content(self) -> T_co: return self._content

class Sink(Generic[T_contra]): # this type is declared contravariant def send_to_nowhere(self, data: T_contra) -> None: with open(os.devnull, 'w') as devnull: print(data, file=devnull)

Note, that although the variance is defined via type variables, it is not a property of type variables, but a property of generic types. In complex definitions of derived generics, variance _only_determined from type variables used. A complex example:

T_co = TypeVar('T_co', Employee, Manager, covariant=True) T_contra = TypeVar('T_contra', Employee, Manager, contravariant=True)

class Base(Generic[T_contra]): ...

class Derived(Base[T_co]): ...

A type checker finds from the second declaration that Derived[Manager]is a subtype of Derived[Employee], and Derived[t1]is a subtype of Base[t1]. If we denote the is-subtype-of relationship with <, then the full diagram of subtyping for this case will be:

Base[Manager] > Base[Employee] v v Derived[Manager] < Derived[Employee]

so that a type checker will also find that, e.g., Derived[Manager] is a subtype of Base[Employee].

For more information on type variables, generic types, and variance, see PEP 484, the mypy docs on generics, and Wikipedia.

Pragmatics

Some things are irrelevant to the theory but make practical use more convenient. (This is not a full list; I probably missed a few and some are still controversial or not fully specified.)

Predefined generic types and Protocols in typing.py

(See also the typing.py module.)

This document is licensed under the Open Publication License.

References and Footnotes