[Python-Dev] PEP 560

Ivan Levkivskyi levkivskyi at gmail.com
Tue Nov 14 15:26:20 EST 2017


After some discussion on python-ideas, see https://mail.python.org/pipermail/python-ideas/2017-September/047220.html, this PEP received positive comments. The updated version that takes into account the comments that appeared in the discussion so far is available at https://www.python.org/dev/peps/pep-0560/

Here I post the full text for convenience:

++++++++++++++++++++++++++

PEP: 560
Title: Core support for typing module and generic types
Author: Ivan Levkivskyi <levkivskyi at gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 03-Sep-2017
Python-Version: 3.7
Post-History: 09-Sep-2017

Abstract

Initially PEP 484 was designed in such a way that it would not introduce any changes to the core CPython interpreter. Now type hints and the typing module are extensively used by the community, e.g. PEP 526 and PEP 557 extend the usage of type hints, and the backport of typing on PyPI has 1M downloads/month. Therefore, this restriction can be removed. It is proposed to add two special methods, __class_getitem__ and __mro_entries__, to the core CPython for better support of generic types.

Rationale

The restriction to not modify the core CPython interpreter led to some design decisions that became questionable when the typing module started to be widely used. There are three main points of concern: performance of the typing module, metaclass conflicts, and the large number of hacks currently used in typing.

Performance

The typing module is one of the heaviest and slowest modules in the standard library, even with all the optimizations made. This is mainly because subscripted generic types (see PEP 484 for the definition of terms used in this PEP) are class objects (see also [1]_). There are three main ways the performance can be improved with the help of the proposed special methods:

Metaclass conflicts

All generic types are instances of GenericMeta, so if a user uses a custom metaclass, then it is hard to make a corresponding class generic. This is particularly hard for library classes that a user doesn't control. A workaround is to always mix-in GenericMeta::

  class AdHocMeta(GenericMeta, LibraryMeta):
      pass

  class UserClass(LibraryBase, Generic[T], metaclass=AdHocMeta):
      ...

but this is not always practical or even possible. With the help of the proposed special attributes the GenericMeta metaclass will not be needed.
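The conflict and the mix-in workaround described above can be reproduced without typing at all. The following is only a sketch: ``LibraryMeta``, ``GenericMetaLike``, ``LibraryBase`` and ``GenericLike`` are hypothetical stand-ins, not names from typing or from any real library.

```python
class LibraryMeta(type):
    """Hypothetical metaclass owned by a third-party library."""


class GenericMetaLike(type):
    """Hypothetical stand-in for typing's GenericMeta."""


class LibraryBase(metaclass=LibraryMeta):
    pass


class GenericLike(metaclass=GenericMetaLike):
    pass


def try_plain_mix():
    """Combine both bases without a common metaclass; return the error text."""
    try:
        class UserClass(LibraryBase, GenericLike):
            pass
    except TypeError as exc:
        return str(exc)
    return None


conflict_message = try_plain_mix()


# The workaround: an ad hoc metaclass that derives from both metaclasses.
class AdHocMeta(GenericMetaLike, LibraryMeta):
    pass


class UserClass(LibraryBase, GenericLike, metaclass=AdHocMeta):
    pass
```

The workaround only helps when the two metaclasses are themselves compatible, which is the part a user of a library cannot always arrange.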

Hacks and bugs that will be removed by this proposal

Specification

__class_getitem__

The idea of __class_getitem__ is simple: it is an exact analog of __getitem__, with the exception that it is called on a class that defines it, not on its instances. This allows us to avoid GenericMeta.__getitem__ for things like Iterable[int]. The __class_getitem__ method is automatically a class method and does not require the @classmethod decorator (similar to __init_subclass__), and it is inherited like normal attributes. For example::

  class MyList:
      def __getitem__(self, index):
          return index + 1
      def __class_getitem__(cls, item):
          return f"{cls.__name__}[{item.__name__}]"

  class MyOtherList(MyList):
      pass

  assert MyList()[0] == 1
  assert MyList[int] == "MyList[int]"

  assert MyOtherList()[0] == 1
  assert MyOtherList[int] == "MyOtherList[int]"

Note that this method is used as a fallback, so if a metaclass defines __getitem__, then that will have the priority.
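The fallback rule can be checked directly. This is a small sketch that assumes an interpreter implementing this proposal (CPython 3.7 or later); the class names are made up for the illustration.

```python
class SubscriptMeta(type):
    """Hypothetical metaclass that defines its own __getitem__."""

    def __getitem__(cls, item):
        return f"meta:{item.__name__}"


class ClassOnly:
    # No metaclass __getitem__, so subscripting the class lands here.
    def __class_getitem__(cls, item):
        return f"class:{item.__name__}"


class Both(metaclass=SubscriptMeta):
    # Never reached by Both[...]: the metaclass __getitem__ has priority.
    def __class_getitem__(cls, item):
        return f"class:{item.__name__}"


class_only_result = ClassOnly[int]
both_result = Both[int]

# __class_getitem__ applies to the class only, not to its instances.
try:
    ClassOnly()[0]
except TypeError:
    instance_subscript_failed = True
else:
    instance_subscript_failed = False
```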

__mro_entries__

If an object that is not a class object appears in the bases of a class definition, then __mro_entries__ is searched on it. If found, it is called with the original tuple of bases as an argument. The result of the call must be a tuple, which is unpacked into the base classes in place of this object. (If the tuple is empty, this means that the original base is simply discarded.) Using the method API instead of just an attribute is necessary to avoid inconsistent MRO errors, and to perform other manipulations that are currently done by GenericMeta.__new__. After creating the class, the original bases are saved in __orig_bases__ (currently this is also done by the metaclass). For example::

  class GenericAlias:
      def __init__(self, origin, item):
          self.origin = origin
          self.item = item
      def __mro_entries__(self, bases):
          return (self.origin,)

  class NewList:
      def __class_getitem__(cls, item):
          return GenericAlias(cls, item)

  class Tokens(NewList[int]):
      ...

  assert Tokens.__bases__ == (NewList,)
  assert Tokens.__orig_bases__ == (NewList[int],)
  assert Tokens.__mro__ == (Tokens, NewList, object)
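The empty-tuple case mentioned above can be sketched the same way, assuming an interpreter that implements this proposal (CPython 3.7 or later). ``Marker``, ``Alias``, ``Base`` and ``Child`` are hypothetical names. Each non-class object is bound to a name before use, because plain objects like these compare by identity, so a freshly created alias would not compare equal to the one stored in ``__orig_bases__``.

```python
class Marker:
    """Hypothetical non-class object that removes itself from the bases."""

    def __mro_entries__(self, bases):
        # An empty tuple means this entry is simply discarded.
        return ()


class Alias:
    """Hypothetical alias that stands in for a real class."""

    def __init__(self, origin):
        self.origin = origin

    def __mro_entries__(self, bases):
        # `bases` is the original tuple from the class statement.
        return (self.origin,)


class Base:
    pass


marker = Marker()
alias = Alias(Base)


class Child(alias, marker):
    pass


resolved_bases = Child.__bases__
original_bases = Child.__orig_bases__
```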

NOTE: These two method names are reserved for use by the typing module and the generic types machinery, and any other use is discouraged. The reference implementation (with tests) can be found in [4]_, and the proposal was originally posted and discussed on the typing tracker, see [5]_.

Dynamic class creation and types.resolve_bases

type.__new__ will not perform any MRO entry resolution, so a direct call type('Tokens', (List[int],), {}) will fail. This is done for performance reasons and to minimize the number of implicit transformations. Instead, a helper function resolve_bases will be added to the types module to allow an explicit __mro_entries__ resolution in the context of dynamic class creation. Correspondingly, types.new_class will be updated to reflect the new class creation steps while maintaining backwards compatibility::

  def new_class(name, bases=(), kwds=None, exec_body=None):
      resolved_bases = resolve_bases(bases)  # This step is added
      meta, ns, kwds = prepare_class(name, resolved_bases, kwds)
      if exec_body is not None:
          exec_body(ns)
      cls = meta(name, resolved_bases, ns, **kwds)
      cls.__orig_bases__ = bases  # This step is added
      return cls
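A usage sketch for dynamic class creation follows, assuming an interpreter where ``types.resolve_bases`` is available as proposed here (CPython 3.7 or later). ``Alias`` and ``Base`` are hypothetical helpers defined only for this illustration.

```python
import types


class Alias:
    """Hypothetical non-class object that resolves to a real class."""

    def __init__(self, origin):
        self.origin = origin

    def __mro_entries__(self, bases):
        return (self.origin,)


class Base:
    pass


alias = Alias(Base)

# Explicit resolution of __mro_entries__ for a tuple of bases.
resolved = types.resolve_bases((alias,))

# A direct call to type() performs no MRO entry resolution and fails.
try:
    type("Tokens", (alias,), {})
except TypeError:
    direct_call_failed = True
else:
    direct_call_failed = False

# types.new_class resolves the bases and records the original ones.
Tokens = types.new_class("Tokens", (alias,))
```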

Backwards compatibility and impact on users who don't use typing

This proposal may break code that currently uses the names __class_getitem__ and __mro_entries__. (But the language reference explicitly reserves all undocumented dunder names, and allows "breakage without warning"; see [6]_.)

This proposal will support almost complete backwards compatibility with the current public generic types API; moreover the typing module is still provisional. The only two exceptions are that currently issubclass(List[int], List) returns True, while with this proposal it will raise TypeError, and repr() of unsubscripted user-defined generics cannot be tweaked and will coincide with repr() of normal (non-generic) classes.
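The first of these two exceptions can be observed with a short check. This is a sketch that assumes a typing module built on this proposal (Python 3.7 or later); the helper name is made up for the illustration.

```python
from typing import List


def subscripted_check_raises():
    """Return True if issubclass() rejects a subscripted generic."""
    try:
        issubclass(List[int], List)
    except TypeError:
        return True
    return False


raises_type_error = subscripted_check_raises()

# Checks against the unsubscripted generic keep working for plain classes.
plain_check = issubclass(list, List)
```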

With the reference implementation I measured negligible performance effects (under 1% on a micro-benchmark) for regular (non-generic) classes. At the same time performance of generics is significantly improved:

References

.. [1] Discussion following Mark Shannon's presentation at Language Summit (https://github.com/python/typing/issues/432)

.. [2] Pull Request to implement shared generic ABC caches (merged) (https://github.com/python/typing/pull/383)

.. [3] An old bug with setting/accessing attributes on generic types (https://github.com/python/typing/issues/392)

.. [4] The reference implementation (https://github.com/ilevkivskyi/cpython/pull/2/files, https://github.com/ilevkivskyi/cpython/tree/new-typing)

.. [5] Original proposal (https://github.com/python/typing/issues/468)

.. [6] Reserved classes of identifiers (https://docs.python.org/3/reference/lexical_analysis.html#reserved-classes-of-identifiers)

Copyright

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:


