[Python-checkins] r54871 - peps/trunk/pep-3119.txt (original) (raw)

guido.van.rossum python-checkins at python.org
Thu Apr 19 02:21:53 CEST 2007


Author: guido.van.rossum Date: Thu Apr 19 02:21:49 2007 New Revision: 54871

Modified: peps/trunk/pep-3119.txt Log: Add lots of text. Still far from complete.

Modified: peps/trunk/pep-3119.txt

--- peps/trunk/pep-3119.txt (original) +++ peps/trunk/pep-3119.txt Thu Apr 19 02:21:49 2007 @@ -7,7 +7,7 @@ Type: Standards Track Content-Type: text/x-rst Created: 18-Apr-2007 -Post-History: 18-Apr-2007 +Post-History: Not yet posted Abstract @@ -145,16 +145,266 @@ The exception raised when attempting to instantiate an abstract class. It derives from TypeError. +A possible implementation would add an attribute +__abstractmethod__ to any method declared with +@abstractmethod, and add the names of all such abstract methods to +a class attribute named __abstractmethods__. Then the +Abstract.__new__() method would raise an exception if any abstract +methods exist on the class being instantiated. For details see [2]_. + ABCs for Containers and Iterators

-XXX define: Iterable, Iterator, Sizeable (or Lengthy?), various Sets, -Mappings, Sequences. Also Hashable so we can define Immutable (which -adds to "read-only", does not subtract from "mutable", nor does -"mutable" add to "hashable" -- it adds to "read-only"). Also defines -how set, frozenset, list, tuple, dict, bytes and str derive from -these. +The collections module will define ABCs necessary and sufficient to +work with sets, mappings, sequences, and some helper types such as +iterators and dictionary views. + +The ABCs provide implementations of their abstract methods that are +technically valid but fairly useless; e.g. __hash__ returns 0, and +__iter__ returns an empty iterator. In general, the abstract +methods represent the behavior of an empty container of the indicated +type. + +Some ABCs also provide concrete (i.e. non-abstract) methods; for +example, the Iterator class has an __iter__ method returning +itself, fulfilling an important invariant of iterators (which in +Python 2 has to be implemented anew by each iterator class). + +No ABCs override __init__, __new__, __str__ or +__repr__. + +XXX define how set, frozenset, list, tuple, dict, bytes and str derive +from these. + + +One Trick Ponies +'''''''''''''''' + +These abstract classes represent single methods like __iter__ or +__len__. + +Hashable + The base class for classes defining __hash__. Its abstract + __hash__ method always returns 0, which is a valid (albeit + inefficient) implementation. Note: being an instance of this + class does not imply that an object is immutable; e.g. a tuple + containing a list as a member is not immutable; its __hash__ + method raises an exception. + +Iterable + The base class for classes defining __iter__. Its abstract + __iter__ method returns an empty iterator. + +Iterator + The base class for classes defining __next__. This derives + from Iterable. Its abstract __next__ method raises + StopIteration. Its __iter__ method returns self, and is + not abstract. + +Lengthy + The base class for classes defining __len__. Its abstract + __len__ method returns 0. (The name is perhaps too cute; but + the only alternatives I've come up with so far are Sizeable, + which suffers from the same affliction, and Finite, which + somehow is associated with numbers instead of sets in my mind.) + +Container + The base class for classes defining __contains__`. Its abstract + __contains__method returnsFalse. Note: strictly + speaking, there are three variants of this method's semantics. + The first one is for sets and mappings, which is fast: O(1) or + O(log N). The second one is for membership checking on sequences, + which is slow: O(N). The third one is for subsequence checking on + (character or byte) strings, which is also slow: O(N). Would it + make sense to distinguish these? The signature of the third + variant is different, since it takes a sequence (typically of the + same type as the method's target) intead of an element. For now, + I'm using the same type for all three. + + +Sets +'''' + +These abstract classes represent various stages of "set-ness". + +Set+ This is a finite, iterable container, i.e. a subclass of + Lengthy, IterableandContainer. Not every subset of + those three classes is a set though! Sets have the additional + property (though it is not expressed in code) that each element + occurs only once (as can be determined by iteration), and in + addition sets implement rich comparisons defined as + subclass/superclass tests. + + Sets with different implementations can be compared safely, + efficiently and correctly. Because Setderives from + Lengthy, __eq__takes a shortcut and returnsFalse+ immediately if two sets of unequal length are compared. + Similarly,__le__returnsFalseimmediately if the first + set has more members than the second set. Note that set inclusion + implements only a partial ordering; e.g. {1, 2} and {1, 3} are not + ordered (all three of<, ==and>returnFalsefor + these arguments). Sets cannot be ordered relative to mappings or + sequences, but they can be compared for equality (and then they + always compare unequal). + + XXX Should we also implement theissubsetandissuperset+ methods found on the set type in Python 2 (which are apparently + just aliases for__le__and__ge__)? + + XXX Should this class also implement union, intersection, + symmetric and asymmetric difference and/or the corresponding + operators? The problem with those (unlike the comparison + operators) is what should be the type of the return value. I'm + tentatively leaving these out -- if you need them, you can test + for a Setinstance that implements e.g.__and__. Some + alternatives: make these abstract methods (even though the + semantics apart from the type are well-defined); or make them + concrete methods that return a specific concrete set type; or make + them concrete methods that assume the class constructor always + accepts an iterable of elements; or add a new class method that + accepts an iterable of elements and that creates a new instance. + (I originally considered a "view" alternative, but the problem is + that computing len(a&b)requires iterating overaor + b, and that pretty much kills the idea.) + +HashableSet+ This is a subclass of bothSetandHashable. It + implements a concrete __hash__ method that subclasses should + not override; or if they do, the subclass should compute the same + hash value. This is so that sets with different implementations + still hash to the same value, so they can be used interchangeably + as dictionary keys. (A similar constraint exists on the hash + values for different types of numbers and strings.) + +MutableSet+ + This is a subclass ofSetimplementing additional operations + to add and remove elements. The supported methods have the + semantics known from thesettype in Python 2: + + .add(x)+ Abstract method that adds the elementx, if it isn't + already in the set. + + .remove(x)+ + Abstract method that removes the elementx; raises + KeyErrorifxis not in the set. + + .discard(x)+ Concrete method that removes the elementxif it is + a member of the set; implemented using__contains__+ andremove. + + .clear()+ Abstract method that empties the set. (Making this concrete + would just add a slow, cumbersome default implementation.) + + XXX Should we support all the operations implemented by the Python + 2set type? I.e. union, update, __or__, __ror__, __ior__, + intersection, intersection_update, __and__, __rand__, __iand__, + difference, difference_update, __xor__, __rxor__, __ixor__, + symmetric_difference, symmetric_difference_update, __sub__, + __rsub__, __isub__. Note that in Python 2, a.update(b) is not + exactly the same as a |= b, since update() takes any iterable for + an argument, while |= requires another set; similar for the other + operators. + + +Mappings +'''''''' + +These abstract classes represent various stages of mapping-ness. + +XXX Do we need BasicMapping and IterableMapping? Perhaps we should +just start with Mapping. + +BasicMapping+ A subclass ofContainerdefining the following methods: + + .__getitem__(key)+ Abstract method that returns the value corresponding to + key, or raises KeyError. The implementation always + raises KeyError. + + .get(key, default=None)+ Concrete method returningself[key]if this does not raise + KeyError, and the defaultvalue if it does. + + .__contains__()+ Concrete method returningTrueifself[key]does not + raiseKeyError, and False if it does. + + +IterableMapping+ A subclass ofBasicMappingandIterable. It defines no + new methods. Iterating over such an object should return all the + valid keys (i.e. those keys for which .__getitem__() returns a + value), once each, and nothing else. It is possible that the + iteration never ends. + +Mapping+ A subclass ofIterableMappingandLengthy. It defines + concrete methods __eq__, keys, items, values. The + lengh of such an object should equal to the number of elements + returned by iterating over the object until the end of the + iterator is reached. Two mappings, even with different + implementations, can be compared for equality, and are considered + equal if and only iff their items compare equal when converted to + sets. The keys, itemsandvaluesmethods return + views;keysanditemsreturnSetviews,values+ returns aContainer view. The following invariant should + hold: m.items() == set(zip(m.keys(), m.values())). + +HashableMapping+ A subclass ofMappingandHashable. The values should be + instances of Hashable. The concrete __hash__method + should be equal tohash(m.items()). + +MutableMapping+ A subclass ofMappingthat also implements some standard + mutating methods. At least__setitem__, __delitem__, + clear, update. XXX Also pop, popitem, setdefault? + +XXX Should probably say something about mapping view types, too. + +Sequences +''''''''' + +These abstract classes represent various stages of sequence-ness. + +Sequence+ A subclass ofIterable, Lengthy, Container. It + defines a new abstract method __getitem__that has a + complicated signature: when called with an integer, it returns an + element of the sequence or raisesIndexError; when called with + a sliceobject, it returns anotherSequence. The concrete + __iter__method iterates over the elements using + __getitem__with integer arguments 0, 1, and so on, until + IndexErroris raised. The length should be equal to the + number of values returned by the iterator. + + XXX Other candidate methods, which can all have default concrete + implementations that only depend on__len__and + __getitem__ with an integer argument: __reversed__, index, + count, __add__, __mul__, __eq__, __lt__, __le__. + +HashableSequence+ A subclass ofSequenceandHashable. The concrete + __hash__ method should implements the hashing algorithms used + by tuples in Python 2. + +MutableSequence+ A subclass ofSequenceadding some standard mutating methods. + Abstract mutating methods:__setitem__(for integer indices as + well as slices),__delitem__(ditto),insert, add, + reverse. Concrete mutating methods: extend, pop, + remove. Note: this does not define sort()-- that is only + required to exist on genuinelist`` instances. + + XXX What about += and *=? ABCs for Numbers @@ -168,10 +418,11 @@ Guidelines for Writing ABCs

-XXX E.g. use @abstractmethod and Abstract base class; define abstract +XXX Use @abstractmethod and Abstract base class; define abstract methods that could be useful as an end point when called via a super -chain; keep abstract classes small, one per use case instead of one -per concept. +chain; define concrete methods that are very simple permutations of +abstract methods (e.g. Mapping.get); keep abstract classes small, one +per use case instead of one per concept.

References



More information about the Python-checkins mailing list