[Python-3000] Need help completing ABC pep (original) (raw)
Guido van Rossum guido at python.org
Sat Apr 21 00:24:26 CEST 2007
- Previous message: [Python-3000] Need help completing ABC pep
- Next message: [Python-3000] Need help completing ABC pep
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On 4/20/07, Jeffrey Yasskin <jyasskin at gmail.com> wrote:
On 4/19/07, Guido van Rossum <guido at python.org> wrote: > I've started a PEP on Abstract Base Classes (ABCs), PEP 3119: > > http://www.python.org/dev/peps/pep-3119/ > > While I'm not ready yet to answer tough questions about this compared > to alternative proposals, I am ready for feedback on the various > open issues sprinkled throughout the current text, especially > concerning decisions regarding exactly which operations to include in > the various ABCs, and also regarding the level of detail required in > the PEP.
I have a bunch of comments that aren't all related to the open issues. I'm more familiar with statically typed languages, so sorry if some of these comments don't make sense in a Python context.
Thanks for the feedback! You're doing fine. :-)
Hashable might want to specify the exception that should be thrown for mutable objects.
Thanks -- done.
I wouldn't have guessed what Finite was from the name, but I don't mind it now that I know. I do prefer 'Sized'.
Sized it is.
A possible name for the contains ABC for sequences and strings is 'Searchable'. On the other hand, an annotation of some sort for the running time of the method might do as well, but I don't know of any prior art for that.
I've added the name suggestion to the open issues; I like it. I'm not sure I want to break new ground specifying running time; it's a tricky business.
On Sets: Do you have an example of a data structure that is a Set but not a ComposableSet?
Take for example a set that is a contiguous range of integers, represented by a (lo, hi) tuple. In many cases the return type of union and symmetric difference cannot be represented as an object of the same type. I don't want to force the implementer of such a set to provide implementations for operations they don't need.
A different POV could be to move the concrete composition operations
back into Set and let the default implementations return built-in
set
instances. I'm not sure how useful that is.
Set should inherit from a PartiallyOrdered ABC that defines the comparison operators.
I guess then there would also be a TotallyOrdered ABC deriving from PartiallyOrdered which adds no operations but does add a constraint. I've added both, and made Set derive from PartiallyOrdered.
The final specification of Set should spell out the invariants, but they're clear enough that the PEP probably doesn't need to. It's odd, though, that you mention optimizations to eq and le but not their semantics.
The semantics are kind of obvious. :-)
Text spelling out the invariants in more detail that I can just paste into the PEP is most welcome!
ComposableSet's open issues: > Should we define an API for creating new instances (e.g. a class method or a fixed constructor signature)?
There are some set types that can't be constructed from arbitrary sequences. For example, Patricia Tries (http://www.haskell.org/ghc/docs/latest/html/libraries/base/Data-IntSet.html) can be used to make very efficient sets of integers, but only integers.
Right. That's why I'm separating ComposableSet from Set.
> Should we just pick a concrete return type (e.g. set)?
For set compositions, you should say whether the arguments need to be uniform. (Is (IntSet(1,2,3) | set('a','b','c')) allowed?) If they do, then you don't need a concrete return type; if they don't, then you do, and the default implementations should use it.
You may be misunderstanding what I meant. I certainly don't want to constrain the potential return type of the composition operations (except that they should themselves be composable sets). The question about picking a specific return type was meant as a way out to providing a concrete default implementation. I'm still on the fence of this, as it could be a big waste of time (I imagine that composing Patricia trees can be done more efficiently than iterating over the elements).
Even this may be a problem if someone writes a TreeSet which requires TotallyOrdered-but-not-Hashable elements. In any case, the ABC should encourage implementors to return an instance of the same type as the arguments when the arguments are of the same type.
Another good argument against having a default concrete implementation that always returns a built-in set.
> Should we add the copy method? Not on ComposableSet, but maybe in a Copyable ABC, which it might inherit from.
I'd rather leave copying APIs out of this completely. There's a separate copying protocol in Python already, you can define copy() for a shallow copy and deepcopy() for a deep copy (copies the elements recursively). This is also tied into pickling (serialization).
MutableSet: > Should we support more operations implemented by the Python 2 set type?
You have groups of equivalent operations (add, update, ior), (discard, differenceupdate, isub), (??, intersectionupdate, iand), and (??, symmetricdifferenceupdate, ixor) It would be nice to make these all uniform. Intersection is hard to support one element at a time, but symmetric difference would work with a toggle() method.
Cool, I've added toggle(). FWIW, I've remoevd the named methods (update etc.) in favor of stretching the accepted args for the operators (ior etc.).
The operations that work a group at a time are likely be more efficient on Sized arguments, and much more efficient on arguments of the same type. How much are you trying to support these kinds of optimizations to implementations?
Haven't thought much about it. Implementations can just override the concrete methods if they have a faster way, and fall back on the concrete base class methods if their faster way doesn't apply for a particular argument.
> Should we unify remove and discard, a la Java (which has a single method returning a boolean indicating whether it was removed or not)?
Yes. :)
Done.
Mappings: Set-like unions, intersections, and differences are useful on maps too, but you usually have to provide a function to say what to do with colliding values. This is probably a different PEP though.
Right. My dream is to unify the set and dict types, solving the problem of how to specify an empty set ({} would do it), but this was rejected by Raymond Hettinger for valid pragmatic reasons.
Sequence: I guess you've already dealt with this, but allowing getitem with integer indices on str objects forces you to use UTF-32 or play interesting games with caching. Most string algorithms are quite happy with (bidirectional? multipass?) iterators, but programmers aren't used to using them.
Yeah, we've had our fill of discussions about alternative implementations. I want the semantics to be those of UTF-32 or UTF-16, and the initial implementation will use that, since these semantics are most familiar to Python users. At some point we may have competing implementations.
Numbers:
Would you mind helping to write a PEP for numbers? I think I'm running out of round tuits just for the collections and the ABC infrastructure.
Haskell might be good to use for inspiration on the numeric classes: http://www.haskell.org/ghc/docs/latest/html/libraries/base/Prelude.html#7. Note that Haskell's "Real" doesn't imply that the number is fractional, just that it's not complex. Some differences I see from Bill Janssen's proposal: * A Haskell Num is constructible from any Integer (python's long). * A Haskell Fractional (including Complex) is constructible from any Rational (which can be constructed from a float) * The operations are spread into more classes. I think that primarily helps Complex and Fixed-point types fit into the hierarchy better. The Haskell folks actually complain that the operations aren't spread into enough classes. They mostly want more group-theory related classes below Num. * The bitwise operators are segregated into a Bits class, but I don't see any benefit to that.
Right. Python has a long tradition of using ints (and longs) for that.
* In a possible use for Cardinal, Integral**Cardinal=>Integral, FractionalIntegral=>Fractional, and FloatingFloating=>Floating.
That can be (and is currently) done dynamically (see below).
* Complex isn't Orderable
Fortress (http://research.sun.com/projects/plrg/fortress.pdf#page=270) has an extremely detailed hierarchy of algebraic traits, and it looks like they plan a similarly detailed hierarchy of numbers, but the numbers aren't yet specified beyond integers and rationals. Cardinal is also useful as the argument to (Sequence * Cardinal).
I think there's little value for this in a dynamically typed language; the implementation can raise ValueError or treat negative numbers as zero (which is what Python currently does).
Index (Ix) is used in Haskell primarily in the form of tuples of Ints for multi-dimensional arrays.
Right, that's where we use it too (especially the NumPy community).
-- --Guido van Rossum (home page: http://www.python.org/~guido/)
- Previous message: [Python-3000] Need help completing ABC pep
- Next message: [Python-3000] Need help completing ABC pep
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]