[Python-3000] Need help completing ABC pep (original) (raw)

Jeffrey Yasskin jyasskin at gmail.com
Sun Apr 22 10:12:33 CEST 2007


On 4/21/07, Guido van Rossum <guido at python.org> wrote:

On 4/21/07, Jeffrey Yasskin <jyasskin at gmail.com> wrote: > PartiallyOrdered: > This ABC defines the 5 inequality operations <, <=, >=, >, and cmp().

Actually, I'm hoping to eliminate cmp() completely. Also, even if it doesn't get eliminated, the existence of cmp() suggests a total ordering.

Ok. It's more efficient when comparing large structures, but it's not absolutely needed. In some languages it can be more convenient too, but that's because you take advantage of their case statements.

> Subclasses of PartiallyOrdered must define one of cmp() or <=.

Actually, Python typically uses < as the single comparison operator that must be defined (e.g. list.sort() uses < exclusively).

With only a partial order, you need both < and == to define the other operators, or you can do it with just <=. A total order allows you to do it with either < or <= alone. sort(), at least in C++, needs at least a "strict weak ordering" (http://www.sgi.com/tech/stl/StrictWeakOrdering.html), which lets you define equivalence but not equality using just <.

> > > The final specification of Set should spell out the invariants, but > > > they're clear enough that the PEP probably doesn't need to. It's odd, > > > though, that you mention optimizations to eq and le but not > > > their semantics. > > > > The semantics are kind of obvious. :-) > > > > Text spelling out the invariants in more detail that I can just paste > > into the PEP is most welcome! > > Bah! ;) > > Set: > Sets should implement an efficient (no worse than O(log N)) > contains operation. Sets are defined by their extensions.

What's an extension?

Oops. It's the collection of objects 'x' for which 'x in set' is true. We can probably just skip that sentence.

> Subclasses may override the following default definitions of eq > and le with more efficient versions, but they must respect the > same semantics: > def eq(a, b): return len(a) == len(b) and all(x in a for x in b) > def le(a, b): return len(a) <= len(b) and all(x in a for x in b)_ _> > ComposableSet: > ... mathematical meaning of set composition: > Where a and b are instances of ComposableSet and op(a,b) is defined: > x in (a | b) ↔ x in a or x in b > x in (a & b) ↔ x in a and x in b > x in (a ^ b) ↔ x in a != x in b > x in (a - b) ↔ x in a and not x in b > > > > ComposableSet's open issues: > > > > Should we just pick a concrete return type (e.g. set)? > > > > > > For set compositions, you should say whether the arguments need to be > > > uniform. (Is (IntSet(1,2,3) | set('a','b','c')) allowed?) If they do, > > > then you don't need a concrete return type; if they don't, then you > > > do, and the default implementations should use it. > > > > You may be misunderstanding what I meant. I certainly don't want to > > constrain the potential return type of the composition operations > > (except that they should themselves be composable sets). The question > > about picking a specific return type was meant as a way out to > > providing a concrete default implementation. I'm still on the fence of > > this, as it could be a big waste of time (I imagine that composing > > Patricia trees can be done more efficiently than iterating over the > > elements). > > Borrowing some syntax from Java generics, which of: > ComposableSet or(ComposableSet a, ComposableSet b); > ComposableSet or(CS a, CS b); > CS or(CS a, CS b); > do you mean to define?

Can't say I understand that. The minimum requirement is that the return type is a subclass of ComposableSet. Some subclasses CS may themselves guarantee that the result is always a subclass of CS, but I don't want to force that. Returning a 'set' in the default implementation satisfies the former but not the latter. This is in itself fine (CS can override or to return a CS). But it could be expensive -- even if the author of CS doesn't care about the return type being a CS itself, they might still care about th efficiency.

I think you want my first option then. How about:

"The default implementation ensures that or, and, sub, and xor do the right thing for any ComposableSet with Hashable elements, but because it iterates over both arguments and returns a built-in set, it may be slow. Subclasses should optimize these operations for types they know about but continue to support heterogenous operations by delegating to ComposableSet.

"This should be rare: If you define a ComposableSet that can hold non-Hashable elements, your users should expect that they can compose sets with identical types, but not necessarily sets with different types, unless, of course, you give them extra guarantees."

> I think they have different implications about > what defaults you need to provide. The second is probably out because > it would mean (a|b|c) would likely break. This gets even more complex > if we consider the types of the contents of the sets... Here's a > possible overloaded signature for or. (Do you already have a > better notation for this?)

I think this is more appropriate for other languages.

Having the compiler check the signature is definitely more appropriate for other languages, or at least a third-party module. Knowing what types we handle as arguments and what that type the result can be is important for any language. My notation is, of course, probably not the best we could come up with. :)

-- Namasté, Jeffrey Yasskin



More information about the Python-3000 mailing list