[Python-Dev] perplexed by mro (original) (raw)

Guido van Rossum guido@python.org
Thu, 03 Oct 2002 18:03:45 -0400


I'm trying to wrap my head around the mro computation in 2.2.

Thanks for doing this! You've stumbled upon a place where I'm no expert, and where my intuition about the equivalence between algorithms turned out to be wrong.

First issue: PEP 253 and the descrintro document describe an algorithm, typeobject.c implements a rather different one. Descrintro mentions that.

The algorithm described by descrintro, PEP253: "Using the classic lookup rule, construct the list of classes that would be searched, including duplicates. Now for each class that occurs in the list multiple times, remove all occurrences except for the last." which, if I understand correctly, means : do a left-right dfs of the class hierarchy and then discard all but the last occurence of each class. Now descrintro says that the result of this and the real algorithm in typeobject are the same, unless "two given base classes occur in a different order in the inheritance list of two different derived classes, and those derived classes are both inherited by yet another class" But here is an example where the criterium does not apply but still the algorithms' results diverge: >>> cpl.ex5(globals()) class a(object): pass class b(object): pass class c(object): pass class x(a): pass class y(a): pass class z(x,b,y,c): pass >>> cpl.names(z.mro) # just the class names, real 2.2 mro algo ['z', 'x', 'y', 'a', 'b', 'c', 'object'] >>> cpl.names(cpl.keeplast(cpl.lrdfs(z))) # descrintro simple algo ['z', 'x', 'b', 'y', 'a', 'c', 'object']

I wanted to say that the naive algorithm is "better" than the 2.2 algo here because it doesn't reverse the order of b and y given by z's bases. But let's have a look at an example.

Suppose there's a method m that's defined in object, and overridden by class y as well as in b. What is z.m? The 2.2 algo says y.m, the naive algorithm says b.m. Hard to decide which is better. Now let's assume m is also overridden in a, so that y.m overrides a.m overrides object.m. Now if we searched b before y (naive algo), it is arguably strange if the 'super' call chain went from y.m via b.m to a.m.

But I'm not sure how to formalize this intuition.

I was not able to find a new rule. My intuition is that when the result of the real mro has something to do with the dfs, it is sure from the beginning of it not the end, so the relation of the two algos is unclear.

In general I have an hard time trying to predict the result of the real mro algo. Does it a have a list of constraints it enforces/preserves, of the qualities it has?

The book "Putting Metaclasses To Work" that I always quote about this has the algorithm and the rules it tries to enforce. I don't have the book handy here so I can't quote it. I'm pretty sure that I implemented their merge algo correctly, EXCEPT that I don't raise an error or what they call "serious order conflicts". There are no serious order conflicts in this example though.

I have tried to compare it to the algorithms used by Dylan and CLOS and others, described in this paper

http://www.webcom.com/haahr/dylan/linearization-oopsla96.html they are described in terms such as quality criteria and constraints they preserve.

Wish I could spend the time to study and understand all that.

Let's consider this example:

>>> cpl.ex9(globals()) class a(object): pass class b(object): pass class c(object): pass class d(object): pass class e(object): pass class k1(a,b,c): pass class k2(d,b,e): pass class k3(d,a): pass class z(k1,k2,k3): pass >>> cpl.names(z.mro) ['z', 'k1', 'k3', 'a', 'k2', 'd', 'b', 'c', 'e', 'object']

I think this one does have an order conflict, and in that case all bets are off about the book's algo. If you look at k1 and k2, clearly d must come after a; yet k3 wants it after a.

the thing that leaves me most perplexed is that k3 appears before k2. All other algortihms I know would put k2 before k3. E.g. C3 (see paper) ...

>>> cpl.names(cpl.c3cpl(z)) ['z', 'k1', 'k2', 'k3', 'd', 'a', 'b', 'c', 'e', 'object'] or the descrintro simple algo: >>> cpl.names(cpl.keeplast(cpl.lrdfs(z))) ['z', 'k1', 'c', 'k2', 'b', 'e', 'k3', 'd', 'a', 'object'] This means that it does not preserve the local precedence lists (i.e. the inheritance list), it is not monotonic* because otherwise it would put d before a, it seems to preserve "some" aspects of the dfs traversal a,d vs d,a but still it puts k3 before k2. The other hypothesis, is that in this case, the algorithm should fail, because it cannot merge the result of mro(k1) merged with mro(k2) with mro(k3), because the latter is inconsistent with what obtained so far?

Yes. I've so far had a "don't do that then" attitude rather than a "detect all errors" attitude here -- mostly because the order conflict detection code is hairy to write in C.

>>> l=cpl.names(k1.mro) >>> l ['k1', 'a', 'b', 'c', 'object'] >>> r=cpl.names(k2.mro) >>> cpl.conservativemerge(l,r) ... >>> l ['k1', 'a', 'k2', 'd', 'b', 'c', 'e', 'object']

vs. >>> cpl.names(k3.mro) ['k3', 'd', 'a', 'object'] a<d is incosistent with d<a But then this inconsistency rule is different/subtler than the one sketched in descrintro. So can someone (Guido?) summarize the qualities of this rule? is there a paper that describes it? or should I buy the book?

Buy the book. Last time I looked it was on sale (used) at Amazon.com.

regards.

* the mro of a subclass is an extension without re-ordering of the mros of the superclasses.

(I don't think your second post has anything else to which I need to respond, right?)

--Guido van Rossum (home page: http://www.python.org/~guido/)