[Python-Dev] XXX do we need a new policy? (original) (raw)

glyph at divmod.com glyph at divmod.com
Tue Nov 4 11:43:13 CET 2008


On 3 Nov, 11:44 pm, guido at python.org wrote:

On Mon, Nov 3, 2008 at 3:39 PM, Benjamin Peterson <musiccomposition at gmail.com> wrote:

Grepping through Python's sources tells me that we have over 2,000 "XXX" comments.

So, I propose that we adopt a policy similar to Twisted's: All "XXX" comments must have an issue in the tracker and an accompanying link with the source code.

That seems excessively draconian. (...) personally, I don't see XXX comments necessarily as "to be resolved" -- merely as flags for someone perusing the code looking to change it or digging for the cause of some problem to pay special attention.

For what it's worth, in my experience this is one of the least draconian areas of Twisted policy ;-). I'll describe our reasoning for the policy and the effect that it's had.

If a developer working on a feature is inclined to type

# somebody, do something to something else
somebody.doSomething(somethingElse)

then the intent is typically to say "this is a confusing snippet of code, it bears explanation beyond what is clearly readable in the code itself". Documenting the "why" and "what" happens in the docstring, so we only have comments in areas of code where the "how" or the "who" (i.e. "only JP and Glyph understand this") is difficult to understand. In Twisted, this is almost always some wicked busted operating system API, like how you get undocumented errnos out of send(), recv() or select() on some platform.

However, if they're inclined to type

# TODO: somebody should really do more stuff here
somebody.doSomething(somethingElse)

most commonly, what this means is "before I merge this feature, I really want to fix this potential issue so we don't need to deal with it in the future". The policy applies to "XXX" because in my personal experience it's always used as a synonym for "TODO".

My subjective impression is that in 75% of the cases where a comment like this is caught in a code review, the original author of the branch had actually forgotten they had added a comment like this and will eagerly go and fix it, to avoid dealing with the issues that it might raise in a release in the future. I can't recall a case where an author (myself included) was not happy to be alerted to a comment like this.

In a significant minority of cases, although the author has still usually forgotten (if they had remembered they would have filed a ticket preemptively, after all), the task in question is actually too big or too esoteric to bother with in the same branch, so it gets deferred. Filing a ticket is the natural extension of this, so that we don't lose track of the task in question.

The consequence of this policy has been higher quality bugfixes (because contributors don't forget what they're doing), and increased visibility into problems in our codebase(s) (since we use this on Divmod projects too) which directly translates into less duplicate grumbling about the same issue over and over again in pointless comments.

Also, I hear a lot of kvetching about Twisted policy, especially review latency (I hear you guys are familiar with that problem as well), but as I recall nobody has ever complained that this was too much work. When we instituted the policy, as with most of our policies, there is a grandfather clause: we didn't go back and trawl through all of the existing XXX and TODO comments, we just made it so you couldn't add new XXX and TODO comments without filing and linking to a ticket.

I haven't seen "XXX" used as a way to say "pay special attention" because our use of comments is extremely spare. If there's a comment at all that means the code is sufficiently wacky that it warrants special attention. Also, we don't write much code in C, so the preponderance of "sufficiently wacky" code is small ;).

Looking at the Python codebase, though, (specifically the 2.6 release), there are definitely "XXX" comments which do not appear to be actionable tasks by their authors. Guido's certainly right about that: not all XXX comments in Python are "to be resolved". But, a subjective assessment of the first couple of pages of such comments indicates that a majority of them are. A lot of them are cryptic, too. One of the benefits of review scrutiny of comments is a chance for the reviewer to pipe up and say "This comment is incomprehensible. Can you explain it (on a ticket) in a way that someone else might fix it, rather than a note to yourself?" (with the implication being "a note to yourself that nobody, including you six months later, is ever going to understand").

I think that a Twisted reviewer encountering a comment like /* XXX From here until type is allocated, "return NULL" leaks bases! */ would probably say something like "XXX should probably be something more informative, like 'WARNING:' or 'CAREFUL:'". Clearly if XXX is being used in this way there's no point in filing a ticket, and the comment here is perfectly clear; the policy is not intended to cover this case.

In fact it might be good to have a whole special comment convention for notes about the constantly horrific tightrope-walk over lava that is memory-management and object lifecycles in C. If anything, "XXX" is a bit too easy to glaze over in a codebase with >2000 "XXX" comments. But I don't write this type of code too often, so maybe to a more fluent speaker of C "/* XXX memory junk */" is an idiomatic thing to see.

Personally I'd recommend that Python adopt such a policy, but mainly for your own sanity. It's not something that I think would make a huge difference to anyone outside the core development group, except perhaps to generate smaller, easier issues for new contributors to work on for practice. Certainly not something I'm going to push for - my intent here is just to share our experience.



More information about the Python-Dev mailing list