[Python-Dev] Community buildbots and Python release quality metrics

glyph at divmod.com
Thu Jun 26 17:18:55 CEST 2008


Today on planetpython.org, Doug Hellmann announced the June issue of Python Magazine. The cover story this month is about Pybots, "the fantastic automation system that has been put in place to make sure new releases of Python software are as robust and stable as possible".

Last week, there was a "beta" release of Python which, according to the community buildbots, cannot run any existing Python software. Normally I'd be complaining here about Twisted, but in fact Twisted is doing relatively well right now, with only 80 failing tests. Django apparently cannot even be imported.

The community buildbots have been in a broken state for months now. I've been restraining myself from whinging about this, but now that it's getting close to release, it's time to get these into shape, or it's time to get rid of them.

There are a few separate issues here. One is, to what extent should Python actually be compatible between releases? There's not much point in saying "this release is compatible with Python 2.5" if all Python 2.5 programs need to be modified in order to run on Python 2.6. Of course, it's quite likely that there are some third-party Python programs that continue to work, but python.org provides links to two projects that don't and none that do.

Another way to phrase this question is, "whose responsibility is it to make Python 2.5 programs run on Python 2.6"? Or, "what happens when the core team finds out that a change they have made has broken some Python software 'in the wild'"?

Here are a couple of ways to answer these questions:

  1. It's the Python core team's responsibility. There should be Python buildbots for lots of different projects and they should all be green before the code can be considered [one of: allowed in trunk, alpha, beta, RC].
  2. It's the Python core team's responsibility as long as the major projects are using public APIs. The code should not be considered [alpha, beta, RC] if there are any known breakages in public APIs.
  3. It's only the Python core team's responsibility to notify projects of incompatible changes of which they are aware, which may occur in public or private APIs. Major projects with breakages will be given a grace period before a [beta, RC, final] to release a forward-compatible version.
  4. The status quo: every new Python release can (and will, at least speaking in terms of 2.6) break lots of popular software, with no warning aside from the beta period. There are no guarantees.
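
To make the difference between options 1 and 2 concrete, here is a rough sketch of what a release gate might look like. Everything in it is hypothetical: the BuildResult record, the blocking_failures and may_release functions, and the sample data are invented for illustration and say nothing about how the Python release process or the pybots actually work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildResult:
    """Hypothetical record of one community buildbot run."""
    project: str
    passed: bool
    uses_only_public_apis: bool


def blocking_failures(results, policy):
    """Return the projects whose failures would block a release.

    policy 1: any failing project blocks the release.
    policy 2: only failing projects that stick to public APIs block it.
    """
    if policy not in (1, 2):
        raise ValueError("only options 1 and 2 define a hard gate")
    blockers = []
    for result in results:
        if result.passed:
            continue
        if policy == 1 or result.uses_only_public_apis:
            blockers.append(result.project)
    return blockers


def may_release(results, policy):
    """A release may go out only if nothing is blocking it."""
    return not blocking_failures(results, policy)


# Made-up sample data, not real buildbot output.
results = [
    BuildResult("project_a", passed=True, uses_only_public_apis=True),
    BuildResult("project_b", passed=False, uses_only_public_apis=False),
]
print(may_release(results, policy=1))  # False
print(may_release(results, policy=2))  # True
```

Options 3 and 4 do not fit this shape, since neither one lets a failing buildbot stop a release; option 3 only requires notification and a grace period.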

For obvious reasons, I'd prefer #1, but any of the above is better than #4. I don't think #4 is an intentional choice, either; I think it's the result of there being no clear consequence for community buildbots failing. Or, for that matter, of core buildbots failing. Investigating the history is tricky (I see a bunch of emails from Barry here, but I'm not sure which is the final state) but I believe the beta went out with a bunch of buildbots offline or failing.

A related but independent issue is: should the community buildbots continue to exist? An automated test is only as good as the consequences of its failure. As it stands, the community buildbots seem to be nothing but an embarrassment. I don't mean this to slam Grig, or the Python team - I mean, literally, if there is no consequence of their failure then the only use I can see for the data the buildbots currently provide (i.e. months and months of failures) is to embarrass the Python core developers. python.org's purpose should not be to provide negative press for Python ;), so if this continues to be the case, the buildbots probably shouldn't be linked from python.org.

This isn't just an issue with changes to Python breaking stuff; the pybots' configuration is apparently not well maintained, since the buildbots which say "2.5" aren't passing either, and that's not a problem with the code, it's just that the slaves are actually running against trunk at HEAD.
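
A sanity check for that kind of mislabeling could be as small as the sketch below. The data layout, the mislabeled_builders function, the builder names, and the branch names in the mapping are all assumptions made up for this example; I have not looked at how the pybots configuration is actually stored.

```python
def mislabeled_builders(builders, expected_branch):
    """Return the names of builders whose label does not match their checkout.

    builders maps a builder name to a (label, checked_out_branch) pair.
    expected_branch maps a label to the branch that label should build.
    Both layouts are hypothetical.
    """
    bad = []
    for name, (label, branch) in sorted(builders.items()):
        if expected_branch.get(label) != branch:
            bad.append(name)
    return bad


# Assumed branch names and invented builders, for illustration only.
expected_branch = {"2.5": "release25-maint", "trunk": "trunk"}
builders = {
    "example-2.5": ("2.5", "trunk"),
    "example-trunk": ("trunk", "trunk"),
}
print(mislabeled_builders(builders, expected_branch))  # ['example-2.5']
```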

Ultimately these tools only exist to ensure the quality of Python releases. The really critical question here is, what does it mean to have a Python release that is high-quality? There are some obvious things: it shouldn't crash, it should conform to its own documentation. Personally I think "it passes all of its own tests" and "it runs existing Python code" are important metrics too, but the most important thing I'm trying to get across here is that there should be a clear understanding of which goals the release / QA / buildbot / test process is trying to accomplish. For me, personally, it really needs to be clear when I can say "you guys screwed up and broke stuff", and when I just have to suck it up and deal with the consequences of a new version of Python in Twisted.

It's definitely bad for all of us if the result is that new releases of Python just break everything. Users don't care where the responsibility lies; they just know that stuff doesn't work, so we should decide on a process which allows Twisted (and other popular projects, like Django, Plone, pytz, PyGTK, Pylons, ...) to (mostly) run when new versions of Python are released.


