[Python-ideas] Using only patches for pulling changes in hg.python.org (original) (raw)

Stephen J. Turnbull stephen at xemacs.org
Thu Jul 8 01:22:00 CEST 2010


Georg Brandl writes:

Am 06.07.2010 07:16, schrieb Stephen J. Turnbull:

Georg Brandl writes:

Am 04.07.2010 17:26, schrieb Antoine Pitrou:

On Sun, 04 Jul 2010 15:46:53 +0200 Dirkjan Ochtman <dirkjan at ochtman.nl> wrote:

Fourth, one-patch-per-issue is too restrictive. Small commits are useful because they're way easier to review. Concatenate several small commits leading up to a single issue fix into a single patch and it gets much harder to read.

I don't agree with that. The commits obviously won't be independent because they will be motivated by each other (or even dependent on each other), therefore you have to remember what the other commits do when reviewing one of them. What's more, when reading "hg log" months or years later, it is hard to make sense of a single commit because you don't really know what issue it was meant to contribute to fix.

I know that's how Mercurial devs do things, but I don't really like it.

I think the best of both worlds is to encourage contributors to send more complicated patches in a series of easy-to-review steps, but when committing to Python, make one changeset out of them.

I don't see how this addresses Antoine's problem of connecting commits to issues at all.

I wasn't addressing Antoine's original problem, rather his reply to Dirkjan.

Huh? Are you referring to something other than the part of his post that you quoted? Antoine writes "you have to remember what the other commits do when reviewing them" and "it is hard to make sense of a single commit [in a series] because you don't know what issue it was meant to fix".

I admit I'm not really sure what his issue is. It seems to me that connecting commits is what a feature branch (in conjunction with rebase) is designed to achieve. If you don't like rebase, you can either work fast enough that your whole sequence is done before the mainline moves on significantly, or you can refrain from updating until done (and have a potentially messy merge), or you can use MQ (which is really just a way of rebasing without the shame ;-).

I'll have to test it, but AFAIK in all of the above strategies, as long as you don't push to the public repo until done, the logs of the commits on the feature branch should all be adjacent in the natural order of hg log. That seems to me to be the optimal strategy, in combination with reading long parts of history in a graphical DAG browser.

Of course, that assumes that random pieces of the fix aren't dispersed among commits. In that case the logs will still be hard to read and understand, as will the diffs. People who like to commit early and often should indeed be encouraged to edit their feature branches to make each individual commit make sense to reviewers. (MQ helps to address this, as does Bazaar's loom feature or StGit.) Feature branches don't automatically organize commits in an intelligible way, that requires an intelligence driving the process. But they do make it possible.

Once you have feature branches, then there's a question of the external issue. Here reviewers should pay attention to the log message, and make sure it describes the problem well, and includes cross references to any documentation (tracker issue or ML thread). But that's no different from the current process.

I think that in many cases the process of coming up with coherent changesets that are reviewable will indeed result in a single commit to address the whole issue. But there will also be multicommit patterns that make sense, such as "refactor API and update current clients -> use new feature in a few places". The thing to remember is that DVCSes not only record a frozen view of history accurately, but can also be used to flexibly reorganize the presentation of that history "as it should have happened".

I think of these workflows as opportunities to improve the quality of information presented by the history. But they aren't mandated by adopting hg. Contributors and reviewers who are satisfied with the current process should continue to refine a set of changes to a single commit. hg is certainly flexible enough to allow that, with several different workflows. And Antoine's worries (AIUI) are not unfounded. Eg, we should not allow people to be lazy and submit a feature branch with changes randomly assigned to different commits and log messages like "Lunch time! commit progress to date." But that's a social problem; I think that conventions will quickly evolve from the one patch per issue workflow to a well-organized feature branch per issue (as appropriate) because python-dev reviewers will demand it.



More information about the Python-ideas mailing list