[Python-Dev] SVN <-> HG workflow to split Python Library by Module

Brett Cannon brett at python.org
Fri Jul 2 23:37:08 CEST 2010


On Fri, Jul 2, 2010 at 12:25, anatoly techtonik <techtonik at gmail.com> wrote:

I planned to publish this proposal once it was finally ready and tested, on the assumption that the Subversion repository would stay online and up-to-date after the Mercurial migration. But recent threads showed that there is currently no tested mechanism to sync the Subversion repository back with Mercurial, so it will probably become outdated quickly, and the proposal won't have a chance to be evaluated. So now is better than never.

So, this is a way to split modules from the monolithic Subversion repository into several Mercurial mirrors - one mirror for each module (or whatever directory structure you like). This lets you concentrate your work on only one module at a time ("distutils", "CGIHTTPServer" etc.) without caring much about anything else. It is exceptionally useful for occasional external "contributors" like me, and for folks on Windows who don't possess Visual Studio to compile Python and are forced to use whatever version they have installed to create and test patches.

But modules do not live in an isolated world; they are dependent on changes made to other modules. Isolating them from other modules whose semantics change during development will lead to skew and improper patches.

As for Windows users who don't have Visual Studio, there is a free version that compiles Python fine: http://www.python.org/dev/faq/#id8 .

-Brett

Here is a picture if you feel bored - https://docs.google.com/drawings/edit?id=1c9FDQ27BnaIew1T7Tr-rFg1OCdPVS9w3TQREOkzyjk&hl=en

An example of the split distutils module - http://bitbucket.org/techtonik/distutils

The split is not perfect, but the process can be polished - it is the first version, which I managed to get only this morning. More important is that the HG repository is incrementally synchronized. The split is not perfect because, in particular, I see that the documentation dir is not pulled in. But it is a working proof of concept you can test yourself using the code from: http://bitbucket.org/techtonik/python-split

You will also need a patched version of hgsvn from http://bitbucket.org/techtonik/hgsvn

How does it work
----------------

The module is described as a series of paths inside a typical Subversion checkout.

1. On the first run, the refresh.py script from python-split creates a shallow SVN checkout with only the required files, using the distutils.module.def module definition.
2. The second run of refresh.py imports the shallow checkout into Mercurial.
3. The third run imports the rest of the history, pulling only changesets relevant to the given paths.

Workflow
--------

The diagram shows patches being pulled from local clones of the split repositories into the master Mercurial mirror, but it won't work this way, because the hashes of revisions in the direct mirror won't match the hashes in the split repositories - that's why some hash lookup/sync procedure is needed to correctly process incoming patches. This workflow works with hash sync only when changes are pushed back to the central Subversion repository from local clones (possibly through another intermediate normalizing repository). Changes pushed this way are streamlined and could be downloaded into the stable branch of other mirrors as a single line of development. I borrowed the streamlining concept from the Go contribution guide, as it really helps to review chaotic Mercurial commits.
http://golang.org/doc/contribute.html#Codereview

Maintaining a centralized Subversion repository will require additional properties to be set, but this is doable. I don't know how to make a module split with Mercurial alone. http://mercurial.selenic.com/wiki/ShallowClone is still a draft (and a complicated one), and Mercurial 1.6, which was released today, doesn't contain anything revolutionary to propose an alternative. I am exhausted.

-- anatoly t.
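The third refresh.py run described above, pulling only changesets relevant to the given paths, can be pictured with a small sketch. This is not the code from python-split: the message does not show the format of distutils.module.def or how refresh.py is implemented, so the `Changeset` type, the function names, the path list and the revision numbers below are all hypothetical, chosen only to illustrate path-based filtering of history.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Changeset:
    """Hypothetical stand-in for one Subversion revision."""
    revision: int
    changed_paths: Tuple[str, ...]


def touches_module(changeset: Changeset, module_paths: Iterable[str]) -> bool:
    """Return True if any changed path is one of the module paths or lies under one."""
    prefixes = [p.rstrip("/") for p in module_paths]
    for changed in changeset.changed_paths:
        for prefix in prefixes:
            # Match the path itself or anything below it, but not siblings
            # that merely share a name prefix (e.g. Lib/distutils_extra.py).
            if changed == prefix or changed.startswith(prefix + "/"):
                return True
    return False


def relevant_changesets(history: Iterable[Changeset],
                        module_paths: Iterable[str]) -> List[Changeset]:
    """Keep only the changesets that touch the module, in original order."""
    paths = list(module_paths)
    return [cs for cs in history if touches_module(cs, paths)]


# Assumed module definition: a plain list of paths inside the checkout.
DISTUTILS_PATHS = ["Lib/distutils", "Doc/distutils"]

# Made-up history, for illustration only.
history = [
    Changeset(101, ("Lib/distutils/core.py",)),
    Changeset(102, ("Lib/CGIHTTPServer.py",)),
    Changeset(103, ("Doc/distutils/index.rst", "Lib/os.py")),
    Changeset(104, ("Lib/distutils_extra.py",)),
]

selected = relevant_changesets(history, DISTUTILS_PATHS)
print([cs.revision for cs in selected])
```

Note that revision 103 in this made-up history is kept even though it also changes Lib/os.py; a split repository built this way would record only the distutils part of such a mixed changeset, which is one concrete way a module can drift from the modules it depends on.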


Python-Dev mailing list
Python-Dev at python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/brett%40python.org


