[Python-Dev] Move selected documentation repos to PSF BitBucket account?
Nick Coghlan ncoghlan at gmail.com
Mon Nov 24 08:25:21 CET 2014
On 24 November 2014 at 02:55, Brett Cannon <brett at python.org> wrote:
On Sun Nov 23 2014 at 6:18:46 AM Nick Coghlan <ncoghlan at gmail.com> wrote:
Those features are readily accessible without changing the underlying version control system (whether self-hosted through Kallithea or externally hosted through BitBucket or RhodeCode). Thus the folks that want to change the version control system need to make the case that doing so will provide additional benefits that can't be obtained in a less disruptive way.
I guess my question is who and what is going to be disrupted if we go with Guido's suggestion of switching to GitHub for code hosting? Contributors won't be disrupted at all since most people are more familiar with GitHub than with Bitbucket (how many times have we all heard that someone has even learned Mercurial just to contribute to Python?). Core developers might be, based on some learned workflow, but I'm willing to bet we all know git at this point (and for those of us who still don't like it, myself included, there are GUI apps to paper over it or hg-git for those that prefer a CLI). Our infrastructure will need to be updated, but how much of it is that hg-specific, short of the command to check out the repo? Obviously Bitbucket is a much more minor change, since it just means updating a URL, but changing "hg clone" to "git clone" isn't crazy either. Georg, Antoine, or Benjamin can point out if I'm wrong on this, maybe Donald or someone in the infrastructure committee.
Are you volunteering to write a competing PEP for a migration to git and GitHub?
I won't be updating PEP 474 to recommend moving to either, as I don't think that would be a good outcome for the Python ecosystem as a whole. It massively undercuts any possible confidence anyone else might have in Mercurial, BitBucket, RhodeCode, Kallithea & Allura (all Python-based version control, or version control hosting, systems). If we as the Python core development team don't think any of those are good enough to meet the modest version control needs of our support repos, why on earth would anyone else choose them?
In reality, I think most of these services are pretty interchangeable - GitHub's just been the most effective at the venture capital powered mindshare grab business model (note how many of the arguments here stem from the fact folks like other things that only interoperate with GitHub, and no other repository hosting providers - that's the core of the a16z funded approach to breaking the "D" in DVCS and ensuring that GitHub's investors are in a position to clip the ticket when GitHub eventually turns around and takes advantage of its dominant market position to increase profit margins).
That's why I consider it legitimate to treat supporting fellow Python community members as the determining factor - a number of the available options meet the "good enough" bar from a technical perspective, so it's reasonable to take other commercial and community factors into account when making a final decision.
Probably the biggest thing I can think of that would need updating is our commit hooks. Once again Georg, Antoine, or Benjamin could say how difficult it would be to update those hooks.
If CPython eventually followed suit in migrating to git (as seems inevitable if all the other repos were to switch), then every buildbot will also need to be updated to have git installed (and Mercurial removed).
From my perspective, swapping out Mercurial for git achieves exactly nothing in terms of alleviating the review bottleneck (since the core developers that strongly prefer the git UI will already be using an adapter), and is in fact likely to make it worse by putting the greatest burden of adapting to the change on the folks that are already under the greatest time pressure.
That's not entirely true. If you are pushing for a shift to PRs in our patch acceptance workflow then Bitbucket vs. GitHub isn't fundamentally any different in terms of benefit, and I would honestly argue that GitHub's PR experience is better. IOW either platform is of equal benefit.
Yes, I agree any real benefit comes from the PR workflow, not from git. This is why I consider "written in Python" to be a valid determining factor - multiple services meet the "good enough" bar from a practical perspective, allowing other considerations to come to the fore.
(Also note that this proposal does NOT currently cover CPython itself. Neither GitHub nor BitBucket is set up to handle maintenance branches well, and any server side merge based workflow improvements for CPython are gated on fixing the NEWS file maintenance issue. However, once you contemplate moving CPython, then the ripple effects on other systems become much larger)
It's also worth keeping in mind that changing the underlying VCS means changing all the automation scripts, rather than just updating the configuration settings to reflect a new hosting URL.
What automation scripts are there that would require updating? I would like to see a list, with the difficulty of moving each one mentioned, to know what the impact would be.
For the documentation repos, just the devguide and PEP update scripts come to mind. As noted above, the implications get more significant if the main CPython repo eventually follows suit and the buildbot infrastructure all needs to be updated.
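To make the distinction between "new hosting URL" and "new VCS" concrete, here is a minimal sketch. It is not the actual devguide or PEP update scripts, whose contents are not shown in this thread; it assumes a hypothetical script that reads its hosting details from a small config object. RepoConfig, clone_command, update_command and the example.invalid URLs are illustrative names only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepoConfig:
    """Hypothetical settings an update script might read."""
    vcs: str  # "hg" or "git"
    url: str  # where the repo is hosted


# Commands that differ per VCS. Moving hosts only touches RepoConfig.url;
# moving from hg to git means every entry here has to be reviewed.
VCS_COMMANDS = {
    "hg": {"clone": ["hg", "clone"], "update": ["hg", "pull", "--update"]},
    "git": {"clone": ["git", "clone"], "update": ["git", "pull", "--ff-only"]},
}


def clone_command(cfg: RepoConfig, dest: str) -> list[str]:
    """Build the argv for an initial checkout of the configured repo."""
    return VCS_COMMANDS[cfg.vcs]["clone"] + [cfg.url, dest]


def update_command(cfg: RepoConfig) -> list[str]:
    """Build the argv for refreshing an existing checkout."""
    return list(VCS_COMMANDS[cfg.vcs]["update"])


current = RepoConfig(vcs="hg", url="https://hg.python.org/devguide")
# Same VCS, new host (placeholder URL): only the config value changes.
moved_host = RepoConfig(vcs="hg", url="https://example.invalid/psf/devguide")
# New VCS (placeholder URL): the commands themselves change as well.
moved_vcs = RepoConfig(vcs="git", url="https://example.invalid/psf/devguide.git")

for cfg in (current, moved_host, moved_vcs):
    print(" ".join(clone_command(cfg, "devguide")))
```

In this sketch a host move is a one-line config edit, while a VCS move also changes every command the script issues, along with any output parsing or hooks built around those commands.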
Orchestrating this kind of infrastructure enhancement for Red Hat is my day job, and you almost always want to go for the lowest impact, lowest risk approach that will alleviate the bottleneck you're worried about while changing the smallest possible number of elements in the overall workflow management system.
Sure, but I would never compare our infrastructure needs to Red Hat. =) You also have to be conservative in order to minimize downtime and impact for cost reasons. As an open source project we don't have those kinds of worries; we just have to worry about keeping everyone happy.
Switching to a proprietary hosting service written in a mixture of Ruby, C & bash wouldn't make me happy.
If that's the end result of this thread, I'll be sorry I even suggested the idea of reverting to external hosting at all. That outcome would be the antithesis of the PSF's overall mission, whereas I started this thread at least in part due to a discussion about ways the PSF board might be able to help resolve some of the current CPython workflow issues. Offering the use of the PSF's existing BitBucket org account as hosting location for Mercurial repos was an idea I first brought up in that PSF board thread, and then moved over here since it seemed worthwhile to at least make the suggestion and see what people thought.
That underlying calculation doesn't really change much even when the units shift from budget dollars to volunteers' time and energy.
So here is what I want to know to focus this discussion: First, what new workflow are you proposing regardless of repo hosting provider? Are you proposing we maintain just mirrors and update the devguide to tell people to fork on the hosting provider, make their changes, generate a patch (which can be as simple as telling people how to find the raw diff on the hosting provider), and then upload the patch to the issue tracker just like today? Are you going farther and saying we have people send PRs on the hosting site, have them point to their PR in the issue tracker, and then we accept PRs (I'm going on the assumption we are not dropping our issue tracker)?
I am proposing that we switch at least some documentation-only repos to a full PR based workflow, including support for online editing (to make it easy to fix simple typos and the like without even leaving the browser). CPython itself would remain completely unaffected.
The proposal in PEP 474 is that we do that by setting up Kallithea as forge.python.org. This thread was about considering BitBucket as an alternative approach. RhodeCode would be a third option that still didn't involve switching away from Mercurial.
Second, to properly gauge the impact of switching from hg to git from an infrastructure perspective, what automation scripts do we have and how difficult would it be to update them to use git instead of hg? This is necessary simply to know where we would need to update URLs, let alone change DVCS.
The problems with changing version control systems don't really become significant until we start talking about switching CPython itself, rather than the support repos.
Third, do our release managers care about hg vs. git strongly? They probably use the DVCS the most directly and at a lower level by necessity compared to anyone else.
Fourth, do any core developers feel strongly about not using GitHub? Now please notice I said "GitHub" and not "git"; I think the proper way to frame this whole discussion is that we are deciding if we want to switch to Bitbucket or GitHub, which provide a low-level API for their version control storage service through hg or git, respectively. I personally dislike git, but I really like GitHub and I don't even notice git there since I use GitHub's OS X app; as I said, I view this as choosing a platform and not the underlying DVCS, as I have happily chosen to access the GitHub hosting service through an app that is not git (it's like accessing a web app through its web page or its REST API).
Yes, I object strongly to the use of GitHub when there are commercially supported services written in Python like BitBucket and RhodeCode available if we want to go the external hosting route, and other options like the RhodeCode derived Kallithea if we want to run a self-hosted forge. RhodeCode are even PSF sponsors - I'm sure they'd be willing to discuss the possibility of hosting core development repos on their service.
If I was doing a full risk management breakdown, then RhodeCode would be the obvious winner, as not only are they PSF sponsors, but reverting to self-hosting on Kallithea would remain available as an exit strategy.
I only suggested BitBucket in this thread because the PSF already has some repos set up there, so that seemed easier than establishing a new set of repos on a RhodeCode hosted instance.
At least for me, until we get a clear understanding of what workflow changes we are asking of both contributors and core developers, and exactly what work would be necessary to update our infrastructure for either Bitbucket or GitHub, we can't really have a reasonable discussion that isn't going to be full of guessing.
All repos that migrated away from hg.python.org would move to a PR based workflow, rather than manual patch management on the issue tracker. The migrated repos would likely also use their integrated issue tracker rather than the main CPython one at bugs.python.org.
Externally hosted repos would likely retain a regularly updated mirror on hg.python.org to ensure the source remains available even in the event of problems affecting the external hosting provider.
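As a rough illustration of what "a regularly updated mirror" could involve, the sketch below pulls from the external host into a local clone and pushes on to the mirror. It is only an assumption about how such a job might look, not a description of existing python.org infrastructure; mirror_commands, sync_mirror, the paths and the URLs are all hypothetical, and scheduling (cron or similar) is left out.

```python
import subprocess


def mirror_commands(local_repo, upstream_url, mirror_url):
    """Return the hg invocations for one mirror refresh, in order."""
    return [
        # Fetch new changesets from the external hosting provider.
        ["hg", "--repository", local_repo, "pull", upstream_url],
        # Forward them to the mirror.
        ["hg", "--repository", local_repo, "push", mirror_url],
    ]


def sync_mirror(local_repo, upstream_url, mirror_url, run=subprocess.run):
    """Run one refresh; `run` is injectable so the logic can be tested."""
    pull, push = mirror_commands(local_repo, upstream_url, mirror_url)
    run(pull, check=True)
    result = run(push)
    # "hg push" exits with status 1 when there is nothing to push,
    # which is not a failure for a mirror job.
    if result.returncode not in (0, 1):
        raise subprocess.CalledProcessError(result.returncode, push)


if __name__ == "__main__":
    # Placeholder values only; nothing here is a real deployment setting.
    for cmd in mirror_commands(
        "/srv/mirrors/devguide",
        "https://example.invalid/psf/devguide",
        "ssh://hg@example.invalid/devguide",
    ):
        print(" ".join(cmd))
```

Keeping command construction separate from execution means the job can be dry-run (as in the final lines) before it is pointed at real repositories.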
And I'm still in support no matter what of breaking out the HOWTOs and the tutorial into their own repos for easier updating (having to update the Python porting HOWTO in three branches is a pain when it should be consistent across Python releases).
Agreed.
Cheers, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia