[Python-Dev] PEP 414 - Unicode Literals for Python 3

Vinay Sajip vinay_sajip at yahoo.co.uk
Tue Feb 28 17:08:06 CET 2012


Ezio Melotti <ezio.melotti gmail.com> writes:

For every CPython bug that I fix I first apply the patch on 2.7, then on 3.2 and then on 3.3. Most of the time I don't even need to change anything while applying the patch to 3.2, sometimes I have to do some trivial fixes. This is also true for another personal 12kloc project* where I'm using the two-branches approach.

I hear what you say about the personal project, but IMO CPython is atypical (as far as this discussion is concerned), not least because it's not a pure-Python project.

For me, the costs of having two branches are: 1) a one-time conversion when the Python3-compatible branch is created (can be done easily with 2to3);

Yes, but the amount of ease is project-dependent. For example, 2to3 wraps values() method calls with list(), which is a reasonable thing to do for dicts; but when presented with Django's querysets, which have a values() method that should not be wrapped, you have to go through and sort things out. I'm not knocking 2to3, which I think is great; it's just that things go well sometimes, and less well at other times.
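To make the values() point concrete, here is a minimal sketch. FakeQuerySet is a hypothetical stand-in written for this example, not Django's real QuerySet; the only assumption is that its values() returns another chainable object, which is why a mechanical list() wrapper gets in the way.

```python
# Hypothetical stand-in for a queryset; this is not Django's actual API.
class FakeQuerySet(object):
    def __init__(self, rows):
        self.rows = list(rows)

    def __iter__(self):
        return iter(self.rows)

    def values(self):
        # Unlike dict.values(), this returns another chainable object.
        return FakeQuerySet(self.rows)

    def filter(self, **conditions):
        kept = [row for row in self.rows
                if all(row.get(k) == v for k, v in conditions.items())]
        return FakeQuerySet(kept)


d = {"a": 1, "b": 2}
# For a dict, the added list() wrapper is harmless: it just turns the
# 3.x view back into a real list, as on 2.x.
dict_values = sorted(list(d.values()))

qs = FakeQuerySet([{"name": "x", "active": True},
                   {"name": "y", "active": False}])

# Hand-written code chains further calls on the result of values().
chained = qs.values().filter(active=True)

# Once values() is wrapped in list(), the result is a plain list and
# the chained call no longer works.
try:
    list(qs.values()).filter(active=True)
    wrap_broke_chain = False
except AttributeError:
    wrap_broke_chain = True
```

In a case like this the wrapper has to be removed by hand after the conversion, which is the kind of project-dependent cleanup I mean.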

With the shared code base approach, the costs are: 1) a one-time conversion to "fix" the code base and make it run on both 2.x and 3.x; 2) keep using and having to deal with hacks in order to keep it running.

Which hacks do you mean, if you're only interested in 2.6+?
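For reference, a sketch of the sort of compatibility code a single-source module targeting 2.6+ and 3.x might carry. The names PY3, text_type and to_text are made up for illustration and are not taken from any particular project.

```python
import sys

PY3 = sys.version_info[0] >= 3

# One of the few version checks needed: the text type has a different name.
if PY3:
    text_type = str
else:
    text_type = unicode  # only defined on 2.x


def to_text(value, encoding="utf-8"):
    # Hypothetical helper: decode bytes, convert anything else to text.
    if isinstance(value, bytes):  # 'bytes' is an alias of str on 2.6+
        return value.decode(encoding)
    return text_type(value)


raw = b"caf\xc3\xa9"  # b'' literals are accepted from 2.6 onwards
greeting = to_text(raw)
round_trip = greeting.encode("utf-8")
```

Whether a handful of definitions like these counts as "hacks" is, I suppose, a matter of taste.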

With the first approach, you also have two clean and separate code bases, with no hacks; when you stop using Python 2, you end up with a clean Python 3 branch. The one-time conversion also seems easier in the first case.

(Note: there are also other costs -- e.g. releasing -- that I haven't considered because they don't affect me personally, but I'm not sure they are big enough to make the two-branches approach worse.)

I don't believe there's a one-size-fits-all approach. The two-branches approach is appealing, and I have no quarrel with it, but I contend that big projects like Django would be reluctant to switch, or would take much longer to switch to 3.x, if they had to maintain separate branches. Given the size of their user community, they have to follow strict release procedures, which (even when just running on 2.x) smaller projects can be more relaxed about.

You forgot to mention the part which is most time-consuming day-to-day: making changes and testing. For the two-branch approach, it's

Change on 2.x
Test on 2.x
Commit on 2.x
Merge to 3.x
Possibly change on 3.x
Test on 3.x
Commit on 3.x

where each "test" step, if failures occur, might take you back to a previous "change" step.

For the single codebase, that's

Change
Test on 2.x
Test on 3.x
Commit

This, to me, is the single big advantage of the single codebase approach, and the productivity improvements outweigh code purity issues which are, in the grand scheme of things, not all that large.

Another advantage is DRY: you don't have to worry about forgetting to merge some changes from 2.x to 3.x. Haven't we all been there one time or another? I know I have, though I try not to make a habit of it ;-)

After the initial conversion of the code base, the fixes are mostly trivial, so people don't need to write two patches (most of the patches we get for CPython are either against 2.7 or 3.2, and sometimes they even apply cleanly to both).

Fixes may be trivial, but new features might not always be so.

Regards,

Vinay Sajip
