[Python-Dev] PEP 414 - Unicode Literals for Python 3

Ezio Melotti ezio.melotti at gmail.com
Tue Feb 28 15:20:46 CET 2012


On 28/02/2012 14.19, Antoine Pitrou wrote:

On Tuesday, 28 February 2012 at 22:14 +1000, Nick Coghlan wrote:

>> If you're using separate branches, then your Python 2 code isn't being made forward compatible with Python 3. Yes, it avoids making your Python 2 code uglier, but it means maintaining two branches in parallel until you drop Python 2 support.
>
> IMO, maintaining two branches shouldn't be much more work than maintaining hacks so that a single codebase works with two different programming languages.

+10

For every CPython bug that I fix, I first apply the patch on 2.7, then on 3.2, and then on 3.3. Most of the time I don't even need to change anything when applying the patch to 3.2; sometimes I have to make some trivial fixes. This is also true for another personal 12 kLOC project* where I'm using the two-branches approach.

For me, the costs of having two branches are:

1. a one-time conversion when the Python 3-compatible branch is created (can be done easily with 2to3);
2. merging the fixes I apply to the Python 2 branch (and with a modern DVCS this is not really an issue).

With the shared code base approach, the costs are:

1. a one-time conversion to "fix" the code base and make it run on both 2.x and 3.x;
2. continuing to use, and having to deal with, hacks in order to keep it running.

With the first approach, you also have two clean and separate code bases, with no hacks; when you stop using Python 2, you end up with a clean Python 3 branch. The one-time conversion also seems easier in the first case.

(Note: there are also other costs -- e.g. releasing -- that I haven't considered because they don't affect me personally, but I'm not sure they are big enough to make the two-branches approach worse.)

>> You've once again raised the barrier to entry: either people contribute two patches, or they accept that their patch may languish until someone else writes the patch for the other version.
>
> Again that's wrong. If you cleverly use 2to3 to port between branches, patches only have to be written against the 2.x version.

After the initial conversion of the code base, the fixes are mostly trivial, so people don't need to write two patches (most of the patches we get for CPython are against either 2.7 or 3.2, and sometimes they even apply cleanly to both).

Using 2to3 to generate the 3.x code automatically for every change applied to the 2.x branch (or to convert everything when a new package is installed) sounds wrong to me. I wouldn't trust generated code even if 2to3 were a better tool.

That said, I successfully used the shared code base approach, with print_function, unicode_literals, and a couple of try/except blocks around the imports, for a few one-file scripts (targeting 2.7/3.2) that I wrote recently.
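For illustration only, a one-file script written in that style might look roughly like the sketch below. This is not one of the scripts mentioned above; the helper name `describe` and the URL are made up for the example, and the only pieces taken from the message are the two `__future__` imports and the try/except around an import.

```python
# The __future__ import has to be the first statement in the file.
# On 2.7 it enables print() as a function and makes plain string
# literals text (unicode); on 3.x it is accepted and changes nothing.
from __future__ import print_function, unicode_literals

try:
    from urllib.parse import urlparse  # Python 3 location
except ImportError:
    from urlparse import urlparse  # Python 2 location


def describe(url):
    """Return a short text description of a URL (hypothetical helper)."""
    parts = urlparse(url)
    return "host={0} path={1}".format(parts.netloc, parts.path)


summary = describe("http://www.python.org/dev/peps/pep-0414/")
print(summary)
```

For a script this small the compatibility layer amounts to a few lines at the top of the file; the objection above concerns what happens when the same tricks have to be spread across a large code base.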

TL;DR: the two-branches approach usually works better (at least IME) than the shared code base approach, doesn't necessarily require more work, and doesn't need ugly hacks to work.

* In this case all the string literals I had were already text (rather than bytes), and even without using unicode_literals they worked out of the box when I moved the code to 3.x. There was, however, one place where it didn't work, and that turned out to be a bug even in Python 2, because I was mixing bytes and text.
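As a rough sketch of the kind of bytes/text mix-up described in this footnote (not the actual code from that project; the variable names and values are invented), Python 3 refuses to concatenate the two types, which is how such a bug surfaces during a port:

```python
text = "id="    # a text (str) literal on Python 3
data = b"42"    # bytes, e.g. something read from a socket or a binary file

# Python 3 raises TypeError when text and bytes are mixed; Python 2
# would coerce implicitly here and could hide the problem.
try:
    text + data
    mixed_fails = False
except TypeError:
    mixed_fails = True

# The fix is to decode explicitly at the boundary, with a known encoding
# (ASCII is an assumption made for this example).
fixed = text + data.decode("ascii")
print(mixed_fails, fixed)
```

Decoding once, at the point where the bytes enter the program, keeps the rest of the code working purely with text on both versions.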

Best Regards, Ezio Melotti

> Regards
>
> Antoine.
