[Python-Dev] PEP 414 - Unicode Literals for Python 3

Nick Coghlan ncoghlan at gmail.com
Tue Feb 28 02:45:48 CET 2012


On Tue, Feb 28, 2012 at 9:19 AM, Terry Reedy <tjreedy at udel.edu> wrote:

> Since writing the above, I realized that the following is a realistic scenario. 2.6 or 2.7 code a) uses has/set/getattr, so unicode literals would require a change; b) uses non-ascii chars in unicode literals; c) uses (or could be converted to use) print as a function; and d) otherwise uses a common 2-3 subset. Such would only need the u prefix addition to run under both Pythons. This works the other way, of course, for backporting code. So I am replacing 'most' with 'some unknown-to-me fraction' ;-).

Yep, that's exactly the situation I'm in with PulpDist (a web app that primarily targets deployment on RHEL 6, which means Python 2.6). Since I preformat all my print output with either str.format or str.join (or use the logging module) and always use "except exc as var" to catch exceptions, the natural way to write Python 2 code for me is almost source compatible with Python 3. The only big discrepancy I'm currently aware of? Unicode literals.
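
A minimal sketch of that style, to make the point concrete. The names (describe, parse_count, LABEL) are hypothetical and not taken from PulpDist; the only line that Python 3.0-3.2 would reject is the one with the u prefix, which Python 3.3+ accepts again under PEP 414:

```python
import logging

log = logging.getLogger(__name__)

# Explicit unicode literal: fine on 2.x, a SyntaxError on 3.0-3.2,
# accepted again on 3.3+ (PEP 414).
LABEL = u"caf\u00e9"


def describe(name, count):
    # Preformat the text with str.format, so emitting it later does not
    # depend on print being a statement or a function.
    return u"{0}: {1} items".format(name, count)


def parse_count(text):
    try:
        return int(text)
    except ValueError as exc:  # "except ... as" is valid on 2.6+ and 3.x
        log.warning("bad count %r: %s", text, exc)
        return 0
```

Everything apart from the u prefixes is already in the common 2.6+/3.x subset.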

Now, I could retrofit the entire code base with the unicode_literals import and str("") for native strings, but that has problems of its own:

- it doesn't match the Pulp upstream, so it would make it harder for them to review my plugins and client API usage code (or integrate them into the default plugin set or client support API if they decide they like them). Given that I'm one of the guinea pigs for experimental Pulp APIs and have to dive into their code on occasion, it would also be a challenge for me to switch modes when debugging.
- it doesn't match Django (at least, not 1.3, which is the version I'm using), which is another potential annoyance when debugging
- it doesn't match any of the other Django applications I use (once again, debugging may lead to me looking at this code)
- it doesn't match the standard library (yep, you guessed it, I'd have to mode switch when looking at standard library code, too)
- it doesn't match the intuitions of current Python 2 developers who aren't up to speed with the niceties of Python 3 porting
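
For reference, the retrofit described above (the unicode_literals import plus str("") for native strings) would look roughly like this in every module. This is only a sketch of the convention; the variable names and values are hypothetical:

```python
from __future__ import unicode_literals

# With the import, an unprefixed literal is text on both versions:
# unicode on 2.6/2.7, str on 3.x.
text = "caf\u00e9"

# Wrapping a literal in str() yields the "native" string type:
# the 8-bit str on 2.x (via an ASCII encode), the text str on 3.x.
native = str("Content-Type")

# Binary data needs an explicit b prefix on both versions.
data = b"\x00\x01"
```

The catch is that an unprefixed literal now means something different from the same literal in any Python 2 module that lacks the import, which is the mismatch each item in the list refers to.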

Basically, using the unicode_literals import would significantly raise the barrier to entry for PulpDist as a Python 2 project, as well as forcing me to switch mental models for text processing whenever I have to look at the code in a dependency during a debugging session. Therefore, given that Python 2 will be my primary target for the immediate future (and any collaborators are likely to be RHEL 6 and hence Python 2 focused), I don't want to use that particular future import. The downside of that choice (currently) is that it kills any possibility of running any of it on Python 3, even the command line client or the web front end after Django gets ported. With explicit unicode literals being restored in Python 3.3, though, I'm a lot more optimistic about the feasibility of porting it without too much effort (as well as the prospect of other Django app dependencies gaining Python 3 support).

In terms of third party upstreams, Python 3 compatibility patches that affect every single string literal in the entire project (either directly or by converting the entire project to the "unicode_literals" import) aren't likely to even get reviewed, let alone accepted. By contrast (for a project that already only supports 2.6+), cleaning up print statements and exception handling should be a much smaller patch that is easy to both review and accept. Making it as easy as possible for maintainers who don't really care about Python 3 to accept patches from people who do care is a very good thing.

There are still other problems that are going to affect the folks playing at the wire protocol level, but the lack of unicode literals is a big one that affects the entire application stack.

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
