Python 3 upgrade

About ten years ago, Guido van Rossum, the Python author and Benevolent Dictator for Life (BDFL), along with the Python community, decided to make several concurrent backward-incompatible changes to Python 2.5 and release a new version, Python 3.0.



The main changes were:

Unfortunately, this list was comprehensive enough to break virtually every Python script ever written. So, to ease the transition, 3.0 and 2.6 were released simultaneously, with the other, backward-compatible new features of 3.0 also included in 2.6. This happened again with the releases of 3.1 and 2.7. Not wanting to maintain two Pythons, the BDFL declared that 2.7 was the last Python 2 series release.

These changes (mostly the Unicode one) also made Python much slower in version 3.0. Since then, however, there have been many speed and memory improvements. Combined with new C extensions for some modules, Python 3 is now usually as fast as or faster than Python 2.

The original, officially sanctioned upgrade path was one of the biggest issues with moving to Python 3. A script, 2to3, was supposed to convert code to Python 3, and then the old version could eventually be dropped. This script required a lot of manual intervention (things like the Unicode strings require knowledge of the programmer’s intent), and required library authors to maintain two separate versions of the code. This hindered initial adoption, with many major libraries unwilling to maintain two versions just to support Python 3.

Unofficial authors tried making a new script, 3to2, which worked significantly better, but was still hindered by the need to maintain dual copies of the code.

Another decision may also have slowed adoption. Partway through the development of Python 3.2 up to 3.4, the decision was made to avoid adding any new features, to give authors time to adapt code to a stable Python 3. This statement could be taken in reverse: why update to Python 3 when it does not have any new features to improve your program? The original changes (as listed above) were not enough to cause mass adoption.

This dreary time in Python development is now drawing to a close, thanks to a change in the way authors started approaching Python compatibility. There is such a good overlap between Python 2.6 or Python 2.7 and Python 3.3+ that a single code base can support them both. The reason for this is the following three things:

These were capitalized on by the unofficial library authors, and now almost every library is available as a single code base for Python 2 and 3. Most of the new standard libraries, and even a few language features, are regularly backported to Python 2 as well.

Libraries to ease the transition

Six

The original compatibility library, six (so named because 2 times 3 is 6), provides tools to make writing 2 and 3 compatible code easy. You just import six, and then access the renamed standard libraries from six.moves. There are wrappers for the changed features, such as six.with_metaclass.

These features are not hard to wrap yourself, so many libraries implement their own six wrapper to reduce dependencies and overhead.
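As a rough sketch of what such a home-grown wrapper might look like (this is not six's actual implementation; the names PY2, string_types, and is_text are hypothetical and chosen only for illustration), a small compatibility module can branch on the interpreter version and guard renamed imports:

```python
import sys

# Hypothetical flag; many projects define something similar.
PY2 = sys.version_info[0] == 2

# Text types differ between the two major versions.
if PY2:
    string_types = (basestring,)  # noqa: F821 (only defined on Python 2)
else:
    string_types = (str,)

# Renamed standard library modules can be handled with a guarded import.
try:
    from urllib.parse import urlparse  # Python 3 location
except ImportError:
    from urlparse import urlparse  # Python 2 location


def is_text(value):
    """Return True if value is a text string on either major version."""
    return isinstance(value, string_types)


netloc = urlparse("https://example.com/path").netloc
```

The rest of the code base then imports urlparse and is_text from this one module instead of repeating the version checks.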

Future

This is a newer library with a unique approach. Instead of forcing the use of a special wrapper, the idea of future is to simply allow code to be written in Python 3, but work in Python 2. For example, from builtins import input will do nothing on Python 3 (builtins is where input lives), but on Python 2 with future installed, builtins is part of future and will import the future version. You can even patch in the Python 3 standard library names with a standard_library.install_aliases() function.
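A minimal sketch of that import pattern follows. It assumes the third-party future package is installed when the code runs on Python 2; on Python 3 the same line resolves to the standard builtins module and changes nothing. The names same_input and squares are only for illustration.

```python
import builtins
from builtins import input, range  # no-op on Python 3; supplied by future on Python 2

# On Python 3 the imported name is simply the regular builtin.
same_input = input is builtins.input

# range behaves like the Python 3 lazy range on both versions.
squares = [n * n for n in range(4)]
```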

Future also comes with its own version of the conversion scripts, called futurize and pasteurize, which use the future library to make code that runs on one version run on both versions. An alpha feature, the autotranslate function, can turn a library that supports only Python 2 into a Python 3 version on import.

Backports

Several of the new libraries and features have been backported to Python 2. I’m not including ones that were backported in an official Python release, like argparse.

New features in modern Python

These are features that have been released in a version of Python after 3.0 that are not in the older Python 2 series:

Formatted string literals (3.6)

Finally! You can write code such as the following now:

x = 2
print(f"The value of x is {x}")

This is indicated by the f prefix, and can take almost any valid Python expression. It does not have the scope issues that the old workaround, .format(**locals()), encounters.
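One way to see the scope issue is with a nested function. In the sketch below (function names are hypothetical), the inner function using .format(**locals()) never references name as a variable, so name is absent from its locals() and the lookup fails, while the f-string version uses ordinary closure lookup:

```python
def greet_with_format():
    name = "world"

    def inner():
        # `name` only appears inside the string, so it is not a local or
        # closure variable of inner() and is missing from locals().
        return "hello {name}".format(**locals())

    return inner()


def greet_with_fstring():
    name = "world"

    def inner():
        # The f-string references `name` as a real expression.
        return f"hello {name}"

    return inner()


try:
    greet_with_format()
    format_failed = False
except KeyError:
    format_failed = True

result = greet_with_fstring()

# Expressions and nested format specs are allowed inside the braces.
width = 8
padded = f"{3.14159:>{width}.2f}"
```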

In Python 3.8, you can add an equals sign after the expression to print a variable or expression along with its name:

x = 2
print(f"{x = }")  # prints: x = 2

Syntax for variable annotations (3.6)

This will be great for type hints, IDEs, and Cython, but the syntax is a little odd for Python. It’s based on function annotations. A quick example:

from typing import List

an_empty_list_of_ints: List[int] = []
will_be_a_str_later: str

This stores the variable name and the annotation in an __annotations__ dictionary for the module or the class that they are in.
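A short sketch of what ends up in that dictionary for a class (the Job class and its fields are made up for illustration). Note that an annotation without an assignment records the type but does not create the attribute:

```python
from typing import List


class Job:
    retries: int = 3   # annotated and assigned
    tags: List[str]    # annotated only; no attribute is created


annotations = Job.__annotations__
```

Here annotations maps "retries" to int and "tags" to List[str], while Job.tags itself does not exist until something assigns it.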

Semi-ordered dictionaries (3.6 and 3.7)

Python dictionaries are now partially ordered; due to huge speedups in the C definition of ordered dicts, the dict class is now guaranteed to iterate in order as long as nothing has been changed since the dict creation. This may sound restrictive, but it enables many features; you can now discover the order keyword arguments were passed, the order class members were added, and the order of {} dicts. If you want to continue to keep or control the order, you should move the dict to an OrderedDict, as before. This makes ordered dictionaries much easier to create, too.

In Python 3.6, only class member order and keyword argument order are ensured by the language; the ordering of {} is an implementation detail. This detail works in both CPython 3.6 and all versions of PyPy, however. It became language mandated in Python 3.7.
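The following sketch, assuming Python 3.7 or later, illustrates keyword argument order, the order of a {} dict, and handing the dict to an OrderedDict when the order needs to be controlled. The function and variable names are illustrative only:

```python
from collections import OrderedDict


def keyword_order(**kwargs):
    """Return keyword argument names in the order they were passed."""
    return list(kwargs)


passed = keyword_order(z=1, a=2, m=3)

# A plain dict keeps insertion order.
settings = {"b": 1, "a": 2}
settings["c"] = 3
literal_order = list(settings)

# OrderedDict is still the tool for explicit reordering.
ordered = OrderedDict(settings)
ordered.move_to_end("b")
reordered = list(ordered)
```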

DataClasses (3.7)

Most programmers coming from other languages want some form of class designed to store data. Creating these data-centric classes is verbose and ugly in Python, since you have to put all the setup in the __init__ method rather than directly in the class as in other languages, and you have to manage initialization, printing, comparison, etc. yourself. Now, with DataClasses, you can do it with a nice syntax:

from dataclasses import dataclass

@dataclass
class Vector:
    x: float
    y: float
    z: float

This will create (by default) __init__, __repr__, and __eq__. You can also ask for order, unsafe_hash, and frozen.
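As an illustration of the generated methods and two of the optional flags, here is a small sketch using a hypothetical Version class (not from the original example):

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(order=True, frozen=True)
class Version:
    major: int
    minor: int = 0  # fields may have defaults


v = Version(3, 7)
text = repr(v)                        # generated __repr__
same = v == Version(3, 7)             # generated __eq__
older = Version(3, 6) < v             # order=True adds comparisons

try:
    v.major = 4                       # frozen=True forbids assignment
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Comparisons with order=True treat the fields as a tuple, in the order they are declared.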

This is similar to, and less powerful than, the popular attrs library (available for all versions of Python). This library module, like many others, was also backported to older versions of Python. However, the variable type annotations are not available in older versions.

Walrus operator (3.8)

You can now use a special assignment operator, := (called the walrus operator due to the eyes + tusks appearance) almost anywhere that a normal = was not allowed. So, for example, you can now do this:

if x := long_check():
    print(x)
# x is still in scope here; an if block does not create a new scope

This might be very handy for setting up machine learning tools, where you set a number of layers then refer to it further down in the same dict or function call.
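A sketch of both uses, assuming Python 3.8 or later. The regular expression, the model dictionary, and its keys are invented for illustration and do not come from any particular machine learning library:

```python
import re

line = "version=3.8"

# Bind the match object and test it in a single expression.
if (match := re.match(r"version=(\d+)\.(\d+)", line)) is not None:
    major, minor = int(match.group(1)), int(match.group(2))

# The name bound by := remains available after the if statement.
still_bound = match is not None

# Set a value once and reuse it later in the same call; keyword
# arguments are evaluated left to right.
model = dict(n_layers=(n := 4), units=[32] * n)
```

Parentheses around the := expression are required when it is used as a keyword argument value.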

Other smaller features:

Status of Python

The current status of the python releases is as follows:

Further reading