What’s New in Python 2.6

Author:

A.M. Kuchling (amk at amk.ca)

This article explains the new features in Python 2.6, released on October 1, 2008. The release schedule is described in PEP 361.

The major theme of Python 2.6 is preparing the migration path to Python 3.0, a major redesign of the language. Whenever possible, Python 2.6 incorporates new features and syntax from 3.0 while remaining compatible with existing code by not removing older features or syntax. When it’s not possible to do that, Python 2.6 tries to do what it can, adding compatibility functions in a future_builtins module and a -3 switch to warn about usages that will become unsupported in 3.0.

Some significant new packages have been added to the standard library, such as the multiprocessing and json modules, but there aren’t many new features that aren’t related to Python 3.0 in some way.

Python 2.6 also sees a number of improvements and bugfixes throughout the source. A search through the change logs finds there were 259 patches applied and 612 bugs fixed between Python 2.5 and 2.6. Both figures are likely to be underestimates.

This article doesn’t attempt to provide a complete specification of the new features, but instead provides a convenient overview. For full details, you should refer to the documentation for Python 2.6. If you want to understand the rationale for the design and implementation, refer to the PEP for a particular new feature. Whenever possible, “What’s New in Python” links to the bug/patch item for each change.

Python 3.0

The development cycle for Python versions 2.6 and 3.0 was synchronized, with the alpha and beta releases for both versions being made on the same days. The development of 3.0 has influenced many features in 2.6.

Python 3.0 is a far-ranging redesign of Python that breaks compatibility with the 2.x series. This means that existing Python code will need some conversion in order to run on Python 3.0. However, not all the changes in 3.0 necessarily break compatibility. In cases where new features won’t cause existing code to break, they’ve been backported to 2.6 and are described in this document in the appropriate place. Some of the 3.0-derived features are:

Python 3.0 adds several new built-in functions and changes the semantics of some existing builtins. Functions that are new in 3.0 such as bin() have simply been added to Python 2.6, but existing builtins haven’t been changed; instead, the future_builtins module has versions with the new 3.0 semantics. Code written to be compatible with 3.0 can do from future_builtins import hex, map as necessary.

A new command-line switch, -3, enables warnings about features that will be removed in Python 3.0. You can run code with this switch to see how much work will be necessary to port code to 3.0. The value of this switch is available to Python code as the boolean variable sys.py3kwarning, and to C extension code as Py_Py3kWarningFlag.
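
The following small sketch (assuming the interpreter was started with -3) shows how a program can test the flag itself:

import sys

if sys.py3kwarning:
    # The interpreter was started with -3, so Python 3.0 porting warnings are enabled.
    print >> sys.stderr, 'Python 3.0 porting warnings are enabled'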

See also

The 3xxx series of PEPs, which contains proposals for Python 3.0. PEP 3000 describes the development process for Python 3.0. Start with PEP 3100 that describes the general goals for Python 3.0, and then explore the higher-numbered PEPs that propose specific features.

Changes to the Development Process

While 2.6 was being developed, the Python development process underwent two significant changes: we switched from SourceForge’s issue tracker to a customized Roundup installation, and the documentation was converted from LaTeX to reStructuredText.

New Issue Tracker: Roundup

For a long time, the Python developers had been growing increasingly annoyed by SourceForge’s bug tracker. SourceForge’s hosted solution doesn’t permit much customization; for example, it wasn’t possible to customize the life cycle of issues.

The infrastructure committee of the Python Software Foundation therefore posted a call for issue trackers, asking volunteers to set up different products and import some of the bugs and patches from SourceForge. Four different trackers were examined: Jira, Launchpad, Roundup, and Trac. The committee eventually settled on Jira and Roundup as the two candidates. Jira is a commercial product that offers no-cost hosted instances to free-software projects; Roundup is an open-source project that requires volunteers to administer it and a server to host it.

After posting a call for volunteers, a new Roundup installation was set up at https://bugs.python.org. One installation of Roundup can host multiple trackers, and this server now also hosts issue trackers for Jython and for the Python web site. It will surely find other uses in the future. Where possible, this edition of “What’s New in Python” links to the bug/patch item for each change.

Hosting of the Python bug tracker is kindly provided by Upfront Systems of Stellenbosch, South Africa. Martin von Löwis put a lot of effort into importing existing bugs and patches from SourceForge; his scripts for this import operation are at https://svn.python.org/view/tracker/importer/ and may be useful to other projects wishing to move from SourceForge to Roundup.

New Documentation Format: reStructuredText Using Sphinx

The Python documentation had been written in LaTeX since the project started around 1989. In the 1980s and early 1990s, most documentation was printed out for later study, not viewed online. LaTeX was widely used because it provided attractive printed output while remaining straightforward to write once the basic rules of the markup were learned.

Today LaTeX is still used for writing publications destined for printing, but the landscape for programming tools has shifted. We no longer print out reams of documentation; instead, we browse through it online and HTML has become the most important format to support. Unfortunately, converting LaTeX to HTML is fairly complicated and Fred L. Drake Jr., the long-time Python documentation editor, spent a lot of time maintaining the conversion process. Occasionally people would suggest converting the documentation into SGML and later XML, but performing a good conversion is a major task and no one ever committed the time required to finish the job.

During the 2.6 development cycle, Georg Brandl put a lot of effort into building a new toolchain for processing the documentation. The resulting package is called Sphinx, and is available from https://www.sphinx-doc.org/.

Sphinx concentrates on HTML output, producing attractively styled and modern HTML; printed output is still supported through conversion to LaTeX. The input format is reStructuredText, a markup syntax supporting custom extensions and directives that is commonly used in the Python community.

Sphinx is a standalone package that can be used for writing, and almost two dozen other projects (listed on the Sphinx web site) have adopted Sphinx as their documentation tool.

See also

Documenting Python

Describes how to write for Python’s documentation.

Sphinx

Documentation and code for the Sphinx toolchain.

Docutils

The underlying reStructuredText parser and toolset.

PEP 343: The ‘with’ statement

The previous version, Python 2.5, added the ‘with’ statement as an optional feature, to be enabled by a from __future__ import with_statement directive. In 2.6 the statement no longer needs to be specially enabled; this means that with is now always a keyword. The rest of this section is a copy of the corresponding section from the “What’s New in Python 2.5” document; if you’re familiar with the ‘with’ statement from Python 2.5, you can skip this section.

The ‘with’ statement clarifies code that previously would use try...finally blocks to ensure that clean-up code is executed. In this section, I’ll discuss the statement as it will commonly be used. In the next section, I’ll examine the implementation details and show how to write objects for use with this statement.

The ‘with’ statement is a control-flow structure whose basic structure is:

with expression [as variable]:
    with-block

The expression is evaluated, and it should result in an object that supports the context management protocol (that is, has __enter__() and __exit__() methods).

The object’s __enter__() is called before with-block is executed and therefore can run set-up code. It also may return a value that is bound to the name variable, if given. (Note carefully that variable is not assigned the result of expression.)

After execution of the with-block is finished, the object’s __exit__() method is called, even if the block raised an exception, and can therefore run clean-up code.

Some standard Python objects now support the context management protocol and can be used with the ‘with’ statement. File objects are one example:

with open('/etc/passwd', 'r') as f:
    for line in f:
        print line
        ... more processing code ...

After this statement has executed, the file object in f will have been automatically closed, even if the for loop raised an exception part-way through the block.

Note

In this case, f is the same object created by open(), because __enter__() returns self.

The threading module’s locks and condition variables also support the ‘with’ statement:

lock = threading.Lock()
with lock:
    # Critical section of code
    ...

The lock is acquired before the block is executed and always released once the block is complete.

The localcontext() function in the decimal module makes it easy to save and restore the current decimal context, which encapsulates the desired precision and rounding characteristics for computations:

from decimal import Decimal, Context, localcontext

# Displays with default precision of 28 digits
v = Decimal('578')
print v.sqrt()

with localcontext(Context(prec=16)):
    # All code in this block uses a precision of 16 digits.
    # The original context is restored on exiting the block.
    print v.sqrt()

Writing Context Managers

Under the hood, the ‘with’ statement is fairly complicated. Most people will only use ‘with’ in company with existing objects and don’t need to know these details, so you can skip the rest of this section if you like. Authors of new objects will need to understand the details of the underlying implementation and should keep reading.

A high-level explanation of the context management protocol is:

The expression is evaluated and should result in an object called a “context manager”. The context manager must have __enter__() and __exit__() methods.
The context manager’s __enter__() method is called. The value returned is assigned to the variable in the as clause, if any; otherwise the value is simply discarded.
The code in with-block is executed.
If with-block raises an exception, the __exit__(type, value, traceback) method is called with the exception details, the same values returned by sys.exc_info(). The method’s return value controls whether the exception is re-raised: any false value re-raises the exception, and True will result in suppressing it.
If with-block didn’t raise an exception, __exit__() is still called, but type, value, and traceback are all None.

Let’s think through an example. I won’t present detailed code but will only sketch the methods necessary for a database that supports transactions.

(For people unfamiliar with database terminology: a set of changes to the database are grouped into a transaction. Transactions can be either committed, meaning that all the changes are written into the database, or rolled back, meaning that the changes are all discarded and the database is unchanged. See any database textbook for more information.)

Let’s assume there’s an object representing a database connection. Our goal will be to let the user write code like this:

db_connection = DatabaseConnection()
with db_connection as cursor:
    cursor.execute('insert into ...')
    cursor.execute('delete from ...')
    # ... more operations ...

The transaction should be committed if the code in the block runs flawlessly or rolled back if there’s an exception. Here’s the basic interface for DatabaseConnection that I’ll assume:

class DatabaseConnection:
    # Database interface
    def cursor(self):
        "Returns a cursor object and starts a new transaction"
    def commit(self):
        "Commits current transaction"
    def rollback(self):
        "Rolls back current transaction"

The __enter__() method is pretty easy, having only to start a new transaction. For this application the resulting cursor object would be a useful result, so the method will return it. The user can then add as cursor to their ‘with’ statement to bind the cursor to a variable name.

class DatabaseConnection:
    ...
    def __enter__(self):
        # Code to start a new transaction
        cursor = self.cursor()
        return cursor

The __exit__() method is the most complicated because it’s where most of the work has to be done. The method has to check if an exception occurred. If there was no exception, the transaction is committed. The transaction is rolled back if there was an exception.

In the code below, execution will just fall off the end of the function, returning the default value of None. None is false, so the exception will be re-raised automatically. If you wished, you could be more explicit and add a return statement at the marked location.

class DatabaseConnection:
    ...
    def __exit__(self, type, value, tb):
        if tb is None:
            # No exception, so commit
            self.commit()
        else:
            # Exception occurred, so rollback.
            self.rollback()
            # return False

The contextlib module

The contextlib module provides some functions and a decorator that are useful when writing objects for use with the ‘with’ statement.

The decorator is called contextmanager(), and lets you write a single generator function instead of defining a new class. The generator should yield exactly one value. The code up to the yield will be executed as the __enter__() method, and the value yielded will be the method’s return value that will get bound to the variable in the ‘with’ statement’s as clause, if any. The code after the yield will be executed in the __exit__() method. Any exception raised in the block will be raised by the yield statement.

Using this decorator, our database example from the previous section could be written as:

from contextlib import contextmanager

@contextmanager
def db_transaction(connection):
    cursor = connection.cursor()
    try:
        yield cursor
    except:
        connection.rollback()
        raise
    else:
        connection.commit()

db = DatabaseConnection()
with db_transaction(db) as cursor:
    ...

The contextlib module also has a nested(mgr1, mgr2, ...) function that combines a number of context managers so you don’t need to write nested ‘with’ statements. In this example, the single ‘with’ statement both starts a database transaction and acquires a thread lock:

lock = threading.Lock()
with nested(db_transaction(db), lock) as (cursor, locked):
    ...

Finally, the closing() function returns its argument so that it can be bound to a variable, and calls the argument’s .close() method at the end of the block.

import urllib, sys
from contextlib import closing

with closing(urllib.urlopen('http://www.yahoo.com')) as f:
    for line in f:
        sys.stdout.write(line)

See also

PEP 343 - The “with” statement

PEP written by Guido van Rossum and Nick Coghlan; implemented by Mike Bland, Guido van Rossum, and Neal Norwitz. The PEP shows the code generated for a ‘with’ statement, which can be helpful in learning how the statement works.

The documentation for the contextlib module.

PEP 366: Explicit Relative Imports From a Main Module

Python’s -m switch allows running a module as a script. When you ran a module that was located inside a package, relative imports didn’t work correctly.

The fix for Python 2.6 adds a module.__package__ attribute. When this attribute is present, relative imports will be relative to the value of this attribute instead of the __name__ attribute.

PEP 302-style importers can then set __package__ as necessary. The runpy module that implements the -m switch now does this, so relative imports will now work correctly in scripts running from inside a package.
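
As a small sketch (the package and module names here are hypothetical), a module inside a package can now use an explicit relative import and still be run with -m:

# pkg/main.py -- run with:  python -m pkg.main
# runpy sets __package__ to 'pkg', so the relative import below resolves correctly.
from . import utils

if __name__ == '__main__':
    print 'running from package', __package__     # prints: running from package pkg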

PEP 370: Per-user site-packages Directory

When you run Python, the module search path sys.path usually includes a directory whose path ends in "site-packages". This directory is intended to hold locally installed packages available to all users using a machine or a particular site installation.

Python 2.6 introduces a convention for user-specific site directories. The directory varies depending on the platform:

Unix and Mac OS X: ~/.local/
Windows: %APPDATA%/Python

Within this directory, there will be version-specific subdirectories, such as lib/python2.6/site-packages on Unix/Mac OS and Python26/site-packages on Windows.

If you don’t like the default directory, it can be overridden by an environment variable. PYTHONUSERBASE sets the root directory used for all Python versions supporting this feature. On Windows, the directory for application-specific data can be changed by setting the APPDATA environment variable. You can also modify the site.py file for your Python installation.

The feature can be disabled entirely by running Python with the -s option or setting the PYTHONNOUSERSITE environment variable.
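
A quick way to see the per-user directories on a particular machine is through the site module (a small sketch; USER_BASE and USER_SITE are the attributes the site module gains under PEP 370, and the exact paths depend on your platform and settings):

import site
import sys

print site.USER_BASE                # e.g. ~/.local on Unix
print site.USER_SITE                # e.g. ~/.local/lib/python2.6/site-packages
print site.USER_SITE in sys.path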

See also

PEP 370 - Per-user site-packages Directory

PEP written and implemented by Christian Heimes.

PEP 371: The multiprocessing Package

The new multiprocessing package lets Python programs create new processes that will perform a computation and return a result to the parent. The parent and child processes can communicate using queues and pipes, synchronize their operations using locks and semaphores, and can share simple arrays of data.

The multiprocessing module started out as an exact emulation of the threading module using processes instead of threads. That goal was discarded along the path to Python 2.6, but the general approach of the module is still similar. The fundamental class is the Process, which is passed a callable object and a collection of arguments. The start() method sets the callable running in a subprocess, after which you can call the is_alive() method to check whether the subprocess is still running and the join() method to wait for the process to exit.

Here’s a simple example where the subprocess will calculate a factorial. The function doing the calculation is written strangely so that it takes significantly longer when the input argument is a multiple of 4.

import time
from multiprocessing import Process, Queue

def factorial(queue, N):
    "Compute a factorial."
    # If N is a multiple of 4, this function will take much longer.
    if (N % 4) == 0:
        time.sleep(.05 * N/4)

    # Calculate the result
    fact = 1L
    for i in range(1, N+1):
        fact = fact * i

    # Put the result on the queue
    queue.put(fact)

if __name__ == '__main__':
    queue = Queue()

    N = 5

    p = Process(target=factorial, args=(queue, N))
    p.start()
    p.join()

    result = queue.get()
    print 'Factorial', N, '=', result

A Queue is used to communicate the result of the factorial. The Queue object is stored in a global variable. The child process will use the value of the variable when the child was created; because it’s a Queue, parent and child can use the object to communicate. (If the parent were to change the value of the global variable, the child’s value would be unaffected, and vice versa.)

Two other classes, Pool and Manager, provide higher-level interfaces. Pool will create a fixed number of worker processes, and requests can then be distributed to the workers by calling apply() or apply_async() to add a single request, and map() or map_async() to add a number of requests. The following code uses a Pool to spread requests across 5 worker processes and retrieve a list of results:

from multiprocessing import Pool

def factorial(N):
    "Compute a factorial."
    ...

p = Pool(5)
result = p.map(factorial, range(1, 1000, 10))
for v in result:
    print v

This produces the following output:

1
39916800
51090942171709440000
8222838654177922817725562880000000
33452526613163807108170062053440751665152000000000
...

The other high-level interface, the Manager class, creates a separate server process that can hold master copies of Python data structures. Other processes can then access and modify these data structures using proxy objects. The following example creates a shared dictionary by calling the dict() method; the worker processes then insert values into the dictionary. (Locking is not done for you automatically, which doesn’t matter in this example. Manager’s methods also include Lock(), RLock(), and Semaphore() to create shared locks.)

import time
from multiprocessing import Pool, Manager

def factorial(N, dictionary):
    "Compute a factorial."
    # Calculate the result
    fact = 1L
    for i in range(1, N+1):
        fact = fact * i

    # Store result in dictionary
    dictionary[N] = fact

if __name__ == '__main__':
    p = Pool(5)
    mgr = Manager()
    d = mgr.dict()      # Create shared dictionary

    # Run tasks using the pool
    for N in range(1, 1000, 10):
        p.apply_async(factorial, (N, d))

    # Mark pool as closed -- no more tasks can be added.
    p.close()

    # Wait for tasks to exit
    p.join()

    # Output results
    for k, v in sorted(d.items()):
        print k, v

This will produce the output:

1 1
11 39916800
21 51090942171709440000
31 8222838654177922817725562880000000
41 33452526613163807108170062053440751665152000000000
51 15511187532873822802242430164693032110632597200169861120000...

See also

The documentation for the multiprocessing module.

PEP 371 - Addition of the multiprocessing package

PEP written by Jesse Noller and Richard Oudkerk; implemented by Richard Oudkerk and Jesse Noller.

PEP 3101: Advanced String Formatting

In Python 3.0, the % operator is supplemented by a more powerful string formatting method, format(). Support for the str.format() method has been backported to Python 2.6.

In 2.6, both 8-bit and Unicode strings have a .format() method that treats the string as a template and takes the arguments to be formatted. The formatting template uses curly brackets ({, }) as special characters:

# Substitute positional argument 0 into the string.
>>> "User ID: {0}".format("root")
'User ID: root'

# Use the named keyword arguments
>>> "User ID: {uid}   Last seen: {last_login}".format(
...     uid="root",
...     last_login = "5 Mar 2008 07:20")
'User ID: root   Last seen: 5 Mar 2008 07:20'

Curly brackets can be escaped by doubling them:

"Empty dict: {{}}".format() "Empty dict: {}"

Field names can be integers indicating positional arguments, such as {0}, {1}, etc. or names of keyword arguments. You can also supply compound field names that read attributes or access dictionary keys:

>>> import sys
>>> print 'Platform: {0.platform}\nPython version: {0.version}'.format(sys)
Platform: darwin
Python version: 2.6a1+ (trunk:61261M, Mar  5 2008, 20:29:41)
[GCC 4.0.1 (Apple Computer, Inc. build 5367)]

>>> import mimetypes
>>> 'Content-type: {0[.mp4]}'.format(mimetypes.types_map)
'Content-type: video/mp4'

Note that when using dictionary-style notation such as [.mp4], you don’t need to put any quotation marks around the string; it will look up the value using .mp4 as the key. Strings beginning with a number will be converted to an integer. You can’t write more complicated expressions inside a format string.

So far we’ve shown how to specify which field to substitute into the resulting string. The precise formatting used is also controllable by adding a colon followed by a format specifier. For example:

# Field 0: left justify, pad to 15 characters
# Field 1: right justify, pad to 6 characters
>>> fmt = '{0:15} ${1:>6}'
>>> fmt.format('Registration', 35)
'Registration    $    35'
>>> fmt.format('Tutorial', 50)
'Tutorial        $    50'
>>> fmt.format('Banquet', 125)
'Banquet         $   125'

Format specifiers can reference other fields through nesting:

>>> fmt = '{0:{1}}'
>>> width = 15
>>> fmt.format('Invoice #1234', width)
'Invoice #1234  '
>>> width = 35
>>> fmt.format('Invoice #1234', width)
'Invoice #1234                      '

The alignment of a field within the desired width can be specified:

Character Effect
< (default) Left-align
> Right-align
^ Center
= (For numeric types only) Pad after the sign.
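
For example, here is a short interactive sketch of the alignment characters (the field widths are chosen arbitrarily):

>>> '{0:<10}'.format('left')
'left      '
>>> '{0:>10}'.format('right')
'     right'
>>> '{0:^10}'.format('mid')
'   mid    '
>>> '{0:=10d}'.format(-42)
'-       42'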

Format specifiers can also include a presentation type, which controls how the value is formatted. For example, floating-point numbers can be formatted as a general number or in exponential notation:

>>> '{0:g}'.format(3.75)
'3.75'
>>> '{0:e}'.format(3.75)
'3.750000e+00'

A variety of presentation types are available. Consult the 2.6 documentation for a complete list; here’s a sample:

b Binary. Outputs the number in base 2.
c Character. Converts the integer to the corresponding Unicode character before printing.
d Decimal Integer. Outputs the number in base 10.
o Octal format. Outputs the number in base 8.
x Hex format. Outputs the number in base 16, using lower-case letters for the digits above 9.
e Exponent notation. Prints the number in scientific notation using the letter ‘e’ to indicate the exponent.
g General format. This prints the number as a fixed-point number, unless the number is too large, in which case it switches to ‘e’ exponent notation.
n Number. This is the same as ‘g’ (for floats) or ‘d’ (for integers), except that it uses the current locale setting to insert the appropriate number separator characters.
% Percentage. Multiplies the number by 100 and displays in fixed (‘f’) format, followed by a percent sign.

Classes and types can define a __format__() method to control how they’re formatted. It receives a single argument, the format specifier:

def __format__(self, format_spec):
    if isinstance(format_spec, unicode):
        return unicode(str(self))
    else:
        return str(self)

There’s also a format() builtin that will format a single value. It calls the type’s __format__() method with the provided specifier:

>>> format(75.6564, '.2f')
'75.66'

See also

Format String Syntax

The reference documentation for format fields.

PEP 3101 - Advanced String Formatting

PEP written by Talin. Implemented by Eric Smith.

PEP 3105: print As a Function

The print statement becomes the print() function in Python 3.0. Making print() a function makes it possible to replace the function by doing def print(...) or importing a new function from somewhere else.

Python 2.6 has a __future__ import that removes print as language syntax, letting you use the functional form instead. For example:

from __future__ import print_function
print('# of entries', len(dictionary), file=sys.stderr)

The signature of the new function is:

def print(*args, sep=' ', end='\n', file=None)

The parameters are:

args: positional arguments whose values will be printed out.
sep: the separator, which will be printed between arguments.
end: the ending text, which will be printed after all of the arguments have been output.
file: the file object to which the output will be sent.
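
For instance, a small sketch (assuming the __future__ import shown above) of how the keyword parameters change the separator and line ending:

from __future__ import print_function

print('a', 'b', 'c')              # a b c
print('a', 'b', 'c', sep='-')     # a-b-c
print('x', end='')                # no trailing newline
print('y')                        # continues the same line: xy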

See also

PEP 3105 - Make print a function

PEP written by Georg Brandl.

PEP 3110: Exception-Handling Changes

One error that Python programmers occasionally make is writing the following code:

try:
    ...
except TypeError, ValueError:  # Wrong!
    ...

The author is probably trying to catch both TypeError and ValueError exceptions, but this code actually does something different: it will catch TypeError and bind the resulting exception object to the local name "ValueError". The ValueError exception will not be caught at all. The correct code specifies a tuple of exceptions:

try:
    ...
except (TypeError, ValueError):
    ...

This error happens because the use of the comma here is ambiguous: does it indicate two different nodes in the parse tree, or a single node that’s a tuple?

Python 3.0 makes this unambiguous by replacing the comma with the word “as”. To catch an exception and store the exception object in the variable exc, you must write:

try:
    ...
except TypeError as exc:
    ...

Python 3.0 will only support the use of “as”, and therefore interprets the first example as catching two different exceptions. Python 2.6 supports both the comma and “as”, so existing code will continue to work. We therefore suggest using “as” when writing new Python code that will only be executed with 2.6.

See also

PEP 3110 - Catching Exceptions in Python 3000

PEP written and implemented by Collin Winter.

PEP 3112: Byte Literals

Python 3.0 adopts Unicode as the language’s fundamental string type and denotes 8-bit literals differently, either as b'string' or using a bytes constructor. For future compatibility, Python 2.6 adds bytes as a synonym for the str type, and it also supports the b'' notation.

The 2.6 str differs from 3.0’s bytes type in various ways; most notably, the constructor is completely different. In 3.0,bytes([65, 66, 67]) is 3 elements long, containing the bytes representing ABC; in 2.6, bytes([65, 66, 67]) returns the 12-byte string representing the str() of the list.

The primary use of bytes in 2.6 will be to write tests of object type such as isinstance(x, bytes). This will help the 2to3 converter, which can’t tell whether 2.x code intends strings to contain either characters or 8-bit bytes; you can now use either bytes or str to represent your intention exactly, and the resulting code will also be correct in Python 3.0.
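
A short sketch of what this looks like in 2.6:

# In 2.6, bytes is simply another name for str, and b'...' literals are accepted.
data = b'abc'
print type(data)                # <type 'str'>
print isinstance(data, bytes)   # True
print bytes is str              # True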

There’s also a __future__ import that causes all string literals to become Unicode strings. This means that \u escape sequences can be used to include Unicode characters:

from __future__ import unicode_literals

s = ('\u751f\u3080\u304e\u3000\u751f\u3054'
     '\u3081\u3000\u751f\u305f\u307e\u3054')

print len(s)    # 12 Unicode characters

At the C level, Python 3.0 will rename the existing 8-bit string type, called PyStringObject in Python 2.x, to PyBytesObject. Python 2.6 uses #define to support using the names PyBytesObject(), PyBytes_Check(), PyBytes_FromStringAndSize(), and all the other functions and macros used with strings.

Instances of the bytes type are immutable just as strings are. A new bytearray type stores a mutable sequence of bytes:

>>> bytearray([65, 66, 67])
bytearray(b'ABC')
>>> b = bytearray(u'\u21ef\u3244', 'utf-8')
>>> b
bytearray(b'\xe2\x87\xaf\xe3\x89\x84')
>>> b[0] = '\xe3'
>>> b
bytearray(b'\xe3\x87\xaf\xe3\x89\x84')
>>> unicode(str(b), 'utf-8')
u'\u31ef\u3244'

Byte arrays support most of the methods of string types, such as startswith()/endswith(), find()/rfind(), and some of the methods of lists, such as append(), pop(), and reverse().

>>> b = bytearray('ABC')
>>> b.append('d')
>>> b.append(ord('e'))
>>> b
bytearray(b'ABCde')

There’s also a corresponding C API, with PyByteArray_FromObject(), PyByteArray_FromStringAndSize(), and various other functions.

See also

PEP 3112 - Bytes literals in Python 3000

PEP written by Jason Orendorff; backported to 2.6 by Christian Heimes.

PEP 3116: New I/O Library

Python’s built-in file objects support a number of methods, but file-like objects don’t necessarily support all of them. Objects that imitate files usually support read() and write(), but they may not support readline(), for example. Python 3.0 introduces a layered I/O library in the io module that separates buffering and text-handling features from the fundamental read and write operations.

There are three levels of abstract base classes provided by the io module:

RawIOBase: the bottom layer, defining raw byte-oriented operations such as reading and writing bytes, seeking, and closing; FileIO is the concrete class for files.
BufferedIOBase: adds buffering on top of a raw stream; the concrete classes include BufferedReader, BufferedWriter, BufferedRandom, and the in-memory BytesIO.
TextIOBase: deals with streams of text, handling encoding, decoding, and newline translation; TextIOWrapper and the in-memory StringIO are the concrete classes.

In Python 2.6, the underlying implementations haven’t been restructured to build on top of the io module’s classes. The module is being provided to make it easier to write code that’s forward-compatible with 3.0, and to save developers the effort of writing their own implementations of buffering and text I/O.
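
Here’s a small sketch of forward-compatible text I/O using the 2.6 io module (the file name is a placeholder):

import io

# Text files opened through io.open() work with unicode strings,
# as files will in Python 3.0.
with io.open('example.txt', 'w', encoding='utf-8') as f:
    f.write(u'written through the layered I/O classes\n')

with io.open('example.txt', 'r', encoding='utf-8') as f:
    print f.read()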

See also

PEP 3116 - New I/O

PEP written by Daniel Stutzbach, Mike Verdone, and Guido van Rossum. Code by Guido van Rossum, Georg Brandl, Walter Doerwald, Jeremy Hylton, Martin von Löwis, Tony Lownds, and others.

PEP 3118: Revised Buffer Protocol

The buffer protocol is a C-level API that lets Python types exchange pointers into their internal representations. A memory-mapped file can be viewed as a buffer of characters, for example, and this lets another module such as re treat memory-mapped files as a string of characters to be searched.

The primary users of the buffer protocol are numeric-processing packages such as NumPy, which expose the internal representation of arrays so that callers can write data directly into an array instead of going through a slower API. This PEP updates the buffer protocol in light of experience from NumPy development, adding a number of new features such as indicating the shape of an array or locking a memory region.

The most important new C API function is PyObject_GetBuffer(PyObject *obj, Py_buffer *view, int flags), which takes an object and a set of flags, and fills in the Py_buffer structure with information about the object’s memory representation. Objects can use this operation to lock memory in place while an external caller could be modifying the contents, so there’s a corresponding PyBuffer_Release(Py_buffer *view) to indicate that the external caller is done.

The flags argument to PyObject_GetBuffer() specifies constraints upon the memory returned. Some examples are:

Two new argument codes for PyArg_ParseTuple(), s* and z*, return locked buffer objects for a parameter.

See also

PEP 3118 - Revising the buffer protocol

PEP written by Travis Oliphant and Carl Banks; implemented by Travis Oliphant.

PEP 3119: Abstract Base Classes

Some object-oriented languages such as Java support interfaces, declaring that a class has a given set of methods or supports a given access protocol. Abstract Base Classes (or ABCs) are an equivalent feature for Python. The ABC support consists of an abc module containing a metaclass called ABCMeta, special handling of this metaclass by the isinstance() and issubclass() builtins, and a collection of basic ABCs that the Python developers think will be widely useful. Future versions of Python will probably add more ABCs.

Let’s say you have a particular class and wish to know whether it supports dictionary-style access. The phrase “dictionary-style” is vague, however. It probably means that accessing items with obj[1] works. Does it imply that setting items with obj[2] = value works? Or that the object will have keys(), values(), and items() methods? What about the iterative variants such as iterkeys()? copy() and update()? Iterating over the object with iter()?

The Python 2.6 collections module includes a number of different ABCs that represent these distinctions. Iterable indicates that a class defines __iter__(), and Container means the class defines a __contains__() method and therefore supports x in y expressions. The basic dictionary interface of getting items, setting items, and keys(), values(), and items(), is defined by the MutableMapping ABC.

You can derive your own classes from a particular ABC to indicate they support that ABC’s interface:

import collections

class Storage(collections.MutableMapping):
    ...

Alternatively, you could write the class without deriving from the desired ABC and instead register the class by calling the ABC’s register() method:

import collections

class Storage:
    ...

collections.MutableMapping.register(Storage)

For classes that you write, deriving from the ABC is probably clearer. The register() method is useful when you’ve written a new ABC that can describe an existing type or class, or if you want to declare that some third-party class implements an ABC. For example, if you defined a PrintableType ABC, it’s legal to do:

# Register Python's types
PrintableType.register(int)
PrintableType.register(float)
PrintableType.register(str)

Classes should obey the semantics specified by an ABC, but Python can’t check this; it’s up to the class author to understand the ABC’s requirements and to implement the code accordingly.

To check whether an object supports a particular interface, you can now write:

def func(d):
    if not isinstance(d, collections.MutableMapping):
        raise ValueError("Mapping object expected, not %r" % d)

Don’t feel that you must now begin writing lots of checks as in the above example. Python has a strong tradition of duck-typing, where explicit type-checking is never done and code simply calls methods on an object, trusting that those methods will be there and raising an exception if they aren’t. Be judicious in checking for ABCs and only do it where it’s absolutely necessary.

You can write your own ABCs by using abc.ABCMeta as the metaclass in a class definition:

from abc import ABCMeta, abstractmethod

class Drawable():
    __metaclass__ = ABCMeta

@abstractmethod
def draw(self, x, y, scale=1.0):
    pass

def draw_doubled(self, x, y):
    self.draw(x, y, scale=2.0)

class Square(Drawable):
    def draw(self, x, y, scale):
        ...

In the Drawable ABC above, the draw_doubled() method renders the object at twice its size and can be implemented in terms of other methods described in Drawable. Classes implementing this ABC therefore don’t need to provide their own implementation of draw_doubled(), though they can do so. An implementation of draw() is necessary, though; the ABC can’t provide a useful generic implementation.

You can apply the @abstractmethod decorator to methods such as draw() that must be implemented; Python will then raise an exception for classes that don’t define the method. Note that the exception is only raised when you actually try to create an instance of a subclass lacking the method:

>>> class Circle(Drawable):
...     pass
...
>>> c = Circle()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Can't instantiate abstract class Circle with abstract methods draw

Abstract data attributes can be declared using the @abstractproperty decorator:

from abc import abstractproperty
...

@abstractproperty
def readonly(self):
    return self._x

Subclasses must then define a readonly property.

See also

PEP 3119 - Introducing Abstract Base Classes

PEP written by Guido van Rossum and Talin. Implemented by Guido van Rossum. Backported to 2.6 by Benjamin Aranguren, with Alex Martelli.

PEP 3127: Integer Literal Support and Syntax

Python 3.0 changes the syntax for octal (base-8) integer literals, prefixing them with “0o” or “0O” instead of a leading zero, and adds support for binary (base-2) integer literals, signalled by a “0b” or “0B” prefix.

Python 2.6 doesn’t drop support for a leading 0 signalling an octal number, but it does add support for “0o” and “0b”:

>>> 0o21, 2*8 + 1
(17, 17)
>>> 0b101111
47

The oct() builtin still returns numbers prefixed with a leading zero, and a new bin() builtin returns the binary representation for a number:

>>> oct(42)
'052'
>>> future_builtins.oct(42)
'0o52'
>>> bin(173)
'0b10101101'

The int() and long() builtins will now accept the “0o” and “0b” prefixes when base-8 or base-2 are requested, or when the base argument is zero (signalling that the base used should be determined from the string):

>>> int('0o52', 0)
42
>>> int('1101', 2)
13
>>> int('0b1101', 2)
13
>>> int('0b1101', 0)
13

See also

PEP 3127 - Integer Literal Support and Syntax

PEP written by Patrick Maupin; backported to 2.6 by Eric Smith.

PEP 3129: Class Decorators

Decorators have been extended from functions to classes. It’s now legal to write:

@foo
@bar
class A:
    pass

This is equivalent to:

class A:
    pass

A = foo(bar(A))

See also

PEP 3129 - Class Decorators

PEP written by Collin Winter.

PEP 3141: A Type Hierarchy for Numbers

Python 3.0 adds several abstract base classes for numeric types inspired by Scheme’s numeric tower. These classes were backported to 2.6 as the numbers module.

The most general ABC is Number. It defines no operations at all, and only exists to allow checking if an object is a number by doing isinstance(obj, Number).

Complex is a subclass of Number. Complex numbers can undergo the basic operations of addition, subtraction, multiplication, division, and exponentiation, and you can retrieve the real and imaginary parts and obtain a number’s conjugate. Python’s built-in complex type is an implementation of Complex.

Real further derives from Complex, and adds operations that only work on real numbers: floor(), trunc(), rounding, taking the remainder mod N, floor division, and comparisons.

Rational numbers derive from Real, have numerator and denominator properties, and can be converted to floats. Python 2.6 adds a simple rational-number class, Fraction, in the fractions module. (It’s called Fraction instead of Rational to avoid a name clash with numbers.Rational.)

Integral numbers derive from Rational, and can be shifted left and right with << and >>, combined using bitwise operations such as & and |, and can be used as array indexes and slice boundaries.
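
A short interactive sketch of checking values against the tower:

>>> import numbers
>>> from fractions import Fraction
>>> isinstance(3, numbers.Integral)
True
>>> isinstance(3.0, numbers.Integral)
False
>>> isinstance(3.0, numbers.Real)
True
>>> isinstance(Fraction(2, 3), numbers.Rational)
True
>>> isinstance(1j, numbers.Complex)
True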

In Python 3.0, the PEP slightly redefines the existing builtins round(), math.floor(), math.ceil(), and adds a new one, math.trunc(), that’s been backported to Python 2.6. math.trunc() rounds toward zero, returning the closest Integral that’s between the function’s argument and zero.

The fractions Module

To fill out the hierarchy of numeric types, the fractions module provides a rational-number class. Rational numbers store their values as a numerator and denominator forming a fraction, and can exactly represent numbers such as 2/3 that floating-point numbers can only approximate.

The Fraction constructor takes two Integral values that will be the numerator and denominator of the resulting fraction.

>>> from fractions import Fraction
>>> a = Fraction(2, 3)
>>> b = Fraction(2, 5)
>>> float(a), float(b)
(0.66666666666666663, 0.40000000000000002)
>>> a + b
Fraction(16, 15)
>>> a / b
Fraction(5, 3)

For converting floating-point numbers to rationals, the float type now has an as_integer_ratio() method that returns the numerator and denominator for a fraction that evaluates to the same floating-point value:

>>> (2.5).as_integer_ratio()
(5, 2)
>>> (3.1415).as_integer_ratio()
(7074029114692207L, 2251799813685248L)
>>> (1./3).as_integer_ratio()
(6004799503160661L, 18014398509481984L)

Note that values that can only be approximated by floating-point numbers, such as 1./3, are not simplified to the number being approximated; the fraction attempts to match the floating-point value exactly.

The fractions module is based upon an implementation by Sjoerd Mullender that was in Python’s Demo/classes/ directory for a long time. This implementation was significantly updated by Jeffrey Yasskin.

Other Language Changes

Some smaller changes made to the core Python language are:

Properties now have three attributes, getter, setter and deleter, that are decorators providing useful shortcuts for adding a getter, setter or deleter function to an existing property. For example, a subclass can override an inherited property’s accessor methods:

class D(C):
    @C.x.getter
    def x(self):
        return self._x * 2

    @x.setter
    def x(self, value):
        self._x = value / 2

Optimizations

Interpreter Changes

Two command-line options have been reserved for use by other Python implementations. The -J switch has been reserved for use by Jython for Jython-specific options, such as switches that are passed to the underlying JVM. -X has been reserved for options specific to a particular implementation of Python such as CPython, Jython, or IronPython. If either option is used with Python 2.6, the interpreter will report that the option isn’t currently used.

Python can now be prevented from writing .pyc or .pyo files by supplying the -B switch to the Python interpreter, or by setting the PYTHONDONTWRITEBYTECODE environment variable before running the interpreter. This setting is available to Python programs as the sys.dont_write_bytecode variable, and Python code can change the value to modify the interpreter’s behaviour. (Contributed by Neal Norwitz and Georg Brandl.)
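
For example, a program can inspect or flip the setting at runtime (a minimal sketch):

import sys

# True if -B was given or PYTHONDONTWRITEBYTECODE is set.
print sys.dont_write_bytecode

# Modules imported after this point will not have .pyc/.pyo files written.
sys.dont_write_bytecode = True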

The encoding used for standard input, output, and standard error can be specified by setting the PYTHONIOENCODING environment variable before running the interpreter. The value should be a string in the form <encoding> or <encoding>:<errorhandler>. The encoding part specifies the encoding’s name, e.g. utf-8 orlatin-1; the optional errorhandler part specifies what to do with characters that can’t be handled by the encoding, and should be one of “error”, “ignore”, or “replace”. (Contributed by Martin von Löwis.)

New and Improved Modules

As in every release, Python’s standard library received a number of enhancements and bug fixes. Here’s a partial list of the most notable changes, sorted alphabetically by module name. Consult the Misc/NEWS file in the source tree for a more complete list of changes, or look through the Subversion logs for all the details.

# Boldface text starting at y=0, x=21
# and affecting the rest of the line.
stdscr.chgat(0, 21, curses.A_BOLD)

The Textbox class in the curses.textpad module now supports editing in insert mode as well as overwrite mode. Insert mode is enabled by supplying a true value for the insert_mode parameter when creating the Textbox instance.

...
(Contributed by Paul Moore; bpo-2439.)

(Contributed by Tarek Ziadé; bpo-2663.)

The new encoding and errors parameters specify an encoding and an error handling scheme for character conversions. 'strict', 'ignore', and 'replace' are the three standard ways Python can handle errors; 'utf-8' is a special value that replaces bad characters with their UTF-8 representation. (Character conversions occur because the PAX format supports Unicode filenames, defaulting to UTF-8 encoding.)
The TarFile.add() method now accepts an exclude argument that’s a function that can be used to exclude certain filenames from an archive. The function must take a filename and return true if the file should be excluded or false if it should be archived. The function is applied to both the name initially passed to add() and to the names of files in recursively added directories.
(All changes contributed by Lars Gustäbel).

The textwrap module’s fill() and wrap() functions now accept a drop_whitespace keyword argument; supplying a false value preserves whitespace that would otherwise be stripped:

>>> print textwrap.fill(S, drop_whitespace=False, width=15)
This sentence
has a bunch
of extra
whitespace.

(Contributed by Dwayne Bailey; bpo-1581073.)

Traceback (most recent call last):
...
urllib2.URLError:

(Added by Facundo Batista.)

# Unpack a single file, writing it relative
# to the /tmp directory.
z.extract('Python/sysmodule.c', '/tmp')

# Unpack all the files in the archive.
z.extractall()
(Contributed by Alan McIntyre; bpo-467924.)
The open(), read() and extract() methods can now take either a filename or a ZipInfo object. This is useful when an archive accidentally contains a duplicated filename. (Contributed by Graham Horler; bpo-1775025.)
Finally, zipfile now supports using Unicode filenames for archived files. (Contributed by Alexey Borzenkov; bpo-1734346.)

The ast module

The ast module provides an Abstract Syntax Tree representation of Python code, and Armin Ronacher contributed a set of helper functions that perform a variety of common tasks. These will be useful for HTML templating packages, code analyzers, and similar tools that process Python code.

The parse() function takes an expression and returns an AST. The dump() function outputs a representation of a tree, suitable for debugging:

import ast

t = ast.parse("""
d = {}
for i in 'abcdefghijklm':
    d[i + i] = ord(i) - ord('a') + 1
print d
""")
print ast.dump(t)

This outputs a deeply nested tree:

Module(body=[
  Assign(targets=[
    Name(id='d', ctx=Store())
   ], value=Dict(keys=[], values=[]))
  For(target=Name(id='i', ctx=Store()),
      iter=Str(s='abcdefghijklm'), body=[
    Assign(targets=[
      Subscript(value=
        Name(id='d', ctx=Load()),
        slice=
        Index(value=
          BinOp(left=Name(id='i', ctx=Load()), op=Add(),
                right=Name(id='i', ctx=Load()))), ctx=Store())
     ], value=
     BinOp(left=
      BinOp(left=
       Call(func=
        Name(id='ord', ctx=Load()), args=[
         Name(id='i', ctx=Load())
        ], keywords=[], starargs=None, kwargs=None),
       op=Sub(), right=Call(func=
        Name(id='ord', ctx=Load()), args=[
         Str(s='a')
        ], keywords=[], starargs=None, kwargs=None)),
      op=Add(), right=Num(n=1)))
   ], orelse=[])
  Print(dest=None, values=[
    Name(id='d', ctx=Load())
   ], nl=True)
 ])

The literal_eval() method takes a string or an AST representing a literal expression, parses and evaluates it, and returns the resulting value. A literal expression is a Python expression containing only strings, numbers, dictionaries, etc. but no statements or function calls. If you need to evaluate an expression but cannot accept the security risk of using an eval() call, literal_eval() will handle it safely:

>>> literal = '("a", "b", {2:4, 3:8, 1:2})'
>>> print ast.literal_eval(literal)
('a', 'b', {1: 2, 2: 4, 3: 8})
>>> print ast.literal_eval('"a" + "b"')
Traceback (most recent call last):
  ...
ValueError: malformed string

The module also includes NodeVisitor andNodeTransformer classes for traversing and modifying an AST, and functions for common transformations such as changing line numbers.
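
For example, here’s a minimal sketch of a NodeVisitor subclass that prints every name used in a piece of code:

import ast

class NameLister(ast.NodeVisitor):
    "Print every Name node encountered in the tree."
    def visit_Name(self, node):
        print node.id
        self.generic_visit(node)

tree = ast.parse("d[i + i] = ord(i) - ord('a') + 1")
NameLister().visit(tree)     # prints d, i, i, ord, i, ord (one per line)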

The future_builtins module

Python 3.0 makes many changes to the repertoire of built-in functions, and most of the changes can’t be introduced in the Python 2.x series because they would break compatibility. The future_builtins module provides versions of these built-in functions that can be imported when writing 3.0-compatible code.

The functions in this module currently include ascii(), filter(), hex(), map(), oct(), and zip(). For example, hex() and oct() return the new-style '0x'/'0o' strings without a trailing 'L' for longs, and filter(), map(), and zip() return iterators, matching their Python 3.0 behaviour.
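
A short interactive sketch of the difference in behaviour (the object address shown will of course vary):

>>> import future_builtins
>>> hex(42L)                                # 2.6 built-in keeps the trailing 'L'
'0x2aL'
>>> future_builtins.hex(42L)                # 3.0-style result
'0x2a'
>>> future_builtins.map(abs, [-1, -2, 3])   # an iterator, not a list
<itertools.imap object at 0x...>
>>> list(future_builtins.map(abs, [-1, -2, 3]))
[1, 2, 3]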

The json module: JavaScript Object Notation

The new json module supports the encoding and decoding of Python types in JSON (Javascript Object Notation). JSON is a lightweight interchange format often used in web applications. For more information about JSON, see http://www.json.org.

json comes with support for decoding and encoding most built-in Python types. The following example encodes and decodes a dictionary:

>>> import json
>>> data = {"spam": "foo", "parrot": 42}
>>> in_json = json.dumps(data)   # Encode the data
>>> in_json
'{"parrot": 42, "spam": "foo"}'
>>> json.loads(in_json)          # Decode into a Python object
{"spam": "foo", "parrot": 42}

It’s also possible to write your own decoders and encoders to support more types. Pretty-printing of the JSON strings is also supported.
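
For example, here’s a minimal sketch of handling an extra type by passing a default function to json.dumps() (the datetime handling is purely illustrative, not part of the module):

import json
import datetime

def encode_custom(obj):
    # Called by json.dumps() for objects it doesn't know how to serialize.
    if isinstance(obj, datetime.datetime):
        return obj.isoformat()
    raise TypeError('%r is not JSON serializable' % (obj,))

print json.dumps({'when': datetime.datetime(2008, 10, 1)}, default=encode_custom)
# {"when": "2008-10-01T00:00:00"}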

json (originally called simplejson) was written by Bob Ippolito.

The plistlib module: A Property-List Parser

The .plist format is commonly used on Mac OS X to store basic data types (numbers, strings, lists, and dictionaries) by serializing them into an XML-based format. It resembles the XML-RPC serialization of data types.

Despite being primarily used on Mac OS X, the format has nothing Mac-specific about it and the Python implementation works on any platform that Python supports, so the plistlib module has been promoted to the standard library.

Using the module is simple:

import sys
import plistlib
import datetime

# Create data structure
data_struct = dict(lastAccessed=datetime.datetime.now(),
                   version=1,
                   categories=('Personal', 'Shared', 'Private'))

# Create string containing XML.
plist_str = plistlib.writePlistToString(data_struct)
new_struct = plistlib.readPlistFromString(plist_str)
print data_struct
print new_struct

# Write data structure to a file and read it back.
plistlib.writePlist(data_struct, '/tmp/customizations.plist')
new_struct = plistlib.readPlist('/tmp/customizations.plist')

# read/writePlist accepts file-like objects as well as paths.
plistlib.writePlist(data_struct, sys.stdout)

ctypes Enhancements

Thomas Heller continued to maintain and enhance the ctypes module.

ctypes now supports a c_bool datatype that represents the C99 bool type. (Contributed by David Remahl; bpo-1649190.)

The ctypes string, buffer and array types have improved support for extended slicing syntax, where various combinations of (start, stop, step) are supplied. (Implemented by Thomas Wouters.)

All ctypes data types now support from_buffer() and from_buffer_copy() methods that create a ctypes instance based on a provided buffer object. from_buffer_copy() copies the contents of the object, while from_buffer() will share the same memory area.

A new calling convention tells ctypes to clear the errno or Win32 LastError variables at the outset of each wrapped call. (Implemented by Thomas Heller; bpo-1798.)

You can now retrieve the Unix errno variable after a function call. When creating a wrapped function, you can supply use_errno=True as a keyword parameter to the DLL() function and then call the module-level methods set_errno() and get_errno() to set and retrieve the error value.
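
Here’s a rough sketch of the errno support (Unix-specific; the library name and the call made are illustrative):

import ctypes

libc = ctypes.CDLL('libc.so.6', use_errno=True)

fd = libc.open('/does/not/exist', 0)
if fd == -1:
    err = ctypes.get_errno()
    print 'open() failed, errno =', err    # typically 2 (ENOENT)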

The Win32 LastError variable is similarly supported by the DLL(), OleDLL(), and WinDLL() functions. You supply use_last_error=True as a keyword parameter and then call the module-level methods set_last_error() and get_last_error().

The byref() function, used to retrieve a pointer to a ctypes instance, now has an optional offset parameter that is a byte count that will be added to the returned pointer.

Improved SSL Support

Bill Janssen made extensive improvements to Python 2.6’s support for the Secure Sockets Layer by adding a new module, ssl, that’s built atop the OpenSSL library. This new module provides more control over the protocol negotiated, the X.509 certificates used, and has better support for writing SSL servers (as opposed to clients) in Python. The existing SSL support in the socket module hasn’t been removed and continues to work, though it will be removed in Python 3.0.

To use the new module, you must first create a TCP connection in the usual way and then pass it to the ssl.wrap_socket() function. It’s possible to specify whether a certificate is required, and to obtain certificate info by calling the getpeercert() method.
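
A minimal client sketch (the host name and the CA bundle path are placeholders):

import socket
import ssl

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect(('www.example.com', 443))

# Wrap the connected socket, requiring a certificate from the server.
ssl_sock = ssl.wrap_socket(sock,
                           cert_reqs=ssl.CERT_REQUIRED,
                           ca_certs='/path/to/ca_bundle.pem')

print ssl_sock.getpeercert()
ssl_sock.close()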

See also

The documentation for the ssl module.

Deprecations and Removals

Build and C API Changes

Changes to Python’s build process and to the C API include:

Port-Specific Changes: Windows

Port-Specific Changes: Mac OS X

Port-Specific Changes: IRIX

A number of old IRIX-specific modules were deprecated and will be removed in Python 3.0: al and AL, cd, cddb, cdplayer, CL and cl, DEVICE, ERRNO, FILE, FL and fl, flp, fm, GET, GLWS, GL and gl, IN, IOCTL, jpeg, panelparser, readcd, SV and sv, torgb, videoreader, and WAIT.

Porting to Python 2.6

This section lists previously described changes and other bugfixes that may require changes to your code:

For applications that embed Python:

Acknowledgements

The author would like to thank the following people for offering suggestions, corrections and assistance with various drafts of this article: Georg Brandl, Steve Brown, Nick Coghlan, Ralph Corderoy, Jim Jewett, Kent Johnson, Chris Lambacher, Martin Michlmayr, Antoine Pitrou, Brian Warner.