What’s New in Python 2.5

Author:

A.M. Kuchling

This article explains the new features in Python 2.5. The final release was originally scheduled for August 2006 (PEP 356 describes the planned release schedule); Python 2.5 was actually released on September 19, 2006.

The changes in Python 2.5 are an interesting mix of language and library improvements. The library enhancements will be more important to Python’s user community, I think, because several widely useful packages were added. New modules include ElementTree for XML processing (xml.etree), the SQLite database module (sqlite3), and the ctypes module for calling C functions.

The language changes are of middling significance. Some pleasant new features were added, but most of them aren’t features that you’ll use every day. Conditional expressions were finally added to the language using a novel syntax; see section PEP 308: Conditional Expressions. The new ‘with’ statement will make writing cleanup code easier (section PEP 343: The ‘with’ statement). Values can now be passed into generators (section PEP 342: New Generator Features). Imports are now visible as either absolute or relative (section PEP 328: Absolute and Relative Imports). Some corner cases of exception handling are handled better (section PEP 341: Unified try/except/finally). All these improvements are worthwhile, but they’re improvements to one specific language feature or another; none of them are broad modifications to Python’s semantics.

As well as the language and library additions, other improvements and bugfixes were made throughout the source tree. A search through the SVN change logs finds there were 353 patches applied and 458 bugs fixed between Python 2.4 and 2.5. (Both figures are likely to be underestimates.)

This article doesn’t try to be a complete specification of the new features; instead changes are briefly introduced using helpful examples. For full details, you should always refer to the documentation for Python 2.5 at https://docs.python.org. If you want to understand the complete implementation and design rationale, refer to the PEP for a particular new feature.

Comments, suggestions, and error reports for this document are welcome; please e-mail them to the author or open a bug in the Python bug tracker.

PEP 308: Conditional Expressions

For a long time, people have been requesting a way to write conditional expressions, which are expressions that return value A or value B depending on whether a Boolean value is true or false. A conditional expression lets you write a single assignment statement that has the same effect as the following:

if condition:
    x = true_value
else:
    x = false_value

There have been endless tedious discussions of syntax on both python-dev and comp.lang.python. A vote was even held that found the majority of voters wanted conditional expressions in some form, but there was no syntax that was preferred by a clear majority. Candidates included C’s cond ? true_v : false_v, if cond then true_v else false_v, and 16 other variations.

Guido van Rossum eventually chose a surprising syntax:

x = true_value if condition else false_value

Evaluation is still lazy as in existing Boolean expressions, so the order of evaluation jumps around a bit. The condition expression in the middle is evaluated first, and the true_value expression is evaluated only if the condition was true. Similarly, the false_value expression is only evaluated when the condition is false.
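The evaluation order can be checked with a small sketch. The helper trace() and the list calls are hypothetical names introduced only for illustration; each call to trace() records which operand was actually evaluated.

```python
calls = []

def trace(label, value):
    # Record that this operand was evaluated, then return its value.
    calls.append(label)
    return value

# Condition is true: only the condition and the true branch are evaluated.
x = trace('true_value', 1) if trace('condition', True) else trace('false_value', 0)

# Condition is false: only the condition and the false branch are evaluated.
y = trace('true_value', 1) if trace('condition', False) else trace('false_value', 0)

print(calls)
```

In both cases the condition is recorded first, and the branch that was not selected never appears in calls.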

This syntax may seem strange and backwards; why does the condition go in the middle of the expression, and not in the front as in C’s c ? x : y? The decision was checked by applying the new syntax to the modules in the standard library and seeing how the resulting code read. In many cases where a conditional expression is used, one value seems to be the ‘common case’ and one value is an ‘exceptional case’, used only on rarer occasions when the condition isn’t met. The conditional syntax makes this pattern a bit more obvious:

contents = ((doc + '\n') if doc else '')

I read the above statement as meaning “here contents is usually assigned a value of doc+'\n'; sometimes doc is empty, in which special case an empty string is returned.” I doubt I will use conditional expressions very often where there isn’t a clear common and uncommon case.

There was some discussion of whether the language should require surrounding conditional expressions with parentheses. The decision was made to not require parentheses in the Python language’s grammar, but as a matter of style I think you should always use them. Consider these two statements:

# First version -- no parens
level = 1 if logging else 0

# Second version -- with parens
level = (1 if logging else 0)

In the first version, I think a reader’s eye might group the statement into ‘level = 1’, ‘if logging’, ‘else 0’, and think that the condition decides whether the assignment to level is performed. The second version reads better, in my opinion, because it makes it clear that the assignment is always performed and the choice is being made between two values.

Another reason for including the brackets: a few odd combinations of list comprehensions and lambdas could look like incorrect conditional expressions. See PEP 308 for some examples. If you put parentheses around your conditional expressions, you won’t run into this case.

See also

PEP 308 - Conditional Expressions

PEP written by Guido van Rossum and Raymond D. Hettinger; implemented by Thomas Wouters.

PEP 309: Partial Function Application

The functools module is intended to contain tools for functional-style programming.

One useful tool in this module is the partial() function. For programs written in a functional style, you’ll sometimes want to construct variants of existing functions that have some of the parameters filled in. Consider a Python function f(a, b, c); you could create a new function g(b, c) that was equivalent to f(1, b, c). This is called “partial function application”.

partial() takes the arguments (function, arg1, arg2, ... kwarg1=value1, kwarg2=value2). The resulting object is callable, so you can just call it to invoke function with the filled-in arguments.

Here’s a small but realistic example:

import functools

def log(message, subsystem):
    "Write the contents of 'message' to the specified subsystem."
    print '%s: %s' % (subsystem, message)
    ...

server_log = functools.partial(log, subsystem='server')
server_log('Unable to open socket')
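The f(a, b, c) case described earlier can be sketched in the same way. The function f below is a hypothetical stand-in that simply returns its arguments, so the effect of the filled-in parameters is visible.

```python
import functools

def f(a, b, c):
    # Hypothetical three-argument function used only for illustration.
    return (a, b, c)

# g(b, c) behaves like f(1, b, c): the first positional argument is fixed.
g = functools.partial(f, 1)

# Keyword arguments can be filled in the same way.
h = functools.partial(f, c=30)

print(g(2, 3))
print(h(10, 20))

# The partial object remembers the original function and fixed arguments.
print(g.func is f)
print(g.args)
```

Positional arguments supplied to partial() are placed before the arguments given at call time, which is why g(2, 3) ends up calling f(1, 2, 3).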

Here’s another example, from a program that uses PyGTK. Here a context-sensitive pop-up menu is being constructed dynamically. The callback provided for the menu option is a partially applied version of the open_item() method, where the first argument has been provided.

...
class Application:
    def open_item(self, path):
        ...
    def init (self):
        open_func = functools.partial(self.open_item, item_path)
        popup_menu.append( ("Open", open_func, 1) )

Another function in the functools module is the update_wrapper(wrapper, wrapped) function that helps you write well-behaved decorators. update_wrapper() copies the name, module, and docstring attribute to a wrapper function so that tracebacks inside the wrapped function are easier to understand. For example, you might write:

def my_decorator(f):
    def wrapper(*args, **kwds):
        print 'Calling decorated function'
        return f(*args, **kwds)
    functools.update_wrapper(wrapper, f)
    return wrapper

wraps() is a decorator that can be used inside your own decorators to copy the wrapped function’s information. An alternate version of the previous example would be:

def my_decorator(f):
    @functools.wraps(f)
    def wrapper(*args, **kwds):
        print 'Calling decorated function'
        return f(*args, **kwds)
    return wrapper
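To see what wraps() actually copies, the following sketch compares a decorator that uses it with one that doesn’t. The names plain_decorator, greet, and greet_plain are hypothetical and exist only for this comparison.

```python
import functools

def my_decorator(f):
    @functools.wraps(f)
    def wrapper(*args, **kwds):
        return f(*args, **kwds)
    return wrapper

def plain_decorator(f):
    # Same wrapper, but the wrapped function's metadata is not copied.
    def wrapper(*args, **kwds):
        return f(*args, **kwds)
    return wrapper

@my_decorator
def greet(name):
    "Return a greeting for 'name'."
    return 'Hello, ' + name

@plain_decorator
def greet_plain(name):
    "Return a greeting for 'name'."
    return 'Hello, ' + name

print(greet.__name__)        # name of the original function
print(greet_plain.__name__)  # name of the inner wrapper
print(greet.__doc__)
print(greet_plain.__doc__)
```

Both decorated functions behave identically when called; only the name and docstring seen by tracebacks and help tools differ.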

See also

PEP 309 - Partial Function Application

PEP proposed and written by Peter Harris; implemented by Hye-Shik Chang and Nick Coghlan, with adaptations by Raymond Hettinger.

PEP 314: Metadata for Python Software Packages v1.1

Some simple dependency support was added to Distutils. The setup() function now has requires, provides, and obsoletes keyword parameters. When you build a source distribution using the sdist command, the dependency information will be recorded in the PKG-INFO file.

Another new keyword parameter is download_url, which should be set to a URL for the package’s source code. This means it’s now possible to look up an entry in the package index, determine the dependencies for a package, and download the required packages.

VERSION = '1.0'
setup(name='PyPackage',
      version=VERSION,
      requires=['numarray', 'zlib (>=1.1.4)'],
      obsoletes=['OldPackage'],
      download_url=('http://www.example.com/pypackage/dist/pkg-%s.tar.gz'
                    % VERSION),
     )

Another new enhancement to the Python package index at https://pypi.org is storing source and binary archives for a package. The new upload Distutils command will upload a package to the repository.

Before a package can be uploaded, you must be able to build a distribution using the sdist Distutils command. Once that works, you can run python setup.py upload to add your package to the PyPI archive. Optionally you can GPG-sign the package by supplying the --sign and --identity options.

Package uploading was implemented by Martin von Löwis and Richard Jones.

See also

PEP 314 - Metadata for Python Software Packages v1.1

PEP proposed and written by A.M. Kuchling, Richard Jones, and Fred Drake; implemented by Richard Jones and Fred Drake.

PEP 328: Absolute and Relative Imports

The simpler part of PEP 328 was implemented in Python 2.4: parentheses could now be used to enclose the names imported from a module using the from ... import ... statement, making it easier to import many different names.

The more complicated part has been implemented in Python 2.5: importing a module can be specified to use absolute or package-relative imports. The plan is to move toward making absolute imports the default in future versions of Python.

Let’s say you have a package directory like this:

pkg/
pkg/__init__.py
pkg/main.py
pkg/string.py

This defines a package named pkg containing the pkg.main and pkg.string submodules.

Consider the code in the main.py module. What happens if it executes the statement import string? In Python 2.4 and earlier, it will first look in the package’s directory to perform a relative import, find pkg/string.py, import the contents of that file as the pkg.string module, and bind that module to the name string in the pkg.main module’s namespace.

That’s fine if pkg.string was what you wanted. But what if you wanted Python’s standard string module? There’s no clean way to ignore pkg.string and look for the standard module; generally you had to look at the contents of sys.modules, which is slightly unclean. Holger Krekel’s py.std package provides a tidier way to perform imports from the standard library, import py; py.std.string.join(), but that package isn’t available on all Python installations.

Reading code which relies on relative imports is also less clear, because a reader may be confused about which module, string or pkg.string, is intended to be used. Python users soon learned not to duplicate the names of standard library modules in the names of their packages’ submodules, but you can’t protect against having your submodule’s name being used for a new module added in a future version of Python.

In Python 2.5, you can switch import’s behaviour to absolute imports using a from __future__ import absolute_import directive. This absolute-import behaviour will become the default in a future version (probably Python 2.7). Once absolute imports are the default, import string will always find the standard library’s version. It’s suggested that users should begin using absolute imports as much as possible, so it’s preferable to begin writing from pkg import string in your code.

Relative imports are still possible by adding a leading period to the module name when using the from ... import form:

# Import names from pkg.string
from .string import name1, name2

# Import pkg.string
from . import string

This imports the string module relative to the current package, so in pkg.main this will import name1 and name2 from pkg.string. Additional leading periods perform the relative import starting from the parent of the current package. For example, code in the A.B.C module can do:

from . import D                 # Imports A.B.D
from .. import E                # Imports A.E
from ..F import G               # Imports A.F.G

Leading periods cannot be used with the import modname form of the import statement, only the from ... import form.

See also

PEP 328 - Imports: Multi-Line and Absolute/Relative

PEP written by Aahz; implemented by Thomas Wouters.

https://pylib.readthedocs.io/

The py library by Holger Krekel, which contains the py.std package.

PEP 338: Executing Modules as Scripts

The -m switch added in Python 2.4 to execute a module as a script gained a few more abilities. Instead of being implemented in C code inside the Python interpreter, the switch now uses an implementation in a new module, runpy.

The runpy module implements a more sophisticated import mechanism so that it’s now possible to run modules in a package such as pychecker.checker. The module also supports alternative import mechanisms such as the zipimport module. This means you can add a .zip archive’s path to sys.path and then use the -m switch to execute code from the archive.
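The archive case can be sketched from Python code using runpy.run_module(), which is the function behind the -m switch. The archive name demo_archive.zip and the module name zipped_demo are hypothetical; the sketch builds a throwaway archive in a temporary directory rather than assuming one exists.

```python
import os
import runpy
import sys
import tempfile
import zipfile

# Build a throwaway .zip archive containing a single module.
# 'demo_archive.zip' and 'zipped_demo' are hypothetical names.
tmp_dir = tempfile.mkdtemp()
archive_path = os.path.join(tmp_dir, 'demo_archive.zip')
archive = zipfile.ZipFile(archive_path, 'w')
archive.writestr('zipped_demo.py', 'ANSWER = 6 * 7\nRUN_AS = __name__\n')
archive.close()

# Make the archive importable, then run the module as a script would be run.
sys.path.insert(0, archive_path)
try:
    result = runpy.run_module('zipped_demo', run_name='__main__')
finally:
    sys.path.remove(archive_path)

# run_module() returns the globals of the executed module.
print(result['ANSWER'])
print(result['RUN_AS'])
```

Passing run_name='__main__' mirrors what the -m switch does, so code guarded by an if __name__ == '__main__' test inside the module would run.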

See also

PEP 338 - Executing modules as scripts

PEP written and implemented by Nick Coghlan.

PEP 341: Unified try/except/finally

Until Python 2.5, the try statement came in two flavours. You could use a finally block to ensure that code is always executed, or one or more except blocks to catch specific exceptions. You couldn’t combine both except blocks and a finally block, because generating the right bytecode for the combined version was complicated and it wasn’t clear what the semantics of the combined statement should be.

Guido van Rossum spent some time working with Java, which does support the equivalent of combining except blocks and a finally block, and this clarified what the statement should mean. In Python 2.5, you can now write:

try:
    block-1 ...
except Exception1:
    handler-1 ...
except Exception2:
    handler-2 ...
else:
    else-block
finally:
    final-block

The code in block-1 is executed. If the code raises an exception, the various except blocks are tested: if the exception is of class Exception1, handler-1 is executed; otherwise if it’s of class Exception2, handler-2 is executed, and so forth. If no exception is raised, the else-block is executed.

No matter what happened previously, the final-block is executed once the code block is complete and any raised exceptions handled. Even if there’s an error in an exception handler or the else-block and a new exception is raised, the code in the final-block is still run.
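The order in which the blocks run can be traced with a short sketch. The function run() and the three small helper functions are hypothetical; ZeroDivisionError and KeyError stand in for Exception1 and Exception2.

```python
def run(action):
    # Record which parts of the combined statement are executed.
    steps = []
    try:
        steps.append('block-1')
        action()
    except ZeroDivisionError:
        steps.append('handler-1')
    except KeyError:
        steps.append('handler-2')
    else:
        steps.append('else-block')
    finally:
        steps.append('final-block')
    return steps

def no_error():
    pass

def divide_by_zero():
    return 1 / 0

def missing_key():
    return {}['absent']

print(run(no_error))
print(run(divide_by_zero))
print(run(missing_key))
```

In every case the last recorded step is final-block, whichever handler or else-block ran before it.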

See also

PEP 341 - Unifying try-except and try-finally

PEP written by Georg Brandl; implementation by Thomas Lee.

PEP 342: New Generator Features

Python 2.5 adds a simple way to pass values into a generator. As introduced in Python 2.3, generators only produce output; once a generator’s code was invoked to create an iterator, there was no way to pass any new information into the function when its execution is resumed. Sometimes the ability to pass in some information would be useful. Hackish solutions to this include making the generator’s code look at a global variable and then changing the global variable’s value, or passing in some mutable object that callers then modify.

To refresh your memory of basic generators, here’s a simple example:

def counter(maximum):
    i = 0
    while i < maximum:
        yield i
        i += 1

When you call counter(10), the result is an iterator that returns the values from 0 up to 9. On encountering the yield statement, the iterator returns the provided value and suspends the function’s execution, preserving the local variables. Execution resumes on the following call to the iterator’s next() method, picking up after the yield statement.

In Python 2.3, yield was a statement; it didn’t return any value. In 2.5, yield is now an expression, returning a value that can be assigned to a variable or otherwise operated on:

val = (yield i)

I recommend that you always put parentheses around a yield expression when you’re doing something with the returned value, as in the above example. The parentheses aren’t always necessary, but it’s easier to always add them instead of having to remember when they’re needed.

(PEP 342 explains the exact rules, which are that a yield-expression must always be parenthesized except when it occurs at the top-level expression on the right-hand side of an assignment. This means you can write val = yield i but have to use parentheses when there’s an operation, as in val = (yield i) + 12.)

Values are sent into a generator by calling its send(value) method. The generator’s code is then resumed and the yield expression returns the specified value. If the regular next() method is called, the yield returns None.

Here’s the previous example, modified to allow changing the value of the internal counter.

def counter(maximum):
    i = 0
    while i < maximum:
        val = (yield i)
        # If value provided, change counter
        if val is not None:
            i = val
        else:
            i += 1

And here’s an example of changing the counter:

>>> it = counter(10)
>>> print it.next()
0
>>> print it.next()
1
>>> print it.send(8)
8
>>> print it.next()
9
>>> print it.next()
Traceback (most recent call last):
  File "t.py", line 15, in ?
    print it.next()
StopIteration

yield will usually return None, so you should always check for this case. Don’t just use its value in expressions unless you’re sure that the send() method will be the only method used to resume your generator function.

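In addition to send(), there are two other new methods on generators: throw(type, value=None, traceback=None), which raises an exception inside the generator at the yield expression where its execution is paused, and close(), which raises a new GeneratorExit exception inside the generator to terminate the iteration.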
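The two methods in question are throw() and close(). A minimal sketch of both follows; the generator worker() and the list log are hypothetical names. The built-in next() is used to advance the generator so that the sketch also runs on current Python versions; on Python 2.5 itself you would call gen.next() instead.

```python
log = []

def worker():
    # Hypothetical generator that records what happens to it.
    try:
        while True:
            try:
                yield 'ready'
            except ValueError:
                # An exception passed to throw() surfaces at the paused yield.
                log.append('caught ValueError')
    finally:
        # Runs when close() raises GeneratorExit inside the generator.
        log.append('cleanup')

gen = worker()
first = next(gen)                                 # run to the first yield
after_throw = gen.throw(ValueError('bad input'))  # handled; loop yields again
gen.close()                                       # triggers the finally clause

print(first)
print(after_throw)
print(log)
```

Because the generator handles the ValueError and loops back to the yield, throw() returns the next yielded value instead of propagating the exception to the caller.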

The cumulative effect of these changes is to turn generators from one-way producers of information into both producers and consumers.

Generators also become coroutines, a more generalized form of subroutines. Subroutines are entered at one point and exited at another point (the top of the function, and a return statement), but coroutines can be entered, exited, and resumed at many different points (the yield statements). We’ll have to figure out patterns for using coroutines effectively in Python.
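One possible pattern, offered only as a sketch, is a coroutine that consumes values through send() and produces a result at each step. The generator averager() is a hypothetical example; as before, the built-in next() stands in for the 2.5-era avg.next() call.

```python
def averager():
    # Hypothetical coroutine: receives numbers via send() and yields
    # the running average of everything received so far.
    total = 0.0
    count = 0
    average = None
    while True:
        value = (yield average)
        if value is None:
            # Resumed by next() rather than send(); nothing to add.
            continue
        total += value
        count += 1
        average = total / count

avg = averager()
next(avg)              # prime the coroutine: run it to the first yield
print(avg.send(10))
print(avg.send(20))
print(avg.send(30))
```

The initial call that advances the generator to its first yield is required because a value can only be sent to a generator that is paused at a yield expression.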

The addition of the close() method has one side effect that isn’t obvious. close() is called when a generator is garbage-collected, so this means the generator’s code gets one last chance to run before the generator is destroyed. This last chance means that try...finally statements in generators can now be guaranteed to work; the finally clause will now always get a chance to run. The syntactic restriction that you couldn’t mix yield statements with a try...finally suite has therefore been removed. This seems like a minor bit of language trivia, but using generators and try...finally is actually necessary in order to implement the with statement described by PEP 343. I’ll look at this new statement in the following section.

Another even more esoteric effect of this change: previously, the gi_frame attribute of a generator was always a frame object. It’s now possible for gi_frame to be None once the generator has been exhausted.

PEP 343: The ‘with’ statement

The ‘with’ statement clarifies code that previously would use try...finally blocks to ensure that clean-up code is executed. In this section, I’ll discuss the statement as it will commonly be used. In the next section, I’ll examine the implementation details and show how to write objects for use with this statement.

The ‘with’ statement is a new control-flow structure whose basic structure is:

with expression [as variable]:
    with-block

The expression is evaluated, and it should result in an object that supports the context management protocol (that is, has __enter__() and __exit__() methods).

The object’s __enter__() is called before with-block is executed and therefore can run set-up code. It also may return a value that is bound to the name variable, if given. (Note carefully that variable is not assigned the result of expression.)

After execution of the with-block is finished, the object’s __exit__() method is called, even if the block raised an exception, and can therefore run clean-up code.

To enable the statement in Python 2.5, you need to add the following directive to your module:

from __future__ import with_statement

The statement will always be enabled in Python 2.6.

Some standard Python objects now support the context management protocol and can be used with the ‘with’ statement. File objects are one example:

with open('/etc/passwd', 'r') as f:
    for line in f:
        print line
        ... more processing code ...

After this statement has executed, the file object in f will have been automatically closed, even if the for loop raised an exception part-way through the block.

Note

In this case, f is the same object created by open(), because __enter__() returns self.

The threading module’s locks and condition variables also support the ‘with’ statement:

lock = threading.Lock()
with lock:
    # Critical section of code
    ...

The lock is acquired before the block is executed and always released once the block is complete.
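A small sketch of this behaviour, using a hypothetical shared counter updated by several threads. On Python 2.5 itself the module would also need the from __future__ import with_statement directive described above.

```python
import threading

lock = threading.Lock()
counter = {'value': 0}    # hypothetical shared state

def add_many(times):
    for _ in range(times):
        # The lock is acquired here and released when the block ends.
        with lock:
            counter['value'] += 1

threads = [threading.Thread(target=add_many, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter['value'])

# The lock is also released when the block exits through an exception.
try:
    with lock:
        raise RuntimeError('failure inside the critical section')
except RuntimeError:
    pass

print(lock.locked())
```

The second part is the case that try...finally used to cover by hand: the exception propagates, but the lock is not left held.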

The new localcontext() function in the decimal module makes it easy to save and restore the current decimal context, which encapsulates the desired precision and rounding characteristics for computations:

from decimal import Decimal, Context, localcontext

# Displays with default precision of 28 digits
v = Decimal('578')
print v.sqrt()

with localcontext(Context(prec=16)):
    # All code in this block uses a precision of 16 digits.
    # The original context is restored on exiting the block.
    print v.sqrt()

Writing Context Managers

Under the hood, the ‘with’ statement is fairly complicated. Most people will only use ‘with’ in company with existing objects and don’t need to know these details, so you can skip the rest of this section if you like. Authors of new objects will need to understand the details of the underlying implementation and should keep reading.

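A high-level explanation of the context management protocol is: the expression is evaluated and should result in an object, called a “context manager”, that has __enter__() and __exit__() methods. The context manager’s __enter__() method is called, and the value it returns is bound to the variable in the as clause, if one is present. The code in the with-block is then executed. When the block finishes, __exit__() is called with three arguments describing any exception that was raised (its type, value, and traceback), or with three None values if no exception occurred. A false return value from __exit__() lets the exception propagate; a true value suppresses it.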

Let’s think through an example. I won’t present detailed code but will only sketch the methods necessary for a database that supports transactions.

(For people unfamiliar with database terminology: a set of changes to the database are grouped into a transaction. Transactions can be either committed, meaning that all the changes are written into the database, or rolled back, meaning that the changes are all discarded and the database is unchanged. See any database textbook for more information.)

Let’s assume there’s an object representing a database connection. Our goal will be to let the user write code like this:

db_connection = DatabaseConnection()
with db_connection as cursor:
    cursor.execute('insert into ...')
    cursor.execute('delete from ...')
    # ... more operations ...

The transaction should be committed if the code in the block runs flawlessly or rolled back if there’s an exception. Here’s the basic interface for DatabaseConnection that I’ll assume:

class DatabaseConnection:
    # Database interface
    def cursor(self):
        "Returns a cursor object and starts a new transaction"
    def commit(self):
        "Commits current transaction"
    def rollback(self):
        "Rolls back current transaction"

The __enter__() method is pretty easy, having only to start a new transaction. For this application the resulting cursor object would be a useful result, so the method will return it. The user can then add as cursor to their ‘with’ statement to bind the cursor to a variable name.

class DatabaseConnection:
    ...
    def __enter__(self):
        # Code to start a new transaction
        cursor = self.cursor()
        return cursor

The __exit__() method is the most complicated because it’s where most of the work has to be done. The method has to check if an exception occurred. If there was no exception, the transaction is committed. The transaction is rolled back if there was an exception.

In the code below, execution will just fall off the end of the function, returning the default value of None. None is false, so the exception will be re-raised automatically. If you wished, you could be more explicit and add a return statement at the marked location.

class DatabaseConnection:
    ...
    def __exit__(self, type, value, tb):
        if tb is None:
            # No exception, so commit
            self.commit()
        else:
            # Exception occurred, so rollback.
            self.rollback()
            # return False
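To watch the two methods work together, here is a runnable sketch that replaces the real database with a hypothetical FakeConnection class. It only records which transaction methods were called; its execute() method and events list are assumptions made for the illustration, not part of any real database API.

```python
class FakeConnection:
    # Hypothetical stand-in for DatabaseConnection; it only records
    # which transaction methods were called.
    def __init__(self):
        self.events = []

    def cursor(self):
        self.events.append('begin')
        return self

    def execute(self, statement):
        self.events.append('execute ' + statement)

    def commit(self):
        self.events.append('commit')

    def rollback(self):
        self.events.append('rollback')

    def __enter__(self):
        # Start a transaction and hand the cursor to the 'as' clause.
        return self.cursor()

    def __exit__(self, type, value, tb):
        if tb is None:
            self.commit()
        else:
            self.rollback()
        # Falling off the end returns None, so any exception propagates.

ok = FakeConnection()
with ok as cursor:
    cursor.execute('insert')

failed = FakeConnection()
try:
    with failed as cursor:
        cursor.execute('insert')
        raise ValueError('simulated failure')
except ValueError:
    pass

print(ok.events)
print(failed.events)
```

The ValueError still reaches the surrounding try statement because __exit__() returns a false value after rolling back.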

The contextlib module

The new contextlib module provides some functions and a decorator that are useful for writing objects for use with the ‘with’ statement.

The decorator is called contextmanager(), and lets you write a single generator function instead of defining a new class. The generator should yield exactly one value. The code up to the yield will be executed as the __enter__() method, and the value yielded will be the method’s return value that will get bound to the variable in the ‘with’ statement’s as clause, if any. The code after the yield will be executed in the __exit__() method. Any exception raised in the block will be raised by the yield statement.

Our database example from the previous section could be written using this decorator as:

from contextlib import contextmanager

@contextmanager
def db_transaction(connection):
    cursor = connection.cursor()
    try:
        yield cursor
    except:
        connection.rollback()
        raise
    else:
        connection.commit()

db = DatabaseConnection()
with db_transaction(db) as cursor:
    ...
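The same division of labour around the yield can be observed without a database. In this sketch the generator step() and the list events are hypothetical names; the generator records when its set-up, error-handling, and clean-up code runs.

```python
from contextlib import contextmanager

events = []

@contextmanager
def step(name):
    # Code before the yield plays the role of __enter__().
    events.append('enter ' + name)
    try:
        yield name
    except KeyError:
        # An exception raised in the with-block reappears at the yield.
        events.append('error in ' + name)
        raise
    else:
        # Code after the yield plays the role of __exit__().
        events.append('exit ' + name)

with step('first') as label:
    events.append('body of ' + label)

try:
    with step('second'):
        raise KeyError('simulated failure')
except KeyError:
    pass

print(events)
```

Re-raising inside the except clause matters: a generator that swallows the exception would cause the ‘with’ statement to suppress it.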

The contextlib module also has a nested(mgr1, mgr2, ...) function that combines a number of context managers so you don’t need to write nested ‘with’ statements. In this example, the single ‘with’ statement both starts a database transaction and acquires a thread lock:

lock = threading.Lock()
with nested (db_transaction(db), lock) as (cursor, locked):
    ...

Finally, the closing(object) function returns object so that it can be bound to a variable, and calls object.close at the end of the block.

import urllib, sys
from contextlib import closing

with closing(urllib.urlopen('http://www.yahoo.com')) as f:
    for line in f:
        sys.stdout.write(line)
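closing() works with any object that has a close() method, not only network connections. The following sketch avoids network access by using a hypothetical Resource class that merely remembers whether it has been closed.

```python
from contextlib import closing

class Resource:
    # Hypothetical object with a close() method.
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

res = Resource()
with closing(res) as r:
    # closing() binds the original object itself to the variable.
    same_object = r is res
    closed_inside = res.closed

print(same_object)
print(closed_inside)
print(res.closed)
```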

See also

PEP 343 - The “with” statement

PEP written by Guido van Rossum and Nick Coghlan; implemented by Mike Bland, Guido van Rossum, and Neal Norwitz. The PEP shows the code generated for a ‘with’ statement, which can be helpful in learning how the statement works.

The documentation for the contextlib module.

PEP 352: Exceptions as New-Style Classes

Exception classes can now be new-style classes, not just classic classes, and the built-in Exception class and all the standard built-in exceptions (NameError, ValueError, etc.) are now new-style classes.

The inheritance hierarchy for exceptions has been rearranged a bit. In 2.5, the inheritance relationships are:

BaseException       # New in Python 2.5
|- KeyboardInterrupt
|- SystemExit
|- Exception
   |- (all other current built-in exceptions)

This rearrangement was done because people often want to catch all exceptions that indicate program errors. KeyboardInterrupt and SystemExit aren’t errors, though, and usually represent an explicit action such as the user hitting Control-C or code calling sys.exit(). A bare except: will catch all exceptions, so you commonly need to list KeyboardInterrupt and SystemExit in order to re-raise them. The usual pattern is:

try:
    ...
except (KeyboardInterrupt, SystemExit):
    raise
except:
    # Log error...
    # Continue running program...

In Python 2.5, you can now write except Exception to achieve the same result, catching all the exceptions that usually indicate errors but leaving KeyboardInterrupt and SystemExit alone. As in previous versions, a bare except: still catches all exceptions.
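The effect of the new hierarchy can be checked directly. The helper classify() below is hypothetical; it raises the exception it is given and reports which clause caught it.

```python
def classify(exc):
    # Raise 'exc' and report which except clause handles it.
    try:
        raise exc
    except Exception:
        return 'caught by except Exception'
    except BaseException:
        return 'not caught by except Exception'

print(classify(ValueError('bad value')))
print(classify(KeyboardInterrupt()))
print(classify(SystemExit(1)))

# The class relationships behind this behaviour.
print(issubclass(ValueError, Exception))
print(issubclass(KeyboardInterrupt, Exception))
print(issubclass(SystemExit, BaseException))
```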

The goal for Python 3.0 is to require any class raised as an exception to derive from BaseException or some descendant of BaseException, and future releases in the Python 2.x series may begin to enforce this constraint. Therefore, I suggest you begin making all your exception classes derive from Exception now. It’s been suggested that the bare except: form should be removed in Python 3.0, but Guido van Rossum hasn’t decided whether to do this or not.

Raising of strings as exceptions, as in the statement raise "Error occurred", is deprecated in Python 2.5 and will trigger a warning. The aim is to be able to remove the string-exception feature in a few releases.

See also

PEP 352 - Required Superclass for Exceptions

PEP written by Brett Cannon and Guido van Rossum; implemented by Brett Cannon.

PEP 353: Using ssize_t as the index type

A wide-ranging change to Python’s C API, using a new Py_ssize_t type definition instead of int, will permit the interpreter to handle more data on 64-bit platforms. This change doesn’t affect Python’s capacity on 32-bit platforms.

Various pieces of the Python interpreter used C’s int type to store sizes or counts; for example, the number of items in a list or tuple was stored in an int. The C compilers for most 64-bit platforms still define int as a 32-bit type, so that meant that lists could only hold up to 2**31 - 1 = 2147483647 items. (There are actually a few different programming models that 64-bit C compilers can use – see https://unix.org/version2/whatsnew/lp64_wp.html for a discussion – but the most commonly available model leaves int as 32 bits.)

A limit of 2147483647 items doesn’t really matter on a 32-bit platform because you’ll run out of memory before hitting the length limit. Each list item requires space for a pointer, which is 4 bytes, plus space for a PyObject representing the item. 2147483647*4 is already more bytes than a 32-bit address space can contain.

It’s possible to address that much memory on a 64-bit platform, however. The pointers for a list that size would only require 16 GiB of space, so it’s not unreasonable that Python programmers might construct lists that large. Therefore, the Python interpreter had to be changed to use some type other than int, and this will be a 64-bit type on 64-bit platforms. The change will cause incompatibilities on 64-bit machines, so it was deemed worth making the transition now, while the number of 64-bit users is still relatively small. (In 5 or 10 years, we may all be on 64-bit machines, and the transition would be more painful then.)
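The arithmetic behind these two paragraphs is easy to reproduce. The sketch assumes 4-byte pointers on a 32-bit platform and 8-byte pointers on a 64-bit platform, and counts only the pointers stored in the list, not the objects they refer to.

```python
max_items = 2**31 - 1        # largest value of a signed 32-bit int
address_space_32 = 2**32     # bytes addressable with a 32-bit pointer

# 32-bit platform: 4-byte pointers.
pointer_bytes_32 = max_items * 4
print(pointer_bytes_32)
print(pointer_bytes_32 > address_space_32)

# 64-bit platform: 8-byte pointers, expressed in GiB (2**30 bytes).
pointer_bytes_64 = max_items * 8
print(pointer_bytes_64 / float(2**30))
```

The last figure comes out just under 16, which is where the 16 GiB estimate comes from.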

This change most strongly affects authors of C extension modules. Python strings and container types such as lists and tuples now use Py_ssize_t to store their size. Functions such as PyList_Size() now return Py_ssize_t. Code in extension modules may therefore need to have some variables changed to Py_ssize_t.

The PyArg_ParseTuple() and Py_BuildValue() functions have a new conversion code, n, for Py_ssize_t. PyArg_ParseTuple()’s s# and t# still output int by default, but you can define the macro PY_SSIZE_T_CLEAN before including Python.h to make them return Py_ssize_t.

PEP 353 has a section on conversion guidelines that extension authors should read to learn about supporting 64-bit platforms.

See also

PEP 353 - Using ssize_t as the index type

PEP written and implemented by Martin von Löwis.

PEP 357: The ‘__index__’ method

The NumPy developers had a problem that could only be solved by adding a new special method, __index__(). When using slice notation, as in [start:stop:step], the values of the start, stop, and step indexes must all be either integers or long integers. NumPy defines a variety of specialized integer types corresponding to unsigned and signed integers of 8, 16, 32, and 64 bits, but there was no way to signal that these types could be used as slice indexes.

Slicing can’t just use the existing __int__() method because that method is also used to implement coercion to integers. If slicing used __int__(), floating-point numbers would also become legal slice indexes and that’s clearly an undesirable behaviour.

Instead, a new special method called __index__() was added. It takes no arguments and returns an integer giving the slice index to use. For example:

class C:
    def __index__(self):
        return self.value

The return value must be either a Python integer or long integer. The interpreter will check that the type returned is correct, and raises a TypeError if this requirement isn’t met.

A corresponding nb_index slot was added to the C-level PyNumberMethods structure to let C extensions implement this protocol. PyNumber_Index(obj) can be used in extension code to call the __index__() function and retrieve its result.
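The protocol described above can be exercised from pure Python. The following is a minimal sketch, not taken from the PEP: the Offset and BadOffset classes and the slicing_fails() helper are hypothetical names invented for illustration. It uses operator.index(), the Python-level counterpart of PyNumber_Index().

```python
import operator

class Offset(object):
    """Hypothetical integer-like wrapper, used only for this illustration."""
    def __init__(self, value):
        self.value = value

    def __index__(self):
        # Tells the interpreter which integer to use as an index.
        return self.value

class BadOffset(object):
    """Hypothetical class whose __index__() breaks the protocol."""
    def __index__(self):
        return "2"          # not an integer, so the interpreter rejects it

letters = ['a', 'b', 'c', 'd', 'e']

middle = letters[Offset(1):Offset(4)]    # slice bounds come from __index__()
single = letters[Offset(2)]              # plain indexing uses it as well
as_int = operator.index(Offset(3))       # explicit call to the protocol

def slicing_fails(index):
    """Return True if using 'index' as a slice bound raises TypeError."""
    try:
        letters[index:]
    except TypeError:
        return True
    return False

float_rejected = slicing_fails(1.5)          # floats have no __index__()
bad_rejected = slicing_fails(BadOffset())    # wrong return type
```

Because floats define __int__() but not __index__(), they stay illegal as slice bounds, which is the behaviour the PEP set out to preserve.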

See also

PEP 357 - Allowing Any Object to be Used for Slicing

PEP written and implemented by Travis Oliphant.

Other Language Changes

Here are all of the changes that Python 2.5 makes to the core Python language.

d = zerodict({1:1, 2:2})
print d[1], d[2] # Prints 1, 2
print d[3], d[4] # Prints 0, 0
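The zerodict class used in the fragment above is not defined in this text. A minimal sketch, assuming it is a dict subclass that relies on the __missing__() hook (which dict subclasses gained in Python 2.5), might look like this; the values and has_key_3 names are only illustrative.

```python
class zerodict(dict):
    """Assumed definition: a dict that reports 0 for keys it doesn't hold."""
    def __missing__(self, key):
        # Called by dict.__getitem__() when 'key' is not present.
        return 0

d = zerodict({1: 1, 2: 2})
values = (d[1], d[2], d[3], d[4])

# __missing__() only supplies a return value; it does not store the key.
has_key_3 = 3 in d
```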

# Prints 'longest'
print max(L, key=len)

# Prints 'short', because lexicographically 'short' has the largest value
print max(L)

(Contributed by Steven Bethard and Raymond Hettinger.)
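The list L used by the two max() calls above is not shown in this text. The sketch below assumes a three-element list chosen to match the stated results, and also shows min(), which accepts the same key parameter.

```python
# Assumed sample data; any list in which 'longest' is the longest string and
# 'short' sorts last alphabetically would behave the same way.
L = ['medium', 'longest', 'short']

longest_by_len = max(L, key=len)     # compares len(item) instead of item
largest_by_text = max(L)             # ordinary lexicographic comparison
shortest_by_len = min(L, key=len)    # min() takes the same key argument
```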

Interactive Interpreter Changes

In the interactive interpreter, quit and exit have long been strings so that new users get a somewhat helpful message when they try to quit:

>>> quit
'Use Ctrl-D (i.e. EOF) to exit.'

In Python 2.5, quit and exit are now objects that still produce string representations of themselves, but are also callable. Newbies who try quit() or exit() will now exit the interpreter as they expect. (Implemented by Georg Brandl.)

The Python executable now accepts the standard long options --help and --version; on Windows, it also accepts the /? option for displaying a help message. (Implemented by Georg Brandl.)

Optimizations

Several of the optimizations were developed at the NeedForSpeed sprint, an event held in Reykjavik, Iceland, from May 21–28 2006. The sprint focused on speed enhancements to the CPython implementation and was funded by EWT LLC with local support from CCP Games. Those optimizations added at this sprint are specially marked in the following list.

New, Improved, and Removed Modules

The standard library received many enhancements and bug fixes in Python 2.5. Here’s a partial list of the most notable changes, sorted alphabetically by module name. Consult the Misc/NEWS file in the source tree for a more complete list of changes, or look through the SVN logs for all the details.

(Contributed by Guido van Rossum.)

import mailbox

# 'factory=None' uses email.Message.Message as the class representing
# individual messages.
src = mailbox.Maildir('maildir', factory=None)
dest = mailbox.mbox('/tmp/mbox')
for msg in src:
    dest.add(msg)

(Contributed by Gregory K. Johnson. Funding was provided by Google’s 2005 Summer of Code.)

The ctypes package

The ctypes package, written by Thomas Heller, has been added to the standard library. ctypes lets you call arbitrary functions in shared libraries or DLLs. Long-time users may remember the dl module, which provides functions for loading shared libraries and calling functions in them. The ctypes package is much fancier.

To load a shared library or DLL, you must create an instance of the CDLL class and provide the name or path of the shared library or DLL. Once that’s done, you can call arbitrary functions by accessing them as attributes of the CDLL object.

import ctypes

libc = ctypes.CDLL('libc.so.6')
result = libc.printf("Line of output\n")

Type constructors for the various C types are provided: c_int(), c_float(), c_double(), c_char_p() (equivalent to char*), and so forth. Unlike Python’s types, the C versions are all mutable; you can assign to their value attribute to change the wrapped value. Python integers and strings will be automatically converted to the corresponding C types, but for other types you must call the correct type constructor. (And I mean must; getting it wrong will often result in the interpreter crashing with a segmentation fault.)
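The mutability of the C type wrappers can be seen without loading any shared library. This is a small sketch with illustrative variable names; it uses a bytes literal for c_char_p so that it also runs on current Python releases, whereas on Python 2.5 an ordinary string would be passed.

```python
import ctypes

counter = ctypes.c_int(5)
counter.value = 10                  # the same wrapper now holds 10

ratio = ctypes.c_double(0.5)
ratio.value = ratio.value * 3       # read, compute, and store back

name = ctypes.c_char_p(b"abc")      # wraps a char* pointing at the bytes
name_value = name.value
```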

You shouldn’t use c_char_p() with a Python string when the C function will be modifying the memory area, because Python strings are supposed to be immutable; breaking this rule will cause puzzling bugs. When you need a modifiable memory area, use create_string_buffer():

s = "this is a string"
buf = ctypes.create_string_buffer(s)
libc.strfry(buf)
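The strfry() call above needs the GNU C library. As a portable sketch of the same idea, the example below uses ctypes.memmove() to stand in for a C function that writes into the buffer; the choice of memmove() and the variable names are assumptions made for illustration, and bytes literals are used so the code runs on current Python releases.

```python
import ctypes

s = b"this is a string"
buf = ctypes.create_string_buffer(s)    # mutable copy plus a trailing NUL

buffer_size = len(buf)                  # one byte longer than the string

# Stand-in for a C function that modifies memory: overwrite the first
# four bytes of the buffer in place.
ctypes.memmove(buf, b"THIS", 4)

modified = buf.value                    # contents up to the NUL terminator
original_untouched = (s == b"this is a string")
```

The original string is left alone because create_string_buffer() copies it, which is why this approach is safe where c_char_p() is not.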

C functions are assumed to return integers, but you can set the restype attribute of the function object to change this:

>>> libc.atof('2.71828')
-1783957616
>>> libc.atof.restype = ctypes.c_double
>>> libc.atof('2.71828')
2.71828

ctypes also provides a wrapper for Python’s C API as the ctypes.pythonapi object. This object does not release the global interpreter lock before calling a function, because the lock must be held when calling into the interpreter’s code. There’s a py_object type constructor that will create a PyObject* pointer. A simple usage:

import ctypes

d = {}
ctypes.pythonapi.PyObject_SetItem(ctypes.py_object(d),
          ctypes.py_object("abc"), ctypes.py_object(1))
# d is now {'abc': 1}.

Don’t forget to use py_object(); if it’s omitted you end up with a segmentation fault.
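One way to make such calls harder to get wrong is to declare the argument and return types on the function object once, so that ctypes performs the py_object conversion itself. This is a sketch of that approach rather than something the text above prescribes; it assumes a CPython interpreter, and the set_item and as_double names are illustrative.

```python
import ctypes

# Declare the signature once: three PyObject* arguments, int result.
set_item = ctypes.pythonapi.PyObject_SetItem
set_item.argtypes = [ctypes.py_object, ctypes.py_object, ctypes.py_object]
set_item.restype = ctypes.c_int

d = {}
status = set_item(d, "abc", 1)      # arguments are wrapped automatically

# restype works the same way for C API functions that return a double.
as_double = ctypes.pythonapi.PyFloat_AsDouble
as_double.argtypes = [ctypes.py_object]
as_double.restype = ctypes.c_double

converted = as_double(2.5)
```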

ctypes has been around for a while, but people still write and distribute hand-coded extension modules because you can’t rely on ctypes being present. Perhaps developers will begin to write Python wrappers atop a library accessed through ctypes instead of extension modules, now that ctypes is included with core Python.

The ElementTree package

A subset of Fredrik Lundh’s ElementTree library for processing XML has been added to the standard library as xml.etree. The available modules are ElementTree, ElementPath, and ElementInclude from ElementTree 1.2.6. The cElementTree accelerator module is also included.

The rest of this section will provide a brief overview of using ElementTree. Full documentation for ElementTree is available at https://web.archive.org/web/20201124024954/http://effbot.org/zone/element-index.htm.

ElementTree represents an XML document as a tree of element nodes. The text content of the document is stored as the text and tail attributes of the element nodes. (This is one of the major differences between ElementTree and the Document Object Model; in the DOM there are many different types of node, including TextNode.)

The most commonly used parsing function is parse(), which takes either a string (assumed to contain a filename) or a file-like object and returns an ElementTree instance:

import urllib
from xml.etree import ElementTree as ET

tree = ET.parse('ex-1.xml')

feed = urllib.urlopen('http://planet.python.org/rss10.xml')
tree = ET.parse(feed)

Once you have an ElementTree instance, you can call its getroot() method to get the root Element node.

There’s also an XML() function that takes a string literal and returns an Element node (not an ElementTree). This function provides a tidy way to incorporate XML fragments, approaching the convenience of an XML literal:

svg = ET.XML(""" """)
svg.set('height', '320px')
svg.append(elem1)
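To make the difference between parse() and XML() concrete, here is a self-contained sketch that needs no file or network access. The catalog document is made up for illustration, and io.BytesIO stands in for the file-like object so that the code runs on current Python releases.

```python
import io
from xml.etree import ElementTree as ET

# Made-up document used only for this illustration.
source = b"<catalog><item id='1'>first</item><item id='2'>second</item></catalog>"

# parse() returns an ElementTree; getroot() gives its root Element.
tree = ET.parse(io.BytesIO(source))
root = tree.getroot()

# XML() goes straight from a string to an Element.
fragment = ET.XML("<item id='3'>third</item>")
fragment.set('status', 'new')
root.append(fragment)

ids = [item.get('id') for item in root]
last_text = root[2].text
```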

Each XML element supports some dictionary-like and some list-like access methods. Dictionary-like operations are used to access attribute values, and list-like operations are used to access child nodes.

Operation                    Result
elem[n]                      Returns n’th child element.
elem[m:n]                    Returns list of m’th through n’th child elements.
len(elem)                    Returns number of child elements.
list(elem)                   Returns list of child elements.
elem.append(elem2)           Adds elem2 as a child.
elem.insert(index, elem2)    Inserts elem2 at the specified location.
del elem[n]                  Deletes n’th child element.
elem.keys()                  Returns list of attribute names.
elem.get(name)               Returns value of attribute name.
elem.set(name, value)        Sets new value for attribute name.
elem.attrib                  Retrieves the dictionary containing attributes.
del elem.attrib[name]        Deletes attribute name.
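The operations in the table can be tried on a small element. The document below is made up for illustration; list() is wrapped around keys() so the sketch behaves the same on current Python releases, where keys() returns a view.

```python
from xml.etree import ElementTree as ET

elem = ET.XML('<list kind="todo"><a/><b/><c/></list>')

# List-like operations work on child elements.
child_count = len(elem)
first_tag = elem[0].tag
slice_tags = [child.tag for child in elem[1:3]]

elem.append(ET.Element('d'))        # children: a, b, c, d
elem.insert(0, ET.Element('z'))     # children: z, a, b, c, d
del elem[1]                         # removes 'a'
final_tags = [child.tag for child in list(elem)]

# Dictionary-like operations work on attributes.
attr_names = list(elem.keys())
kind_before = elem.get('kind')
elem.set('kind', 'done')
kind_after = elem.attrib['kind']
del elem.attrib['kind']
kind_missing = elem.get('kind')     # get() returns None for a missing name
```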

Comments and processing instructions are also represented as Element nodes. To check if a node is a comment or a processing instruction:

if elem.tag is ET.Comment:
    ...
elif elem.tag is ET.ProcessingInstruction:
    ...
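A runnable version of that check is sketched below. The nodes are built with the ET.Comment() and ET.ProcessingInstruction() factory functions rather than parsed from text, and the classify() helper is a hypothetical name introduced for illustration.

```python
from xml.etree import ElementTree as ET

root = ET.Element('root')
root.append(ET.Comment('a note for maintainers'))
root.append(ET.ProcessingInstruction('target', 'data'))
root.append(ET.Element('child'))

def classify(node):
    """Hypothetical helper: name the kind of node by inspecting its tag."""
    if node.tag is ET.Comment:
        return 'comment'
    elif node.tag is ET.ProcessingInstruction:
        return 'processing instruction'
    return 'element'

kinds = [classify(node) for node in root]
```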

To generate XML output, you should call the ElementTree.write() method. Like parse(), it can take either a string or a file-like object:

# Encoding is US-ASCII
tree.write('output.xml')

# Encoding is UTF-8
f = open('output.xml', 'w')
tree.write(f, encoding='utf-8')

(Caution: the default encoding used for output is ASCII. For general XML work, where an element’s name may contain arbitrary Unicode characters, ASCII isn’t a very useful encoding because it will raise an exception if an element’s name contains any characters with values greater than 127. Therefore, it’s best to specify a different encoding such as UTF-8 that can handle any Unicode character.)
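The effect of the encoding argument can be checked in memory. This sketch writes a made-up one-element document to io.BytesIO objects instead of files; the use of io.BytesIO is an assumption made so that the example runs on current Python releases without touching the disk.

```python
import io
from xml.etree import ElementTree as ET

root = ET.Element('greeting')
root.text = u'caf\xe9'                      # contains one non-ASCII character
tree = ET.ElementTree(root)

# Default output encoding (US-ASCII).
ascii_out = io.BytesIO()
tree.write(ascii_out)
ascii_bytes = ascii_out.getvalue()

# Explicit UTF-8 output.
utf8_out = io.BytesIO()
tree.write(utf8_out, encoding='utf-8')
utf8_bytes = utf8_out.getvalue()

# The UTF-8 output carries the character itself as encoded bytes.
has_raw_char = u'caf\xe9'.encode('utf-8') in utf8_bytes
```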

This section is only a partial description of the ElementTree interfaces. Please read the package’s official documentation for more details.

The hashlib package

A new hashlib module, written by Gregory P. Smith, has been added to replace the md5 and sha modules. hashlib adds support for additional secure hashes (SHA-224, SHA-256, SHA-384, and SHA-512). When available, the module uses OpenSSL for fast platform optimized implementations of algorithms.

The old md5 and sha modules still exist as wrappers around hashlib to preserve backwards compatibility. The new module’s interface is very close to that of the old modules, but not identical. The most significant difference is that the constructor functions for creating new hashing objects are named differently.

# Old versions
h = md5.md5()
h = md5.new()

# New version
h = hashlib.md5()

# Old versions
h = sha.sha()
h = sha.new()

# New version
h = hashlib.sha1()

# Hashes that weren't previously available
h = hashlib.sha224()
h = hashlib.sha256()
h = hashlib.sha384()
h = hashlib.sha512()

# Alternative form
h = hashlib.new('md5')          # Provide algorithm as a string

Once a hash object has been created, its methods are the same as before: update(string) hashes the specified string into the current digest state, digest() and hexdigest() return the digest value as a binary string or a string of hex digits, and copy() returns a new hashing object with the same digest state.
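The methods described above can be combined as in the following sketch. The input strings are arbitrary, and bytes literals are used so the code runs on current Python releases; on Python 2.5 ordinary strings would be passed instead.

```python
import hashlib

h = hashlib.md5()
h.update(b"Nobody inspects ")
snapshot = h.copy()                     # independent copy of the state so far
h.update(b"the spammish repetition")

# Feeding the data in pieces gives the same digest as hashing it at once.
one_shot = hashlib.md5(b"Nobody inspects the spammish repetition")
same_digest = (h.hexdigest() == one_shot.hexdigest())

# The copy was not affected by the later update() call.
prefix_only = hashlib.md5(b"Nobody inspects ")
copy_is_independent = (snapshot.hexdigest() == prefix_only.hexdigest())

# Selecting the algorithm by name.
by_name = hashlib.new('sha256', b"abc")
digest_sizes = (len(h.digest()), len(by_name.digest()))
empty_md5 = hashlib.md5(b"").hexdigest()
```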

See also

The documentation for the hashlib module.

The sqlite3 package

The pysqlite module (https://www.pysqlite.org), a wrapper for the SQLite embedded database, has been added to the standard library under the package name sqlite3.

SQLite is a C library that provides a lightweight disk-based database that doesn’t require a separate server process and allows accessing the database using a nonstandard variant of the SQL query language. Some applications can use SQLite for internal data storage. It’s also possible to prototype an application using SQLite and then port the code to a larger database such as PostgreSQL or Oracle.

pysqlite was written by Gerhard Häring and provides a SQL interface compliant with the DB-API 2.0 specification described by PEP 249.

If you’re compiling the Python source yourself, note that the source tree doesn’t include the SQLite code, only the wrapper module. You’ll need to have the SQLite libraries and headers installed before compiling Python, and the build process will compile the module when the necessary headers are available.

To use the module, you must first create a Connection object that represents the database. Here the data will be stored in the /tmp/example file:

conn = sqlite3.connect('/tmp/example')

You can also supply the special name :memory: to create a database in RAM.

Once you have a Connection, you can create a Cursor object and call its execute() method to perform SQL commands:

c = conn.cursor()

# Create table
c.execute('''create table stocks (date text, trans text, symbol text, qty real, price real)''')

# Insert a row of data
c.execute("""insert into stocks values ('2006-01-05','BUY','RHAT',100,35.14)""")

Usually your SQL operations will need to use values from Python variables. You shouldn’t assemble your query using Python’s string operations because doing so is insecure; it makes your program vulnerable to an SQL injection attack.

Instead, use the DB-API’s parameter substitution. Put ? as a placeholder wherever you want to use a value, and then provide a tuple of values as the second argument to the cursor’s execute() method. (Other database modules may use a different placeholder, such as %s or :1.) For example:

# Never do this -- insecure!
symbol = 'IBM'
c.execute("... where symbol = '%s'" % symbol)

# Do this instead
t = (symbol,)
c.execute('select * from stocks where symbol=?', t)

# Larger example
for t in (('2006-03-28', 'BUY', 'IBM', 1000, 45.00),
          ('2006-04-05', 'BUY', 'MSOFT', 1000, 72.00),
          ('2006-04-06', 'SELL', 'IBM', 500, 53.00),
         ):
    c.execute('insert into stocks values (?,?,?,?,?)', t)
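To see why the placeholder form is safer, the sketch below runs against an in-memory database and passes a deliberately hostile value. The hostile string and the variable names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(':memory:')      # throwaway database held in RAM
c = conn.cursor()
c.execute('create table stocks '
          '(date text, trans text, symbol text, qty real, price real)')

for t in (('2006-03-28', 'BUY', 'IBM', 1000, 45.00),
          ('2006-04-05', 'BUY', 'MSOFT', 1000, 72.00),
          ('2006-04-06', 'SELL', 'IBM', 500, 53.00)):
    c.execute('insert into stocks values (?,?,?,?,?)', t)

# With a placeholder the whole string is treated as one literal value, so
# the embedded quote characters cannot change the meaning of the query.
hostile = "IBM' OR '1'='1"
c.execute('select * from stocks where symbol=?', (hostile,))
hostile_matches = c.fetchall()

c.execute('select count(*) from stocks where symbol=?', ('IBM',))
ibm_count = c.fetchone()[0]

conn.close()
```

Had the hostile value been pasted into the query text with string formatting, the condition would have matched every row instead of none.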

To retrieve data after executing a SELECT statement, you can either treat the cursor as an iterator, call the cursor’s fetchone() method to retrieve a single matching row, or call fetchall() to get a list of the matching rows.

This example uses the iterator form:

>>> c = conn.cursor()
>>> c.execute('select * from stocks order by price')
>>> for row in c:
...     print row
...
(u'2006-01-05', u'BUY', u'RHAT', 100, 35.140000000000001)
(u'2006-03-28', u'BUY', u'IBM', 1000, 45.0)
(u'2006-04-06', u'SELL', u'IBM', 500, 53.0)
(u'2006-04-05', u'BUY', u'MSOFT', 1000, 72.0)
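The other two retrieval styles, fetchone() and fetchall(), can be compared with iteration in a short self-contained sketch. It rebuilds the sample table in an in-memory database; the variable names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
c = conn.cursor()
c.execute('create table stocks '
          '(date text, trans text, symbol text, qty real, price real)')
for t in (('2006-01-05', 'BUY', 'RHAT', 100, 35.14),
          ('2006-03-28', 'BUY', 'IBM', 1000, 45.00),
          ('2006-04-05', 'BUY', 'MSOFT', 1000, 72.00),
          ('2006-04-06', 'SELL', 'IBM', 500, 53.00)):
    c.execute('insert into stocks values (?,?,?,?,?)', t)

c.execute('select symbol, price from stocks order by price')
cheapest = c.fetchone()         # first matching row, as a tuple
remaining = c.fetchall()        # list of the rows not yet consumed

# Iterating over the cursor yields one row at a time.
c.execute('select symbol from stocks order by price')
symbols = [row[0] for row in c]

conn.close()
```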

For more information about the SQL dialect supported by SQLite, see https://www.sqlite.org.

See also

https://www.pysqlite.org

The pysqlite web page.

https://www.sqlite.org

The SQLite web page; the documentation describes the syntax and the available data types for the supported SQL dialect.

The documentation for the sqlite3 module.

PEP 249 - Database API Specification 2.0

PEP written by Marc-André Lemburg.

The wsgiref package

The Web Server Gateway Interface (WSGI) v1.0 defines a standard interface between web servers and Python web applications and is described in PEP 333. The wsgiref package is a reference implementation of the WSGI specification.

The package includes a basic HTTP server that will run a WSGI application; this server is useful for debugging but isn’t intended for production use. Setting up a server takes only a few lines of code:

from wsgiref import simple_server

wsgi_app = ...

host = ''
port = 8000
httpd = simple_server.make_server(host, port, wsgi_app)
httpd.serve_forever()
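The wsgi_app placeholder above can be any callable that follows the WSGI calling convention. The following sketch fills it in with a trivial application and exercises it directly, without opening a socket; the application body, the serve() wrapper, and the fake_start_response() helper are all hypothetical names introduced for illustration.

```python
from wsgiref import simple_server, util

def wsgi_app(environ, start_response):
    """Trivial WSGI application: always answers with a plain-text greeting."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello from WSGI\n']

def serve(host='', port=8000):
    """Run the application under the debugging server (blocks until stopped)."""
    httpd = simple_server.make_server(host, port, wsgi_app)
    httpd.serve_forever()

# Call the application by hand, the way a server would.
environ = {}
util.setup_testing_defaults(environ)    # fills in a minimal WSGI environment

captured = {}

def fake_start_response(status, headers):
    # Records what the application reports instead of sending it anywhere.
    captured['status'] = status
    captured['headers'] = headers

body = b''.join(wsgi_app(environ, fake_start_response))
```

Calling serve() would start the debugging server on port 8000; it is left uncalled here because serve_forever() does not return.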

Build and C API Changes

Changes to Python’s build process and to the C API include:

Port-Specific Changes

Porting to Python 2.5

This section lists previously described changes that may require changes to your code:

Acknowledgements

The author would like to thank the following people for offering suggestions, corrections and assistance with various drafts of this article: Georg Brandl, Nick Coghlan, Phillip J. Eby, Lars Gustäbel, Raymond Hettinger, Ralf W. Grosse-Kunstleve, Kent Johnson, Iain Lowe, Martin von Löwis, Fredrik Lundh, Andrew McNamara, Skip Montanaro, Gustavo Niemeyer, Paul Prescod, James Pryor, Mike Rovner, Scott Weikart, Barry Warsaw, Thomas Wouters.