[Python-Dev] Threading in the Standard Library Tour Part II

Raymond Hettinger python at rcn.com
Mon Aug 16 02:09:13 CEST 2004


Aahz had suggested that the threading section of the tutorial's Standard Library Tour Part II be re-written with the idea of making people smarter about what Python threading can and cannot do, and about approaches most likely to assure success.

Please comment on the proposed revision listed below.

Raymond




Multi-threading

Threading is a technique for decoupling tasks which are not sequentially dependent and creating the illusion of concurrency. Threads can be used to improve the responsiveness of applications that accept user input while other tasks run in the background.

The following code shows how the high-level threading module can run tasks in the background while the main program continues to run:

import threading, zipfile

class AsyncZip(threading.Thread):
    def __init__(self, infile, outfile):
        threading.Thread.__init__(self)        
        self.infile = infile
        self.outfile = outfile
    def run(self):
        f = zipfile.ZipFile(self.outfile, 'w', zipfile.ZIP_DEFLATED)
        f.write(self.infile)
        f.close()
        print 'Finished background zip of: %s' % self.infile

background = AsyncZip('mydata.txt', 'myarchive.zip')
background.start()
print 'The main program continues to run in foreground.'

background.join()    # Wait for the background task to finish
print 'Main program waited until background was done.'

The principal challenge of multi-threaded applications is coordinating threads that share data or other resources. To that end, the threading module provides a number of synchronization primitives including locks, events, condition variables, and semaphores.
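As a small illustration of the simplest of those primitives, here is a sketch of several threads updating one shared counter under a lock. The names counter, counter_lock, and add_many are made up for this example, and the thread and iteration counts are arbitrary.

```python
import threading

counter = 0                         # shared data (hypothetical example)
counter_lock = threading.Lock()     # guards every update to counter

def add_many(n):
    # Increment the shared counter n times, holding the lock for each update.
    global counter
    for i in range(n):
        counter_lock.acquire()
        try:
            counter += 1            # read-modify-write, so it needs the lock
        finally:
            counter_lock.release()  # always release, even if an error occurs

threads = [threading.Thread(target=add_many, args=(10000,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()                        # wait for all four threads to finish

print('Final count: %d' % counter)
```

Because every update happens while the lock is held, the final count is 40000. Without the lock, updates from different threads could interleave and some increments could be lost.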

While those tools are powerful, minor design errors can result in problems that are difficult to reproduce. Hence, the preferred approach to task coordination is to concentrate all access to a resource in a single thread and then use the Queue module to feed that thread with requests from other threads. Applications using Queue objects for inter-thread communication and coordination tend to be easier to design, more readable, and more reliable.
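The queue-based approach can be sketched as follows. This is only an illustration: log_lines, writer, produce, and the None sentinel are assumptions chosen for the example, and the try/except around the import is there only so the same sketch runs where the Queue module has since been renamed.

```python
import threading
try:
    import Queue as queue_module    # module name used in this tutorial
except ImportError:
    import queue as queue_module    # later name for the same module

requests = queue_module.Queue()     # other threads put their requests here
log_lines = []                      # the resource; only writer() touches it

def writer():
    # The single thread that owns log_lines; it serves requests one at a time.
    while True:
        item = requests.get()       # blocks until a request arrives
        if item is None:            # sentinel (an assumed convention): stop
            break
        log_lines.append(item)

def produce(name, count):
    # A client thread; it never touches log_lines directly.
    for i in range(count):
        requests.put('%s: message %d' % (name, i))

writer_thread = threading.Thread(target=writer)
writer_thread.start()

producers = [threading.Thread(target=produce, args=('worker-%d' % n, 5))
             for n in range(3)]
for p in producers:
    p.start()
for p in producers:
    p.join()                        # all requests are now on the queue

requests.put(None)                  # tell the writer there is no more work
writer_thread.join()
print('Writer recorded %d messages' % len(log_lines))
```

No explicit lock appears in the sketch because the Queue object does its own locking internally, and log_lines is only ever modified by the one writer thread.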

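All that being said, a few cautions are in order. Thread programming is difficult to get right, and the overhead of creating, switching, and synchronizing threads can reduce total application performance. Also, multiple processors generally cannot speed up pure Python code because Python's Global Interpreter Lock (GIL) precludes more than one thread from running in the interpreter at the same time (this was done to simplify re-entrancy issues); threads can still overlap while they are blocked on I/O or running extension code that releases the GIL. Another issue is that threads do not mix easily with the event-driven model used by most GUIs, since GUI toolkits typically expect all interface calls to come from the thread that runs the event loop.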


