Learning Python: Unit 10, System

10. System Interfaces

System modules overview

Large built-in toolset:

sys, os, os.path, socket, signal, select, thread, threading, glob, shutil, tempfile, multiprocessing, subprocess, asyncio, ...

► See Python library manual for the full story

Large third-party domain:

PySerial, PyUSB, Pexpect, PySoap, Twisted, CORBA ORBs, ...

► See the web for specific domains

Python tools: sys

● Python-related exports

● path: for imports, initialized from PYTHONPATH, changeable

● platform: for nonportable code; 'win32', 'linux', 'sunos', etc.

● sys.exit(N), sys.exc_info(), sys.executable, sys.version, sys.modules, ...

>>> import sys

>>> sys.path                              # first item means CWD
['.', '/usr/local/lib/python', ...]

>>> sys.path
['C:/Python25', 'C:\\Python25\\Lib\\idlelib', 'C:\\WINDOWS\\system32\\python25.zip', ...]

>>> sys.path
['', 'C:\\Users\\mark\\AppData\\Local\\Programs\\Python\\Python35\\python35.zip', ...]

>>> sys.platform                          # once upon a time...
'sunos4'

>>> if sys.platform[:3] == 'win':         # or .startswith('win'), .startswith('linux')
...     print('on Windows')               # 2.X: print 'on Windows'
...
on Windows

>>> sys.executable

'C:\\Python25\\pythonw.exe'

>>> sys.executable

'C:\\Users\\mark\\AppData\\Local\\Programs\\Python\\Python35\\python.exe'

>>> sys.version

'2.5 (r25:51908, Sep 19 2006, 09:52:17) [MSC v.1310 32 bit (Intel)]'

>>> sys.version

'3.5.0 (v3.5.0:374f501f4567, Sep 13 2015, 02:27:37) [MSC v.1900 64 bit (AMD64)]'

>>> if float(sys.version[:3]) >= 3.5: print('recent')    # brittle: '3.10'[:3] is '3.1'; prefer version_info

...

recent

>>> sys.version_info

sys.version_info(major=3, minor=5, micro=0, releaselevel='final', serial=0)

>>> sys.modules.keys()

...loaded module names...

System tools: os

● POSIX bindings: operating system exports

● ~200+ attributes on some platforms, plus the nested 'os.path' module

● Mostly portable, some calls absent on standard Windows Python

● Cygwin Python adds full UNIX call set on Windows

Content survey

► Shell environment variables

os.environ

► Running shell commands, programs:

os.system, os.popen, os.startfile, os.popen2/3/4 (2.X) (+subprocess)

► Spawning processes:

os.fork, os.pipe, os.exec, os.waitpid, os.kill

► Descriptor files, with locking:

os.open, os.read, os.write

► File processing:

os.remove, os.rename, os.mkfifo, os.mkdir, os.rmdir, os.removedirs

► Administrative tools:

os.getcwd, os.chdir, os.chmod, os.getpid

► Directory tools:

os.listdir, os.scandir (3.5+), os.walk

► Portability tools:

os.sep, os.pathsep, os.curdir, os.path.split, os.path.join

► os.path nested submodule: pathname tools

os.path.exists('filepathname')

os.path.isdir('filepathname')

os.path.getsize('filepathname')

os.path.getmtime('filepathname')
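A quick sketch combining these os.path calls, probing a scratch file so it can run anywhere (the file and its content are illustrative):

```python
import os, os.path, tempfile

# make a scratch file to probe with the os.path tools above
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b'spam')
    path = f.name

exists, isdir, size = os.path.exists(path), os.path.isdir(path), os.path.getsize(path)
print(exists, isdir, size)                 # → True False 4
os.remove(path)                            # clean up the scratch file
```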

Running shell commands

>>> import os, sys

>>> lister = 'dir /B' if sys.platform.startswith('win') else 'ls'       # folder listing command
>>> listing = os.popen('%s *.py' % lister).readlines()                  # or: glob.glob('*.py')
>>> for name in listing: print(name, end='')                            # 2.X: print name,

...

classtools.py

coroutine.py

person.py

>>> editor = 'notepad' if sys.platform.startswith('win') else 'vi'      # edit each in listing

>>> for name in listing: os.system(editor + ' ' + name.rstrip())        # rstrip drops readlines '\n'

...

Variations:

os.popen('cmd', 'w').write('data')

(i, o)    = os.popen2('cmd')           # popen2/3/4 removed in 3.X: use subprocess
(i, o, e) = os.popen3('cmd')
(i, o_e)  = os.popen4('cmd')

os.startfile('file')                   # Windows only: opens per registry associations

► See the 'pty' module and the 'pexpect' extension to automate interactive programs without deadlocks

► See ahead: the 'subprocess' module (new in 2.4) for more low-level control over streams

Example: testing command-line scripts (advanced sessions)

Arguments, streams, shell variables

Shell environment variables: os

● os.environ: read/write access to shell variables

● normal dictionary interface

>>> import os

>>> os.environ['USER']

'mlutz'

>>> os.environ['USER'] = 'Bob'    # changes for process and its children

Arguments and streams: sys

● sys.argv: command-line arguments

● sys.stdin/stdout/stderr: standard stream files

● sys.executable is path to running interpreter

% type play.py
#!/usr/local/bin/python
import sys
print(sys.argv)                     # 2.X: print sys.argv
sys.stdout.write("ta da!\n")        # same as: print('ta da!')

% python play.py -x -i spammify
['play.py', '-x', '-i', 'spammify']
ta da!

See the getopt, optparse, and argparse modules for parsing complex command lines
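For newer code, argparse is the usual choice; a minimal sketch (the '-x' and '--input' options are illustrative, not tied to play.py above):

```python
import argparse

# build a parser for one boolean switch and one option taking a value
parser = argparse.ArgumentParser()
parser.add_argument('-x', action='store_true')       # boolean switch
parser.add_argument('-i', '--input')                 # option with a value

# parse an explicit list here; omit the list to parse sys.argv[1:]
args = parser.parse_args(['-x', '--input', 'spammify'])
print(args.x, args.input)                            # → True spammify
```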

File tools

♦ Built-in file objects

For most file applications

♦ Processing binary and text data

Binary: use 'rb' and 'wb' in open() to suppress line-end translations on Windows

Binary: use bytes (b'...') for data on 3.X, use str ('...') on 2.X

Text, ASCII: use str and open() on both 2.X and 3.X

Text, Unicode: use str and open() on 3.X, use unicode (u'...') and codecs.open() on 2.X (see final unit)

♦ Module os descriptor-based file tools

For special/advanced file processing modes

♦ Module os filename tools

Deletions, renamings, etc.

♦ Module os.path tools

File existence, directory tests, size, etc.

♦ See also

Sockets, pipes, fifos, shelves, DBM files

Directory tools

Single directories

1) Running directory listing commands: non-portable

C:\temp>python

>>> import os

>>> os.popen('dir /B').readlines()

['about-pp.html\n', 'python1.5.tar.gz\n', 'about-pp2e.html\n',        # \n was \012 in the past

'about-ppr2e.html\n', 'newdir\n']

>>> os.popen('ls C:\PP2ndEd').readlines()

['README.txt\n', 'cdrom\n', 'chapters\n', 'etc\n', 'examples\n',

'examples.tar.gz\n', 'figures\n', 'shots\n']

2) The glob module: patterns, in-process

>>> import glob

>>> glob.glob('C:\PP2ndEd\*')

['C:\\PP2ndEd\\examples.tar.gz', 'C:\\PP2ndEd\\README.txt',

'C:\\PP2ndEd\\shots', 'C:\\PP2ndEd\\figures', 'C:\\PP2ndEd\\examples',

'C:\\PP2ndEd\\etc', 'C:\\PP2ndEd\\chapters', 'C:\\PP2ndEd\\cdrom']

3) The os.listdir call: quick, portable

>>> os.listdir('C:\PP2ndEd')

['examples.tar.gz', 'README.txt', 'shots', 'figures', 'examples', 'etc',

'chapters', 'cdrom']

>>> os.listdir('.')

['summer.out', 'summer.py', 'table1.txt', ... ]

4) The os.scandir call: fastest, avoids extra system calls, used internally by os.walk → but for 3.5+ only

>>> dirents = os.scandir('.')

>>> for dirent in dirents:

...     print(dirent.name, dirent.path, dirent.is_file())     # see mergeall use case: 5x~10x faster

...

classtools.py .\classtools.py True

classtools.pyc .\classtools.pyc True

coroutine.py .\coroutine.py True

__pycache__ .\__pycache__ False

Directory trees

1) os.path.walk (removed in 3.X: use os.walk!)

>>> import os

>>> def lister(dummy, dirname, filesindir):
...     print '[' + dirname + ']'                     # 2.X code: os.path.walk is 2.X only
...     for fname in filesindir:
...         print os.path.join(dirname, fname)        # handle one file
...
>>> os.path.walk('.', lister, None)

[.]

.\about-pp.html

.\python1.5.tar.gz

.\about-pp2e.html

.\about-ppr2e.html

.\newdir

[.\newdir]

.\newdir\temp1

.\newdir\temp2

.\newdir\temp3

.\newdir\more

[.\newdir\more]

.\newdir\more\xxx.txt

.\newdir\more\yyy.txt

2) os.walk generator (2.3+)

>>> import os

>>> for (thisDir, dirsHere, filesHere) in os.walk('.'):
...     print(thisDir, '=>')                          # 2.X: print thisDir, '=>'
...     for filename in filesHere:
...         print('\t', filename)
...

. =>
     w9xpopen.exe
     py.ico
     pyc.ico
     README.txt
     NEWS.txt
     ...

3) The find module: deprecated in 2.0, gone today (but see replacement below)

C:\temp>python

>>> import find

>>> find.find('*')

['.\\about-pp.html', '.\\about-pp2e.html', '.\\about-ppr2e.html',

'.\\newdir', '.\\newdir\\more', '.\\newdir\\more\\xxx.txt',

'.\\newdir\\more\\yyy.txt', '.\\newdir\\temp1', '.\\newdir\\temp2',

'.\\newdir\\temp3', '.\\python1.5.tar.gz']

4) Recursive traversals

# list files in dir tree by recursion
import sys, os

def mylister(currdir):
    print('[' + currdir + ']')                    # 2.X: print statements
    for file in os.listdir(currdir):              # list files here
        path = os.path.join(currdir, file)        # add dir path back
        if not os.path.isdir(path):
            print(path)
        else:
            mylister(path)                        # recur into subdirs

if __name__ == '__main__':
    mylister(sys.argv[1])                         # dir name in cmdline

Example: finding large files (advanced sessions)

Renaming a set of files

>>> import glob, string, os

>>> glob.glob("*.py")

['cheader1.py', 'finder1.py', 'summer.py']

>>> for name in glob.glob("*.py"):
...     os.rename(name, string.upper(name))       # 2.X only: string.upper is gone in 3.X
...

...

>>> glob.glob("*.PY")

['FINDER1.PY', 'SUMMER.PY', 'CHEADER1.PY']
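Since string.upper() is gone in 3.X, here is a 3.X-safe variant of this rename loop, demonstrated in a scratch directory so it is safe to run (filenames mirror the session above):

```python
import glob, os, tempfile

# make a scratch directory with the session's example files
scratch = tempfile.mkdtemp()
for base in ('cheader1.py', 'finder1.py', 'summer.py'):
    open(os.path.join(scratch, base), 'w').close()

# uppercase each basename; str.upper replaces 2.X string.upper
for name in glob.glob(os.path.join(scratch, '*.py')):
    dirname, base = os.path.split(name)
    os.rename(name, os.path.join(dirname, base.upper()))

renamed = sorted(os.listdir(scratch))
print(renamed)                 # → ['CHEADER1.PY', 'FINDER1.PY', 'SUMMER.PY']
```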

Rolling your own find module (see also Extras\Code\Misc)

#!/usr/bin/python

"""

Return all files matching a filename pattern at and below a root directory;

custom version of the now deprecated find module in the standard library:

import as "PP4E.Tools.find"; like original, but uses os.walk loop, has no

support for pruning subdirs, and is runnable as a top-level script;

find() is a generator that uses the os.walk() generator to yield just

matching filenames: use findlist() to force results list generation;

"""

import fnmatch, os

def find(pattern, startdir=os.curdir):
    for (thisDir, subsHere, filesHere) in os.walk(startdir):
        for name in subsHere + filesHere:
            if fnmatch.fnmatch(name, pattern):
                fullpath = os.path.join(thisDir, name)
                yield fullpath

def findlist(pattern, startdir=os.curdir, dosort=False):
    matches = list(find(pattern, startdir))
    if dosort: matches.sort()
    return matches

if __name__ == '__main__':
    import sys
    namepattern, startdir = sys.argv[1], sys.argv[2]
    for name in find(namepattern, startdir): print(name)

Forking processes

♦ Spawns (copies) a program

♦ Parent and child run independently

♦ fork spawns processes; system/popen spawn commands

♦ Not available on standard Windows Python today:

► use threads, spawnv, multiprocessing, subprocess, Cygwin (or coroutines?)
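As a hedged sketch of one such alternative: where os.fork is unavailable, the subprocess module (covered ahead) can launch a fresh interpreter instead; sys.executable gives the running Python's path:

```python
import subprocess, sys

# start a new Python interpreter as a child process and collect its output;
# the one-liner child program here is purely illustrative
out = subprocess.check_output([sys.executable, '-c', "print('child ran')"])
print(out.decode().strip())                # → child ran
```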

# starts programs until you type 'q'
import os

parm = 0
while True:
    parm += 1
    pid = os.fork()
    if pid == 0:                                             # copy process
        os.execlp('python', 'python', 'child.py', str(parm)) # overlay program
        assert False, 'error starting program'               # shouldn't return
    else:
        print('Child is', pid)                               # 2.X: print 'Child is', pid
        if input() == 'q': break                             # 2.X: raw_input

See other process examples ahead in IPC section

Python thread modules

♦ Runs function calls in parallel, sharing global (module) memory

♦ Portable: runs on Windows, Solaris, anything with pthreads

♦ Global interpreter lock: one thread running code at a time

♦ Thread switches on bytecode counter and long-running calls (later 3.X: on timeouts)

♦ Must still synchronize concurrent updates with thread locks

♦ C extensions release and acquire global lock too

import _thread as thread                      # 2.X: import thread

def counter(myId, count):
    # synchronize stdout access to avoid multiple prints on 1 line
    for i in range(count):
        mutex.acquire()
        print('[%s] => %s' % (myId, i))       # 2.X: print statement
        mutex.release()

mutex = thread.allocate_lock()
for i in range(10):
    thread.start_new_thread(counter, (i, 100))

import time
time.sleep(10)
print('Main thread exiting.')

Output

.

.

.

[3] => 98

[4] => 98

[5] => 98

[7] => 98

[8] => 98

[0] => 99

[9] => 98

[6] => 99

[1] => 99

[2] => 99

[3] => 99

[4] => 99

[5] => 99

[7] => 99

[8] => 99

[9] => 99

Main thread exiting.

Locking concurrent updaters

Fails:

# fails on Windows due to concurrent updates;
# works if check-interval set higher or lock
# acquire/release calls made around the adds

import _thread as thread, time     # 2.X: import thread

count = 0

def adder():
    global count
    count = count + 1              # update shared global
    count = count + 1              # thread swapped out before returns

for i in range(100):
    thread.start_new_thread(adder, ())    # start 100 update threads

time.sleep(5)
print(count)                       # 2.X: print count

Works:

import _thread as thread, time     # 2.X: import thread

mutex = thread.allocate_lock()
count = 0

def adder():
    global count
    mutex.acquire()
    count = count + 1              # update shared global
    count = count + 1              # thread swapped out before returns
    mutex.release()

for i in range(100):
    thread.start_new_thread(adder, ())    # start 100 update threads

time.sleep(5)
print(count)                       # 2.X: print count

Also Works:

thread lock context manager (auto acquire/release, like close for files)

import _thread as thread, time     # 2.X: import thread

mutex = thread.allocate_lock()
count = 0

def adder(lock):
    global count
    time.sleep(0.10)
    with lock:                     # auto acquire/release
        count = count + 1          # update shared global
        count = count + 1

for i in range(100):
    thread.start_new_thread(adder, (mutex,))    # start 100 update threads

time.sleep(11)
print(count)                       # 2.X: print count

See also:

the 'threading' module's class-based interface

the 'Queue' module's thread-safe queue get/put ('queue' in 3.X)

import threading

class mythread(threading.Thread):          # subclass Thread object
    def __init__(self, myId, count):
        self.myId  = myId
        self.count = count
        threading.Thread.__init__(self)

    def run(self):                         # run provides thread logic
        for i in range(self.count):        # still synch stdout access
            stdoutmutex.acquire()
            print('[%s] => %s' % (self.myId, i))    # 2.X: print statement
            stdoutmutex.release()

stdoutmutex = threading.Lock()             # same as _thread.allocate_lock()
thread = mythread(0, 100)                  # must pass myId and count
thread.start()
thread.join()                              # wait for exit

Coding Options for threading

import threading, _thread

def action(i):
    print(i ** 32)

# subclass with state
class Mythread(threading.Thread):
    def __init__(self, i):
        self.i = i
        threading.Thread.__init__(self)
    def run(self):                                        # redefine run for action
        print(self.i ** 32)

Mythread(2).start()                                       # start invokes run()

# pass action in
thread = threading.Thread(target=(lambda: action(2)))     # run invokes target
thread.start()

# same but no lambda wrapper for state
threading.Thread(target=action, args=(2,)).start()        # callable plus its args

# basic _thread module
_thread.start_new_thread(action, (2,))                    # all-function interface

Thread Queue

"producer and consumer threads communicating with a shared queue"

numconsumers = 2                 # how many consumers to start
numproducers = 4                 # how many producers to start
nummessages  = 4                 # messages per producer to put

import _thread as thread, queue, time
safeprint = thread.allocate_lock()    # else prints may overlap
dataQueue = queue.Queue()             # shared global, infinite size

def producer(idnum):
    for msgnum in range(nummessages):
        time.sleep(idnum)
        dataQueue.put('[producer id=%d, count=%d]' % (idnum, msgnum))

def consumer(idnum):
    while True:
        time.sleep(0.1)
        try:
            data = dataQueue.get(block=False)
        except queue.Empty:
            pass
        else:
            with safeprint:
                print('consumer', idnum, 'got =>', data)

if __name__ == '__main__':
    for i in range(numconsumers):
        thread.start_new_thread(consumer, (i,))
    for i in range(numproducers):
        thread.start_new_thread(producer, (i,))
    time.sleep(((numproducers-1) * nummessages) + 1)
    print('Main thread exit.')

C:\...\PP4E\System\Threads> queuetest.py

consumer 1 got => [producer id=0, count=0]

consumer 0 got => [producer id=0, count=1]

consumer 1 got => [producer id=0, count=2]

consumer 0 got => [producer id=0, count=3]

consumer 1 got => [producer id=1, count=0]

consumer 1 got => [producer id=2, count=0]

consumer 0 got => [producer id=1, count=1]

consumer 1 got => [producer id=3, count=0]

consumer 0 got => [producer id=1, count=2]

consumer 1 got => [producer id=2, count=1]

consumer 1 got => [producer id=1, count=3]

consumer 1 got => [producer id=3, count=1]

consumer 0 got => [producer id=2, count=2]

consumer 1 got => [producer id=2, count=3]

consumer 1 got => [producer id=3, count=2]

consumer 1 got => [producer id=3, count=3]

Main thread exit.

Other Examples: Queue module, on CD Extras\Code\pp3e

Newer modules: subprocess and multiprocessing

Shell commands: os.system, os.popen, subprocess

# testexit_sys.py
def later():
    import sys
    print('Bye sys world')
    sys.exit(42)
    print('Never reached')

if __name__ == '__main__': later()

Basic os module tools

C:\...\PP4E\System\Exits> python

>>> os.system('python testexit_sys.py')

Bye sys world

42

>>> pipe = os.popen('python testexit_sys.py')

>>> pipe.read()

'Bye sys world\n'

>>> pipe.close()

42

subprocess gives more control: 3 ways

C:\...\PP4E\System\Exits> python

>>> from subprocess import Popen, PIPE, call

>>> pipe = Popen('python testexit_sys.py', stdout=PIPE)

>>> pipe.stdout.read()

b'Bye sys world\r\n'

>>> pipe.wait()

42

>>> call('python testexit_sys.py')

Bye sys world

42

>>> pipe = Popen('python testexit_sys.py', stdout=PIPE)

>>> pipe.communicate()

(b'Bye sys world\r\n', None)

>>> pipe.returncode

42

Sending input

>>> pipe = Popen('python hello-in.py', stdin=PIPE)

>>> pipe.stdin.write(b'Pokey\n')

>>> pipe.stdin.close()

>>> pipe.wait()

0

Both: input and output

>>> pipe = Popen('python reader.py', stdin=PIPE, stdout=PIPE)

>>> pipe.stdin.write(b'Lumberjack\n')

>>> pipe.stdin.write(b'12\n')

>>> pipe.stdin.close()

>>> output = pipe.stdout.read()

>>> pipe.wait()

0

>>> output

b'Got this: "Lumberjack"\r\nThe meaning of life is 12 24\r\n'

Tying programs' streams together with pipes

>>> p1 = Popen('python writer.py', stdout=PIPE)

>>> p2 = Popen('python reader.py', stdin=p1.stdout, stdout=PIPE)

>>> output = p2.communicate()[0]

>>> output

b'Got this: "Help! Help! I\'m being repressed!"\r\nThe meaning of life is 42 84\r\n'

>>> p2.returncode

0

Multiprocessing: processes with threading API

+: Portability of threads + parallel performance of processes

-: Pickleability constraints (bound methods), not freely shared state

# Example 5-29. PP4E\System\Processes\multi1.py

"""

multiprocess basics: Process works like threading.Thread, but

runs function call in parallel in a process instead of a thread;

locks can be used to synchronize, e.g. prints on some platforms;

starts new interpreter on windows, forks a new process on unix;

"""

import os

from multiprocessing import Process, Lock

def whoami(label, lock):
    msg = '%s: name:%s, pid:%s'
    with lock:
        print(msg % (label, __name__, os.getpid()))

if __name__ == '__main__':
    lock = Lock()
    whoami('function call', lock)

    p = Process(target=whoami, args=('spawned child', lock))
    p.start()
    p.join()

    for i in range(5):
        Process(target=whoami, args=(('run process %s' % i), lock)).start()

    with lock:
        print('Main process exit.')

C:\...\PP4E\System\Processes> multi1.py

function call: name:__main__, pid:8752

spawned child: name:__main__, pid:9268

Main process exit.

run process 3: name:__main__, pid:9296

run process 1: name:__main__, pid:8792

run process 4: name:__main__, pid:2224

run process 2: name:__main__, pid:8716

run process 0: name:__main__, pid:6936

# Example 5-30. PP4E\System\Processes\multi2.py

"""

Use multiprocessing's anonymous pipes to communicate. Pipe() returns two
connection objects representing the ends of the pipe: objects are sent on
one end and received on the other, though pipes are bidirectional by default

"""

import os

from multiprocessing import Process, Pipe

def sender(pipe):
    """
    send object to parent on anonymous pipe
    """
    pipe.send(['spam'] + [42, 'eggs'])
    pipe.close()

def talker(pipe):
    """
    send and receive objects on a pipe
    """
    pipe.send(dict(name='Bob', spam=42))
    reply = pipe.recv()
    print('talker got:', reply)

if __name__ == '__main__':
    (parentEnd, childEnd) = Pipe()
    Process(target=sender, args=(childEnd,)).start()        # spawn child with pipe
    print('parent got:', parentEnd.recv())                  # receive from child
    parentEnd.close()                                       # or auto-closed on gc

    (parentEnd, childEnd) = Pipe()
    child = Process(target=talker, args=(childEnd,))
    child.start()
    print('parent got:', parentEnd.recv())                  # receive from child
    parentEnd.send({x * 2 for x in 'spam'})                 # send to child
    child.join()                                            # wait for child exit
    print('parent exit')

C:\...\PP4E\System\Processes> multi2.py

parent got: ['spam', 42, 'eggs']

parent got: {'name': 'Bob', 'spam': 42}

talker got: {'ss', 'aa', 'pp', 'mm'}

parent exit

# Example 5-32. PP4E\System\Processes\multi4.py

"""

Process class can also be subclassed just like threading.Thread;

Queue works like queue.Queue but for cross-process, not cross-thread

"""

import os, time, queue
from multiprocessing import Process, Queue    # process-safe shared queue
                                              # queue is a pipe + locks/semas

class Counter(Process):
    label = ' @'
    def __init__(self, start, queue):         # retain state for use in run
        self.state = start
        self.post  = queue
        Process.__init__(self)

    def run(self):                            # run in new process on start()
        for i in range(3):
            time.sleep(1)
            self.state += 1
            print(self.label, self.pid, self.state)   # self.pid is this child's pid
            self.post.put([self.pid, self.state])     # stdout file is shared by all
        print(self.label, self.pid, '-')

if __name__ == '__main__':
    print('start', os.getpid())
    expected = 9

    post = Queue()
    p = Counter(0, post)                      # start 3 processes sharing queue
    q = Counter(100, post)                    # children are producers
    r = Counter(1000, post)
    p.start(); q.start(); r.start()

    while expected:                           # parent consumes data on queue
        time.sleep(0.5)                       # this is essentially like a GUI,
        try:                                  # though GUIs often use threads
            data = post.get(block=False)
        except queue.Empty:
            print('no data...')
        else:
            print('posted:', data)
            expected -= 1

    p.join(); q.join(); r.join()              # must get before join putter
    print('finish', os.getpid(), r.exitcode)  # exitcode is child exit status

C:\...\PP4E\System\Processes> multi4.py

start 6296

no data...

no data...

 @ 8008 101
posted: [8008, 101]
 @ 6068 1
 @ 3760 1001
posted: [6068, 1]
 @ 8008 102
posted: [3760, 1001]
 @ 6068 2
 @ 3760 1002
posted: [8008, 102]
 @ 8008 103
 @ 8008 -
posted: [6068, 2]
 @ 6068 3
 @ 6068 -
 @ 3760 1003
 @ 3760 -
posted: [3760, 1002]
posted: [8008, 103]
posted: [6068, 3]
posted: [3760, 1003]

finish 6296 0

# Example 5-33. PP4E\System\Processes\multi5.py

"Use multiprocessing to start independent programs, os.fork or not"

import os

from multiprocessing import Process

def runprogram(arg):
    os.execlp('python', 'python', 'child.py', str(arg))

if __name__ == '__main__':
    for i in range(5):
        Process(target=runprogram, args=(i,)).start()
    print('parent exit')

IPC tools: pipes, sockets, and signals

Anonymous pipes

import os, time

def child(pipeout):
    zzz = 0
    while True:
        time.sleep(zzz)                           # make parent wait
        msg = ('Spam %03d' % zzz).encode()        # pipes are binary bytes
        os.write(pipeout, msg)                    # send to parent
        zzz = (zzz+1) % 5                         # goto 0 after 4

def parent():
    pipein, pipeout = os.pipe()                   # make 2-ended pipe
    if os.fork() == 0:                            # copy this process
        child(pipeout)                            # in copy, run child
    else:                                         # in parent, listen to pipe
        while True:
            line = os.read(pipein, 32)            # blocks until data sent
            print('Parent %d got [%s] at %s' % (os.getpid(), line, time.time()))

parent()

Example: cross-linking streams

♦ Input → connect stdin to another program's stdout: input (2.X raw_input)

♦ Output → connect stdout to another program's stdin: print

♦ Also see: os.popen2 call in Python 2.X

file: ipc.py

import os

def spawn(prog, args):
    pipe1 = os.pipe()          # (parent input, child output)
    pipe2 = os.pipe()          # (child input,  parent output)
    pid = os.fork()            # make a copy of this process
    if pid:
        # in parent process
        os.close(pipe1[1])     # close child ends here
        os.close(pipe2[0])
        os.dup2(pipe1[0], 0)   # sys.stdin  = pipe1[0]
        os.dup2(pipe2[1], 1)   # sys.stdout = pipe2[1]
    else:
        # in child process
        os.close(pipe1[0])     # close parent ends here
        os.close(pipe2[1])
        os.dup2(pipe2[0], 0)   # sys.stdin  = pipe2[0]
        os.dup2(pipe1[1], 1)   # sys.stdout = pipe1[1]
        cmd = (prog,) + args
        os.execv(prog, cmd)    # overlay new program

Named pipes (fifos)

"""

named pipes; os.mkfifo is not available on Windows (without Cygwin);
there is no reason to fork here, since fifo file pipes are external
to processes--shared fds in parent/child processes are irrelevant;

"""

import os, time, sys

fifoname = '/tmp/pipefifo'                       # must open same name

def child():
    pipeout = os.open(fifoname, os.O_WRONLY)     # open fifo pipe file as fd
    zzz = 0
    while True:
        time.sleep(zzz)
        msg = ('Spam %03d\n' % zzz).encode()     # binary as opened here
        os.write(pipeout, msg)
        zzz = (zzz+1) % 5

def parent():
    pipein = open(fifoname, 'r')                 # open fifo as text file object
    while True:
        line = pipein.readline()[:-1]            # blocks until data sent
        print('Parent %d got "%s" at %s' % (os.getpid(), line, time.time()))

if __name__ == '__main__':
    if not os.path.exists(fifoname):
        os.mkfifo(fifoname)                      # create a named pipe file
    if len(sys.argv) == 1:
        parent()                                 # run as parent if no args
    else:                                        # else run as child process
        child()

[C:\...\PP4E\System\Processes] $ python pipefifo.py���������� # parent window

Parent 8324 got "Spam 000" at 1268003696.07

Parent 8324 got "Spam 001" at 1268003697.06

Parent 8324 got "Spam 002" at 1268003699.07

Parent 8324 got "Spam 003" at 1268003702.08

Parent 8324 got "Spam 004" at 1268003706.09

Parent 8324 got "Spam 000" at 1268003706.09

Parent 8324 got "Spam 001" at 1268003707.11

...etc: Ctrl-C to exit...

[C:\...\PP4E\System\Processes]$ file /tmp/pipefifo����������� # child window

/tmp/pipefifo: fifo (named pipe)

[C:\...\PP4E\System\Processes]$ python pipefifo.py -child

...Ctrl-C to exit...

Sockets (a first look: see Internet unit)

"""

sockets for cross-task communication: start threads to communicate over sockets;

independent programs can too, because sockets are system-wide, much like fifos;

see the GUI and Internet parts of the book for more realistic socket use cases;

some socket servers may also need to talk to clients in threads or processes;

sockets pass byte strings, but can be pickled objects or encoded Unicode text;

caveat: prints in threads may need to be synchronized if their output overlaps;

"""

from socket import socket, AF_INET, SOCK_STREAM     # portable socket api

port = 50008                 # port number identifies socket on machine
host = 'localhost'           # server and client run on same local machine here

def server():
    sock = socket(AF_INET, SOCK_STREAM)         # ip addresses tcp connection
    sock.bind(('', port))                       # bind to port on this machine
    sock.listen(5)                              # allow up to 5 pending clients
    while True:
        conn, addr = sock.accept()              # wait for client to connect
        data = conn.recv(1024)                  # read bytes data from this client
        reply = 'server got: [%s]' % data       # conn is a new connected socket
        conn.send(reply.encode())               # send bytes reply back to client

def client(name):
    sock = socket(AF_INET, SOCK_STREAM)
    sock.connect((host, port))                  # connect to a socket port
    sock.send(name.encode())                    # send bytes data to listener
    reply = sock.recv(1024)                     # receive bytes data from listener
    sock.close()                                # up to 1024 bytes in message
    print('client got: [%s]' % reply)

if __name__ == '__main__':
    from threading import Thread
    sthread = Thread(target=server)
    sthread.daemon = True                       # don't wait for server thread
    sthread.start()                             # do wait for children to exit
    for i in range(5):
        Thread(target=client, args=('client%s' % i,)).start()

C:\...\PP4E\System\Processes> socket_preview.py

client got: [b"server got: [b'client1']"]

client got: [b"server got: [b'client3']"]

client got: [b"server got: [b'client4']"]

client got: [b"server got: [b'client2']"]

client got: [b"server got: [b'client0']"]

"""

same socket, but talk between independent programs too, not just threads;

server here runs in a process and serves both process and thread clients;

sockets are machine-global, much like fifos: don't require shared memory

"""

from socket_preview import server, client        # both use same port number
import sys, os
from threading import Thread

mode = int(sys.argv[1])
if mode == 1:                                    # run server in this process
    server()
elif mode == 2:                                  # run client in this process
    client('client:process=%s' % os.getpid())
else:                                            # run 5 client threads in process
    for i in range(5):
        Thread(target=client, args=('client:thread=%s' % i,)).start()

C:\...\PP4E\System\Processes> socket-preview-progs.py 1���� # server window

C:\...\PP4E\System\Processes> socket-preview-progs.py 2���� # client window

client got: [b"server got: [b'client:process=7384']"]

C:\...\PP4E\System\Processes> socket-preview-progs.py 2

client got: [b"server got: [b'client:process=7604']"]

C:\...\PP4E\System\Processes> socket-preview-progs.py 3

client got: [b"server got: [b'client:thread=1']"]

client got: [b"server got: [b'client:thread=2']"]

client got: [b"server got: [b'client:thread=0']"]

client got: [b"server got: [b'client:thread=3']"]

client got: [b"server got: [b'client:thread=4']"]

C:\..\PP4E\System\Processes> socket-preview-progs.py 3

client got: [b"server got: [b'client:thread=3']"]

client got: [b"server got: [b'client:thread=1']"]

client got: [b"server got: [b'client:thread=2']"]

client got: [b"server got: [b'client:thread=4']"]

client got: [b"server got: [b'client:thread=0']"]

C:\...\PP4E\System\Processes> socket-preview-progs.py 2

client got: [b"server got: [b'client:process=6428']"]

Signals

"""

catch signals in Python; pass signal number N as a command-line arg,
use a "kill -N pid" shell command to send this process a signal; most
signal handlers restored by Python after caught (see network scripting
chapter for SIGCHLD details); on Windows, signal module is available,
but it defines only a few signal types there, and os.kill is missing;

"""

import sys, signal, time

def now(): return time.asctime()                 # current time string

def onSignal(signum, stackframe):                # python signal handler
    print('Got signal', signum, 'at', now())     # most handlers stay in effect

signum = int(sys.argv[1])
signal.signal(signum, onSignal)                  # install signal handler
while True: signal.pause()                       # wait for signals (or: pass)
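Because the script above blocks waiting for an external kill, the handler is easier to demonstrate by having the process signal itself with os.kill; a minimal sketch, POSIX-only (os.kill with signals and SIGUSR1 are unavailable on standard Windows Python, as the docstring notes):

```python
import os, signal, time

caught = []                                  # record signals as they arrive

def onSignal(signum, stackframe):            # python signal handler
    caught.append(signum)

signal.signal(signal.SIGUSR1, onSignal)      # install handler
os.kill(os.getpid(), signal.SIGUSR1)         # send ourselves the signal
time.sleep(0.1)                              # let the handler run
print(caught == [signal.SIGUSR1])            # True
```

SIGUSR1 is just an arbitrary user-defined signal here; any catchable signal number works the same way.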

Fork versus spawnv

♦ For starting programs on Windows

♦ Spawnv like fork+exec for Unix

♦ See also: os.system('start file.py')

############################################################
# do something similar by forking processes instead of
# threads; this doesn't currently work on Windows, because
# it has no os.fork call; use os.spawnv to start programs
# on Windows instead; spawnv is roughly like a fork+exec
# combination;
############################################################

import os, sys

for i in range(10):
    if sys.platform[:3] == 'win':
        path = r'C:\program files\python\python.exe'
        os.spawnv(os.P_DETACH, path,
                  ('python', 'thread-basics6.py'))
    else:
        pid = os.fork()
        if pid != 0:
            print('Process %d spawned' % pid)
        else:
            os.execlp('python', 'python', 'thread-basics6.py')

print('Main process exiting.')
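On current Pythons, the usual portable way to launch a program, in place of both the fork/exec combination and os.spawnv, is the subprocess module; a minimal sketch (mine, not part of the original example set):

```python
import subprocess, sys

# sys.executable avoids hardcoding an interpreter path,
# and the same call works on Windows and Unix alike
proc = subprocess.Popen([sys.executable, '-c', 'print("child ran")'],
                        stdout=subprocess.PIPE)
out, _ = proc.communicate()                 # wait and collect child's output
print(out)                                  # b'child ran\n' (b'child ran\r\n' on Windows)
```

communicate() both waits for the child and returns its captured streams, so there is no zombie-reaping bookkeeping as with raw fork.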

site-forward.py

#######################################################

# Create forward link pages for relocating a web site.

# Generates one page for every existing site file;

# upload the generated files to your old web site.

#######################################################

import os

uploaddir    = 'rmi-forward'             # where to store forward files
servername   = 'starship.python.net'     # where site is relocating to
homedir      = '~lutz/home'              # where site will be rooted
sitefilesdir = 'public_html'             # where site files live locally
templatename = 'template.html'           # template for generated pages

template  = open(templatename).read()
sitefiles = os.listdir(sitefilesdir)     # filenames, no dir prefix

count = 0
for filename in sitefiles:
    fwdname = os.path.join(uploaddir, filename)    # or +os.sep+filename
    print('creating', filename, 'as', fwdname)
    filetext = template.replace('$server$', servername)   # str methods replace
    filetext = filetext.replace('$home$',   homedir)      # the old string module
    filetext = filetext.replace('$file$',   filename)
    open(fwdname, 'w').write(filetext)
    count += 1

print('Last file =>\n', filetext)
print('Done:', count, 'forward files created.')

template.html

This page has moved

This page now lives at this address:

http://$server$/$home$/$file$

Please click on the new address to jump to this page, and

update any links accordingly.

Optional examples: packing/unpacking text files

pack1   - puts files in a single file, with separator lines

unpack1 - recreates original files from a "pack1" file

unmore  - unpacks result of a "more" command

file: pack1.py

#!/usr/local/bin/python

import sys                       # load the system module
marker = ':' * 6

for name in sys.argv[1:]:        # for all command arguments
    input = open(name, 'r')      # open the next input file
    print(marker + name)         # write a separator line
    print(input.read(), end='')  # write the file's contents

file: unpack1.py

#!/usr/local/bin/python

import sys
marker = ':' * 6

for line in sys.stdin:                   # for all input lines (sys.stdin is iterable, not callable)
    if line[:6] != marker:
        print(line, end='')              # write real lines
    else:
        sys.stdout = open(line[6:-1], 'w')
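The pack/unpack round trip can also be sketched over in-memory strings, which sidesteps the sys.stdout reassignment trick and makes the result easy to verify; the names here are my own, and like the scripts, it breaks if a content line itself starts with the marker:

```python
marker = ':' * 6

def pack(files):
    # files: dict mapping name -> text; returns one packed string
    return ''.join(marker + name + '\n' + text
                   for name, text in files.items())

def unpack(packed):
    files, name = {}, None
    for line in packed.splitlines(keepends=True):
        if line.startswith(marker):                # separator line
            name = line[len(marker):].rstrip('\n')
            files[name] = ''
        else:                                      # content line
            files[name] += line
    return files

data = {'a.txt': 'one\n', 'b.txt': 'two\nthree\n'}
print(unpack(pack(data)) == data)                  # True
```

splitlines(keepends=True) preserves the end-of-line characters, so the round trip reproduces the file text byte for byte.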

Unmore: scripts, functions, classes

► "more" writes 3 lines before each file

::::::::::::::

filename

::::::::::::::

file: unmore.py

#!/usr/local/bin/python
# unpack result of "more x y z > f"
# usage: "% unmore.py f" or "% unmore.py < f"
# uses simple top-level script logic

import sys
marker = ':' * 14

try:
    input = open(sys.argv[1], "r")
except:
    input = sys.stdin
output = sys.stdout

while True:
    line = input.readline()
    if not line:                         # end of file?
        break
    elif line[:14] != marker:            # text line?
        output.write(line)
    else:                                # file prefix
        fname = input.readline()[:-1]    # strip eoln ('\n')
        print('creating', repr(fname))
        output = open(fname, "w")        # next output
        line = input.readline()          # end of prefix
        if line[:14] != marker:
            print("OOPS!"); sys.exit(1)

print('Done.')

Adding a functional interface

file: unmore2.py

#!/usr/local/bin/python
# unpack result of "more x y z > f"
# usage: "unmore2.py f" or "unmore2.py < f"
# packages unpacking logic as an importable function

import sys
marker = ':' * 14

def unmore(input):
    output = sys.stdout
    while True:
        line = input.readline()
        if not line:                           # end of file?
            break
        elif line[:14] != marker:              # text line?
            output.write(line)
        else:                                  # file prefix
            fname = input.readline()[:-1]      # strip eoln
            print('creating', repr(fname))
            output = open(fname, "w")          # next output
            if input.readline()[:14] != marker:
                print("OOPS!"); sys.exit(1)

if __name__ == '__main__':
    if len(sys.argv) == 1:
        unmore(sys.stdin)                      # unmore2.py < f
    else:
        unmore(open(sys.argv[1], 'r'))         # unmore2.py f
    print('Done.')

% unmore2.py t.more

creating 't1.txt'

creating 't2.txt'

Done.

% python

>>> from unmore2 import unmore

>>> unmore(open("t.more", "r"))

creating 't1.txt'

creating 't2.txt'
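The slide title also promises a class-based version; one possible shape is sketched below (my own sketch, not the course's code: this `Unmore` class collects unpacked files in a dictionary instead of writing them, which makes the logic importable and easy to test):

```python
import io

class Unmore:
    marker = ':' * 14

    def __init__(self):
        self.files = {}                        # name -> accumulated text

    def unpack(self, stream):
        current = None
        while True:
            line = stream.readline()
            if not line:                       # end of file?
                break
            elif not line.startswith(self.marker):
                if current is not None:        # text line?
                    self.files[current] += line
            else:                              # file prefix
                current = stream.readline().rstrip('\n')
                self.files[current] = ''
                stream.readline()              # consume closing marker
        return self.files

packed = ':' * 14 + '\nt1.txt\n' + ':' * 14 + '\nhello\n'
print(Unmore().unpack(io.StringIO(packed)))    # {'t1.txt': 'hello\n'}
```

Accepting any file-like stream (here an io.StringIO) keeps the class usable from scripts, tests, and the interactive prompt alike.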

Optional: supplemental examples

These are suggested reading if you are looking for something to do during the lab session.

File: fixeoln_one.py

#########################################################
# Use: "python fixeoln_one.py [tounix|todos] filename".
# Convert end-lines in the single text file whose name
# is passed in on the command line, to the target form
# (unix or dos).  The _one, _dir, and _all converters
# reuse the convert function here; we could implement
# this by inspecting command-line argument patterns
# instead of writing 3 separate scripts, but that can
# become complex for the user.  convertEndlines changes
# endlines only if necessary--lines that are already in
# the target format are left unchanged, so it's okay to
# convert a file > once in any of the 3 fixeoln scripts.
#########################################################

def convertEndlines(format, fname):                        # convert one file
    newlines = []                                          # todos:  \n   => \r\n
    for line in open(fname, 'r', newline='').readlines():  # tounix: \r\n => \n
        if format == 'todos':                              # newline='' disables
            if line[-1:] == '\n' and line[-2:-1] != '\r':  # 3.X text-mode
                line = line[:-1] + '\r\n'                  # newline translation
        elif format == 'tounix':                           # avoids IndexError
            if line[-2:] == '\r\n':                        # slices are scaled
                line = line[:-2] + '\n'
        newlines.append(line)
    open(fname, 'w', newline='').writelines(newlines)

if __name__ == '__main__':
    import sys
    errmsg = 'Required arguments missing: ["todos"|"tounix"] filename'
    assert (len(sys.argv) == 3 and sys.argv[1] in ['todos', 'tounix']), errmsg
    convertEndlines(sys.argv[1], sys.argv[2])
    print('Converted', sys.argv[2])
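The same conversion can also be sketched over bytes, which keeps text-mode newline translation out of the picture entirely; normalizing to \n first makes the function idempotent, like the original (this is my sketch, not from the course materials):

```python
def convert_endlines(format, data):
    # operate on bytes so text-mode newline handling can't interfere
    data = data.replace(b'\r\n', b'\n')          # normalize to unix first
    if format == 'todos':
        data = data.replace(b'\n', b'\r\n')      # then expand if requested
    return data

print(convert_endlines('tounix', b'a\r\nb\n'))   # b'a\nb\n'
print(convert_endlines('todos',  b'a\r\nb\n'))   # b'a\r\nb\r\n'
```

Because already-converted input passes through unchanged, it is safe to run over a file more than once, matching the scripts' guarantee.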

File: fixeoln_dir.py

#########################################################

# Use: "python fixeoln_dir.py [tounix|todos] patterns?".

# convert end-lines in all the text files in the current

# directory (only: does not recurse to subdirectories).

# Reuses converter in the single-file _one version.

#########################################################

import sys, glob
from fixeoln_one import convertEndlines

listonly = 0
patts = ['*.py', '*.txt', '*.c', '*.cxx', '*.h', '*.i', 'makefile*', 'output*']

if __name__ == '__main__':
    errmsg = 'Required first argument missing: "todos" or "tounix"'
    assert (len(sys.argv) >= 2 and sys.argv[1] in ['todos', 'tounix']), errmsg
    if len(sys.argv) > 2:                 # glob anyhow: '*' not applied on dos
        patts = sys.argv[2:]              # though not really needed on linux
    filelists = map(glob.glob, patts)     # name matches in this dir only
    count = 0
    for list in filelists:
        for fname in list:
            print(count + 1, '=>', fname)
            if not listonly:
                convertEndlines(sys.argv[1], fname)
            count += 1
    print('Converted %d files' % count)

File: fixeoln_all.py

#########################################################

# Use: "python fixeoln_all.py [tounix|todos] patterns?".

# find and convert end-of-lines in all text files

# at and below the directory where this script is

# run (the dir you are in when you type 'python').

# If needed, tries to use the Python find.py lib

# module, else reads the output of a unix-style

# find executable command; we could also use a

# find -exec option to spawn a converter script.

# Uses default filename patterns list if absent.

# Example:

#��� cd Html\Examples

#��� python ..\..\Tools\fixeoln_all.py tounix

# converts any DOS end-lines to UNIX end-lines, in

# all text files in and below the Examples directory

# (i.e., all source-code files).� Replace "tounix" with

# "todos" to convert any UNIX end-lines to DOS form

# instead.� This script only changes files that need to

# be changed, so it's safe to run brute-force from a

# root-level directory to force platform conformance.

# "python ..\..\Tools\fixeoln_all.py tounix *.txt"

# converts just .txt files (quote on UNIX: "*.txt").

# See also: fixeoln_one and fixeoln_dir versions.

#########################################################

import os, sys

debug    = 0
pyfind   = 0       # force py find
listonly = 0

def findFiles(patts, debug=debug, pyfind=pyfind):
    try:
        if sys.platform[:3] == 'win' or pyfind:
            print('Using Python find')
            import find                                      # use python lib find.py
            matches = map(find.find, patts)                  # start dir default = '.'
        else:
            print('Using find executable')
            matches = []
            for patt in patts:
                findcmd = 'find . -name "%s" -print' % patt  # run find command
                lines = os.popen(findcmd).readlines()        # remove endlines
                matches.append([line.strip() for line in lines])
    except:
        assert 0, 'Sorry - cannot find files'
    if debug: print(matches)
    return matches

if __name__ == '__main__':
    from fixeoln_dir import patts
    from fixeoln_one import convertEndlines

    errmsg = 'Required first argument missing: "todos" or "tounix"'
    assert (len(sys.argv) >= 2 and sys.argv[1] in ['todos', 'tounix']), errmsg
    if len(sys.argv) > 2:                  # quote in unix shell
        patts = sys.argv[2:]               # else tries to expand
    matches = findFiles(patts)

    count = 0
    for matchlist in matches:              # a list of lists
        for fname in matchlist:            # one per pattern
            print(count + 1, '=>', fname)
            if not listonly:
                convertEndlines(sys.argv[1], fname)
            count += 1
    print('Converted %d files' % count)
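The find.py library module relied on above was removed from the standard library long ago; on modern Pythons, os.walk plus fnmatch gives a portable findFiles with no shell command at all. A sketch under that assumption (the name find_files is mine):

```python
import os, fnmatch

def find_files(patts, startdir='.'):
    # returns one result list per pattern, like findFiles above
    matches = []
    for patt in patts:
        hits = []
        for dirpath, dirnames, filenames in os.walk(startdir):
            for name in fnmatch.filter(filenames, patt):   # shell-style match
                hits.append(os.path.join(dirpath, name))
        matches.append(hits)
    return matches
```

Call it with the same patterns list the scripts use, e.g. find_files(['*.py', '*.txt']); os.walk visits every subdirectory, so this matches the recursive behavior of the find executable.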

Lab Session 8

Click here to go to lab exercises

Click here to go to exercise solutions

Click here to go to solution source files

Click here to go to lecture example files