Python programming language

Python is an interpreted, interactive programming language created by Guido van Rossum, originally as a scripting language for Amoeba OS capable of making system calls. Python is often compared to Tcl, Perl, Scheme, Java and Ruby. Python is currently (December 2003) at version 2.3.3.

Philosophy

Python is a multi-paradigm language, like Perl and unlike Smalltalk or Haskell. This means that, rather than forcing coders to adopt one particular style of coding, it permits several. Object orientation, structured programming, functional programming, and more recently, design by contract are all supported. Python is dynamically type-checked and uses garbage collection for memory management. The term "agile programming language" has recently come into vogue, applied mostly to Python.

Popularized explicitly in contrast to Perl, Python has many similarities to that language. However, Python's designers reject Perl's exuberant syntax in favor of a more spare, less cluttered one. As with Perl, Python's developers expressly promote a particular "culture" or ideology based on what they want the language to be, favoring language forms they see as "beautiful", "explicit" and "simple". For the most part, Perl and Python users differ in their interpretation of these terms and how they are best implemented.

Another important goal of the Python developers is to make using Python fun. This is reflected in the origin of the name (after the television series Monty Python's Flying Circus), in the common practice of using Monty Python references in example code, and a generally not over-serious viewpoint adopted in many Python tutorials and reference materials.

Despite these populist goals, and although, again as with Perl, Python is sometimes classed as a "scripting language", it has been used to develop many large software projects such as the Zope application server and the Mnet file sharing system. It is also used extensively at Google. Although Python does use scripts, proponents prefer to call it an interpreted language, on the grounds that "scripting language" tends to imply things like shell scripts or JavaScript: much simpler and, for most purposes, less capable than "real" programming languages such as Python.

Another important goal of the language is ease of extensibility. New built-in modules are easily written in C or C++. Python is also usable as an extension language for existing modules and applications that need a programmable interface.

Though the designer of Python is somewhat hostile to functional programming and the Lisp tradition, there are significant parallels between the philosophy of Python and that of minimalist Lisp-family languages such as Scheme. Many past Lisp programmers have found Python appealing for this reason.

Data types and structures

Python has a broad range of basic data types. Alongside conventional integer and floating point arithmetic, it transparently supports arbitrarily large integers and complex numbers.
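As a small illustration of the numeric types described above, the following sketch shows an integer growing past machine word size and a complex product. It is written in current Python 3 syntax, which is an assumption on our part since the article describes Python 2.3; the variable names are ours.

```python
# Sketch in Python 3 syntax (an assumption; the article describes Python 2.3).
big = 2 ** 100            # integers grow beyond machine word size automatically
z = (1 + 2j) * (3 - 1j)   # complex literals use the j suffix

print(big)   # 1267650600228229401496703205376
print(z)     # (5+5j)
```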

It supports the usual panoply of string operations, with one exception: strings in Python are immutable objects, so any string operation that might elsewhere alter a string (such as a substitution of characters) will in Python instead return a new string.
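A minimal sketch of this behaviour, in Python 3 syntax (an assumption, since the article describes Python 2.3) and with illustrative variable names:

```python
original = "spam"
replaced = original.replace("s", "S")   # returns a new string; original is untouched

# Assigning to a character position is rejected, because strings are immutable.
try:
    original[0] = "S"
    mutated = True
except TypeError:
    mutated = False

print(original, replaced, mutated)   # spam Spam False
```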

Python values, not variables, carry type—meaning that Python is a dynamically typed language, like Lisp and unlike Java or C. All values are passed by reference, not by value.
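The two points above can be sketched as follows. The names are hypothetical and the syntax is Python 3 (an assumption; the article describes Python 2.3). The same name is rebound to values of different types, and a function that receives a list sees the caller's own object rather than a copy.

```python
value = 42
first_type = type(value).__name__      # the value, not the name, is an int
value = "forty-two"
second_type = type(value).__name__     # the same name now refers to a str

def add_item(items):
    # No copy is made: this mutates the list object the caller passed in.
    items.append("eggs")

shopping = ["spam"]
add_item(shopping)

print(first_type, second_type, shopping)   # int str ['spam', 'eggs']
```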

Among dynamically typed languages, Python is moderately type-checked. It is neither as loose as Perl nor as strict as Caml. Implicit conversion is defined for numeric types, so one may validly multiply a complex number by a long integer (for instance) without explicit casting. However, there is no implicit conversion between (e.g.) numbers and strings; unlike in Perl, a number is an invalid argument to a string operation.
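The following sketch illustrates both halves of that rule, using Python 3 syntax (an assumption; the article describes Python 2.3, where the large integer would be a "long"). Numeric types mix freely, while combining a string with a number requires an explicit conversion.

```python
# A complex number times a large integer: the integer is converted implicitly.
product = (1 + 2j) * 10 ** 20

# A number is not a valid operand for string concatenation.
try:
    message = "total: " + 3
except TypeError:
    message = "total: " + str(3)   # the conversion must be explicit

print(type(product).__name__, message)   # complex total: 3
```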

Collection types

Python also has several collection types, including lists, tuples, and dictionaries. Lists, tuples, and strings are sequences and share most of their methods in common: one can iterate over the characters of a string as easily as the elements of a list. Lists are extensible arrays, whereas tuples are of fixed length and immutable.

The purpose of all this immutability becomes apparent with dictionaries, a type known elsewhere as hashes, associative arrays, or maps. To preserve consistency under pass-by-reference, the keys of a dictionary must be of immutable type. Dictionary values, on the other hand, may be of any type.
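A short sketch of the collection types discussed above, in Python 3 syntax (an assumption; the article describes Python 2.3) with made-up names. A tuple works as a dictionary key, while a list is rejected.

```python
menu = ["spam", "eggs"]     # list: an extensible array
menu.append("bacon")
point = (3, 4)              # tuple: fixed length, immutable

prices = {point: "tuple key", "spam": 1.5}   # values may be of any type

# A mutable list cannot serve as a key (Python reports it as unhashable).
try:
    prices[menu] = "list key"
    list_key_ok = True
except TypeError:
    list_key_ok = False

# Strings are sequences too, so they can be iterated like a list.
letters = [ch for ch in "spam"]

print(list_key_ok, letters)   # False ['s', 'p', 'a', 'm']
```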

Object system

The Python type system is well integrated with the class system. Although the built-in data types are not precisely classes, a class can inherit from a type. Thus it is possible to extend strings or dictionaries ... or even integers, should you care to do such a thing. Python also supports multiple inheritance.

The language supports extensive introspection of types and classes. Types can be read and compared—indeed, as in Smalltalk, types are a type. The attributes of an object can be extracted as a dictionary.

Operators can be overloaded in Python by defining special member functions—for instance, defining __add__ on a class permits one to use the + operator on members of that class. (Compare C++'s operator+ and similar method names.)
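The features of the object system described in this section can be sketched together. The classes Money and Shouting below are hypothetical, invented only for illustration, and the syntax is Python 3 (an assumption; the article describes Python 2.3).

```python
class Money(object):
    """Hypothetical class showing operator overloading via __add__."""
    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        return Money(self.cents + other.cents)


class Shouting(str):
    """Hypothetical class that inherits from the built-in string type."""
    def shout(self):
        return self.upper() + "!"


total = Money(150) + Money(275)     # the + operator dispatches to __add__
attrs = vars(total)                 # attributes extracted as a dictionary
meta = type(type(total))            # the type of a type is itself a type

print(total.cents, attrs, Shouting("spam").shout())   # 425 {'cents': 425} SPAM!
```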

Syntax

Python was designed to be highly readable. It has a simple visual layout, uses English keywords frequently where other languages use punctuation, and has notably fewer syntactic constructions than many structured languages such as C, Perl, or Pascal.

For instance, Python has only two structured loop forms—for, which loops over elements of a list or iterator (like Perl foreach); and while, which loops as long as a boolean expression is true. It thus lacks C-style complex for, a do...while, and Perl's until, though of course equivalents can be expressed. Likewise, it has only if...elif...else for branching—no switch or labeled goto.
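As a sketch of these constructs (names are illustrative, and the syntax is Python 3, an assumption given that the article describes Python 2.3), the following shows if...elif...else, a for loop over a list, and one common way to express a do...while loop using while.

```python
def classify(n):
    # Branching uses if...elif...else; there is no switch statement.
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    else:
        return "positive"

labels = []
for n in [-1, 0, 7]:          # for iterates over the elements of a list
    labels.append(classify(n))

# One way to emulate do...while: loop forever, test at the bottom.
countdown = []
i = 3
while True:
    countdown.append(i)
    i -= 1
    if i <= 0:
        break

print(labels, countdown)   # ['negative', 'zero', 'positive'] [3, 2, 1]
```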

Syntactical significance of whitespace

One unusual aspect of Python's syntax is the method used to delimit program blocks. Sometimes termed "the whitespace thing", it is one aspect of Python syntax that many programmers otherwise unfamiliar with Python have heard of, since it is unique among currently widespread languages.

In languages that use the block structure ultimately derived from Algol—including Pascal, C, Perl, and many others—blocks of code are set off with braces ({ }) or keywords such as Pascal's begin and end. In all these languages, however, programmers conventionally indent the code within a block, to set it off visually from the surrounding code.

Python, instead, borrows a feature from the lesser-known language ABC—instead of punctuation or keywords, it uses this indentation itself to indicate the run of a block. A brief example will make this clear. Here are C and Python recursive functions which do the same thing—computing the factorial of an integer:

Factorial function in C:

int factorial(int x) {
    if (x == 0) {
        return(1);
    } else {
        return(x * factorial(x-1));
    }
}

Factorial function in Python:

def factorial(x):
    if x == 0:
        return 1
    else:
        return x * factorial(x-1)

Some programmers used to Algol-style languages, in which whitespace is semantically empty, at first find this confusing or even offensive. A few have drawn unflattering comparison to the column-oriented style used on punched-card Fortran systems. When Algol was new, it was a major development to have "free-form" languages in which only symbols mattered and not their position on the line.

To Python programmers, however, "the whitespace thing" is simply an extrapolation of a convention that programmers in Algol-style languages already follow anyway. They also point out that the free-form syntax has the disadvantage that, since indentation is ignored, good indentation cannot be enforced. Thus, incorrectly indented code may be misleading, since a human reader and a compiler will interpret it differently.

Functional programming

As mentioned above, another strength of Python is the availability of functional syntax elements. As may be expected, these make working with lists and other collections much more straightforward. One such construction is the list comprehension, introduced from the functional language Haskell, as seen here in calculating the first five powers of two:

numbers = [1, 2, 3, 4, 5]
powers_of_two = [2**n for n in numbers]

Because Python permits functions as arguments, it is also possible to express more subtle functional constructs, such as the continuation.
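One way to read the remark about continuations is continuation-passing style, in which each function receives an extra argument naming "what to do next". The sketch below is our own illustration, not taken from the article; the function names are hypothetical and the syntax is Python 3.

```python
def add_cps(a, b, k):
    # Instead of returning the sum, hand it to the continuation k.
    return k(a + b)

def square_cps(n, k):
    return k(n * n)

# Compute (2 + 3) squared by chaining continuations.
result = add_cps(2, 3, lambda s: square_cps(s, lambda sq: sq))

print(result)   # 25
```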

Lambda

Python's lambda keyword may misdirect some functional-programming fans. Python lambda blocks may contain only expressions, not statements. Thus, they are not the most general way to return a function for use in higher-order functions. Instead, the usual practice is to define and return a function using a locally-scoped name, as in the following example of a simple curried function:

def add_and_print_maker(x):
    def temp(y):
        print "%d + %d = %d" % (x, y, x+y)
    return temp

The function can also be implemented with nested lambdas, as would be done in Scheme. To do this requires working around the Python lambda's limitation, by defining a function to encapsulate the print statement:

def print_func(obj):
    print obj

add_and_print_maker = \
    lambda(x): lambda(y): \
        print_func("%d + %d = %d" % (x, y, x+y))

The resulting add_and_print_maker functions perform identically: given a number x they return a function which, when given a number y, will print a sentence of arithmetic. Although the first style may be more common, the second can be more clear to programmers with a functional-programming background.

Python's distinctive treatment of the binary boolean operators and and or creates another functional feature. Using those two operators, any type of control flow can be implemented within lambda expressions [1]. They are usually used for simpler purposes, however. See the section "Logical operators" below.

Generators

Introduced in Python 2.2 as an optional feature and finalized in version 2.3, generators are Python's mechanism for lazy evaluation of a function that would otherwise return a long or computationally intensive list. The uses of generators are similar to the uses of Scheme streams.

One example from the python.org website:

def generate_ints(N):
    for i in range(N):
        yield i

You can now use the generator generate_ints:

for i in generate_ints(N):
    print i

Note that the variable N should be defined before executing the second piece of code.

The definition of a generator appears identical to that of a function, except the keyword yield is used in place of return. However, a generator is an object with persistent state, which can repeatedly enter and leave the same dynamic extent. A generator call can then be used in place of a list, or other structure whose elements will be iterated over. Whenever the for loop in the example requires the next item, the generator is called, and yields the next item.
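The persistent state can be made visible by stepping a generator by hand. This sketch uses a hypothetical countdown generator and Python 3 syntax; the built-in next() used here is an assumption about the reader's Python version (in the Python 2.3 the article describes, the equivalent call was the generator's own next method).

```python
def countdown(n):
    # Execution pauses at each yield and resumes there on the next request.
    while n > 0:
        yield n
        n -= 1

gen = countdown(3)
first = next(gen)     # runs until the first yield
second = next(gen)    # resumes after the yield, with n remembered
rest = list(gen)      # drains whatever is left

print(first, second, rest)   # 3 2 [1]
```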

Logical operators

In Python, the expressions "", 0, None, [], {}, etc. are false, and everything else is true. The binary boolean operators and and or are written between their two operands. For example, to test whether both x == 5 and 3 are true, one writes "x == 5 and 3". To evaluate this (assuming x is 5), the interpreter first evaluates x == 5. If that were false, its false value would be returned immediately and 3 would never be examined; since it is true, the interpreter goes on to the second operand, 3, and returns it, whatever its truth value. With the operands reversed, "3 and x == 5" returns the result of x == 5, a true value, because 3 is true. The or operator works similarly, but stops at the first true operand. To evaluate "2/3 or 5", the interpreter first evaluates 2/3. Under integer division this is 0, which is false, so the interpreter moves on to the second operand and returns 5. Had the first operand been true, its value would have been returned without evaluating the second. It is common in Python to write expressions such as print p or q to take advantage of this feature.
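The behaviour of and and or can be checked directly. The sketch below uses Python 3 syntax and invented names (both are assumptions; the article describes Python 2.3). It shows that the operators return one of their operands and that the second operand is skipped when the first already decides the result.

```python
x = 5
a = (x == 5) and 3        # first operand is true, so the second is returned
b = 3 and (x == 5)        # returns the result of the comparison
c = 0 or 5                # first operand is false, so the second is returned
d = "" or "default"       # a common idiom for supplying a fallback value

calls = []
def noisy():
    calls.append("called")
    return True

e = False and noisy()     # noisy() is never called: and short-circuits

print(a, b, c, d, e, calls)   # 3 True 5 default False []
```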

Object-oriented programming

Python has inheritance, including multiple inheritance. It has limited support for private variables using name mangling. See the "Classes" section of the tutorial for details. Many Python users don't feel the need for private variables, though. The slogan "We're all consenting adults here" is used to describe this attitude. Some consider information hiding to be unpythonic, in that it suggests that the class in question contains unaesthetic or ill-planned internals.

From the tutorial: As is true for modules, classes in Python do not put an absolute barrier between definition and user, but rather rely on the politeness of the user not to "break into the definition."
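Name mangling can be sketched as follows, using a hypothetical Account class and Python 3 syntax (an assumption; the article describes Python 2.3). An attribute whose name begins with two underscores is stored under a name prefixed with the class name, which discourages but does not prevent outside access.

```python
class Account(object):
    def __init__(self):
        self.__balance = 100    # stored as _Account__balance

acct = Account()

# The unmangled name is not visible from outside the class body.
try:
    acct.__balance
    hidden = False
except AttributeError:
    hidden = True

# A determined ("impolite") user can still reach the attribute.
mangled = acct._Account__balance

print(hidden, mangled)   # True 100
```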

OOP doctrines such as the use of accessor methods to read data members are not enforced in Python. Just as Python offers functional-programming constructs but does not attempt to demand referential transparency (in contrast with Haskell), it offers (and extensively uses!) its object system but does not demand OOP behavior (in contrast with Java or Smalltalk).

In version 2.2 of Python, "new-style" classes were introduced. With new-style classes, objects and types were unified, allowing the subclassing of types. Entirely new types can even be defined, complete with custom behavior for infix operators. This allows for many radical things to be done syntactically within Python, such as the ability to use C++-style input and output. A new multiple inheritance model was adopted with new-style classes, making a much more logical order of inheritance, adopted from Common Lisp. The property built-in and the __getattribute__ method were also introduced; alongside the existing __setattr__, they assist with the handling of attributes.
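Two of the features mentioned above can be sketched briefly. Both classes are hypothetical, written for illustration only, and the syntax is Python 3 (an assumption; the article describes Python 2.2 and 2.3). OutStream overloads the << operator to imitate C++-style output, and Celsius uses property to compute an attribute on access.

```python
import io

class OutStream(object):
    """Hypothetical wrapper imitating C++-style output with <<."""
    def __init__(self, target):
        self.target = target

    def __lshift__(self, value):
        self.target.write(str(value))
        return self             # returning self allows chaining


class Celsius(object):
    """Hypothetical class with a computed, read-only attribute."""
    def __init__(self, degrees):
        self._degrees = degrees

    def _get_fahrenheit(self):
        return self._degrees * 9 / 5 + 32

    fahrenheit = property(_get_fahrenheit)


buffer = io.StringIO()
cout = OutStream(buffer)
cout << "x = " << 42 << "\n"

print(repr(buffer.getvalue()), Celsius(100).fahrenheit)   # 'x = 42\n' 212.0
```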

Exception handling

Python supports (and extensively uses) exception handling as a means of testing for error conditions. Indeed, it is even possible to trap the exception caused by a syntax error!

Exceptions permit more concise and reliable error checking than many other ways of reporting erroneous or exceptional events. Exceptions are thread-safe; they tend not to clutter up code in the way that testing for returned error codes does in C; and they can easily propagate up the calling stack when an error must be reported to a higher level of the program.

Python style calls for the use of exceptions whenever an error condition might arise. Indeed, rather than testing for access to a file or resource before actually using it, it is conventional in Python to just go ahead and try to use it, catching the exception if access is rejected.
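A minimal sketch of this style, in Python 3 syntax (an assumption; the article describes Python 2.3). The function name read_config and the file name are hypothetical. The file is opened without any prior existence check, and failure is handled by catching the exception. The last lines show that a syntax error raised while compiling source text can be trapped in the same way.

```python
def read_config(path):
    # Just try to open the file; do not test for access beforehand.
    try:
        f = open(path)
    except IOError:
        return None             # access was rejected or the file is missing
    try:
        return f.read()
    finally:
        f.close()               # runs whether or not read() succeeded

contents = read_config("no-such-file-for-this-example.cfg")

# Even a syntax error is an ordinary exception when code is compiled at run time.
try:
    compile("def broken(:", "<example>", "exec")
    syntax_ok = True
except SyntaxError:
    syntax_ok = False

print(contents, syntax_ok)   # None False
```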

Standard library

Python has a large standard library, which makes it well suited to many tasks. This comes from a so-called "batteries included" philosophy for Python modules. The modules of the standard library can be augmented with custom modules written in either C or Python. The standard library is particularly well tailored to writing Internet-facing applications, with a large number of standard formats and protocols (such as MIME and HTTP) supported. Modules for creating graphical user interfaces, connecting to relational databases, and manipulating regular expressions are also included.

The standard library is one of Python's greatest strengths. The bulk of it is cross-platform compatible, meaning that even heavily leveraged Python programs can often run on Unix, Windows, Macintosh, and other platforms without change.

It is currently being debated whether or not third-party but open source Python modules such as wxPython or NumPy should be included in the standard library, in accordance with the batteries included philosophy.

Other features

Like Lisp, and unlike Perl, the Python interpreter also supports an interactive mode in which expressions can be entered from the terminal and results seen immediately. This is a boon for those learning the language and experienced developers alike: snippets of code can be tested in interactive mode before integrating them into a program proper.

Python also includes a unit testing framework for creating exhaustive test suites. While static-typing aficionados see this as a replacement for a static type-checking system, Python programmers largely do not share this view.
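A small test suite written with the standard unittest module might look like the sketch below. It tests the factorial function from the syntax section; the test class and method names are our own, and the syntax is Python 3 (an assumption; the article describes Python 2.3).

```python
import unittest

def factorial(x):
    if x == 0:
        return 1
    return x * factorial(x - 1)

class FactorialTest(unittest.TestCase):
    def test_base_case(self):
        self.assertEqual(factorial(0), 1)

    def test_small_value(self):
        self.assertEqual(factorial(5), 120)

# Build and run the suite programmatically rather than via unittest.main().
suite = unittest.TestLoader().loadTestsFromTestCase(FactorialTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The returned result object records how many tests ran and whether all of them passed.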

Neologisms

A few neologisms have come into common use within the Python community. One of the most common is "pythonic", which can have a wide range of meanings related to program style. To say that a piece of code is pythonic is to say that it uses Python idioms well; that it is natural or shows fluency in the language. Likewise, to say of an interface or language feature that it is pythonic is to say that it works well with Python idioms; that its use meshes well with the rest of the language.

In contrast, a mark of unpythonic code is that it attempts to "write C++ (or Lisp, or Perl) code in Python"—that is, provides a rough transcription rather than an idiomatic translation of forms from another language.

The prefix Py- can be used to show that something is related to Python, much as a prefixed J- denotes Java. Examples of the use of this prefix in names of Python applications or libraries include PyGame, a binding of SDL to Python; PyUI, a GUI written entirely in Python; and PyAlaMode, an IDE for Python created by Orbtech, a company specializing in Python.

Platforms

Although Python was originally programmed for the Amoeba platform, that version is "dead" (i.e. it has not been updated in a while). The most popular (and therefore best maintained) platforms Python runs on are Windows, Linux, Mac OS X and Java. The Java version is a completely separate implementation which supports compilation. The Mac port is maintained by an external project called MacPython, and was included in Mac OS X 10.3 "Panther". Other supported platforms include:

Unfortunately, most of the third-party libraries for Python (and even some first-party ones) are only available on Windows, Linux, and Mac OS X.

Miscellany