[Python-Dev] DRAFT: python-dev summary for 2006-10-16 to 2006-10-31

Steven Bethard steven.bethard at gmail.com
Wed Nov 22 20:48:48 CET 2006


Here's the summary for the second half of October. Comments and corrections welcome as always, especially on that extended buffer protocol / binary format specifier discussion which was a little overwhelming. ;-)

=============
Announcements
=============


Roundup to replace SourceForge tracker

Roundup_ has been named as the official replacement for the SourceForge_ issue tracker. Thanks go out to the new volunteer admins, Paul DuBois, Michael Twomey, Stefan Seefeld, and Erik Forsberg, and also to `Upfront Systems`_ who will be hosting the tracker. If you'd like to provide input on what the new tracker should do, please join the `tracker-discuss mailing list`_.

.. _SourceForge: http://www.sourceforge.net/
.. _Roundup: http://roundup.sourceforge.net/
.. _Upfront Systems: http://www.upfrontsystems.co.za/
.. _tracker-discuss mailing list: http://mail.python.org/mailman/listinfo/tracker-discuss

Contributing threads:

=========
Summaries
=========


The buffer protocol and communicating binary format information

Travis E. Oliphant presented a pre-PEP for adding a standard way to describe the shape and intended types of binary-formatted data. It was accompanied by a pre-PEP for extending the buffer protocol to handle such shapes and types. Under the proposal, a new datatype object would describe binary-formatted data with an API like::

    datatype((float, (3,2)))
    # describes a 3*2*8=48 byte block of memory that should be interpreted
    # as 6 doubles laid out as arr[0,0], arr[0,1], ... arr[2,0], arr[2,1]

    datatype([( ([1,2],'coords'), 'f4', (3,6)), ('address', 'S30')])
    # describes the structure
    #     float coords[3*6]   /* Has [1,2] associated with this field */
    #     char  address[30]
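
As a rough point of comparison only, the same two layouts can be written down with the existing ``struct`` module, one of the modules Travis hoped to unify. This is a sketch under assumptions: the format strings and the ``<`` prefix (standard sizes, no padding) are choices made here for illustration and are not part of the pre-PEP, and the ``[1,2]`` metadata attached to ``coords`` has no ``struct`` equivalent.

```python
import struct

# 3x2 block of C doubles: 6 items of 8 bytes each.
grid_format = "<6d"
grid_size = struct.calcsize(grid_format)

# float coords[3*6] followed by char address[30], with no padding.
record_format = "<18f30s"
record_size = struct.calcsize(record_format)

print(grid_size, record_size)
```

Note that ``struct`` only records a flat count of 6 doubles; the (3,2) shape itself is not expressed.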

Alexander Belopolsky provided a nice example of why you might want to extend the buffer protocol along these lines. Currently, there's not much you can do with a basic buffer object. If you want to pass it to numpy_, you have to provide the type and shape information yourself::

    >>> from array import array
    >>> import numpy
    >>> b = buffer(array('d', [1,2,3]))
    >>> numpy.ndarray(shape=(3,), dtype=float, buffer=b)
    array([ 1.,  2.,  3.])

By extending the buffer protocol appropriately so that the necessary information can be provided, you should be able to pass the buffer directly to numpy_ and have it understand the format itself::

    >>> numpy.array(b)
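
For readers on Python 3, where ``buffer`` no longer exists, ``memoryview`` gives a feel for what a self-describing buffer looks like. This is offered only as an illustration using today's standard library; it is not the API that was under discussion in the thread.

```python
from array import array

data = array('d', [1, 2, 3])
view = memoryview(data)

# The view carries the element format, item size and shape with it,
# so a consumer does not have to be told them separately.
print(view.format, view.itemsize, view.shape)
print(view.tolist())
```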

People were uncomfortable with the many datatype variants: the constructor accepted types, strings, lists or dicts, each of which could specify the structure in a different way. A number of people also questioned why the existing ctypes mechanisms for describing binary data couldn't be used instead, particularly since ctypes could already describe things like function pointers and recursive types, which the pre-PEP could not. Travis said he was looking for a way to unify the data formats of the array, struct, numpy and ctypes modules, and felt that the ctypes approach was too verbose for use in the other modules. In particular, he felt that the ctypes use of type objects as binary-format specifiers was problematic because type objects were harder to manipulate at the C level.
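
To make the ctypes side of the argument concrete, here is a minimal sketch of how the two layouts from the pre-PEP example might be described with ctypes type objects. The names ``Grid``, ``Record`` and ``IntCallback`` are hypothetical and chosen only for this illustration; nothing here is taken from the thread itself.

```python
import ctypes

# 3x2 array of C doubles, indexed as grid[i][j] with i < 3 and j < 2.
Grid = (ctypes.c_double * 2) * 3


class Record(ctypes.Structure):
    # Mirrors: float coords[3*6]; char address[30];
    _fields_ = [
        ("coords", ctypes.c_float * 18),
        ("address", ctypes.c_char * 30),
    ]


# A function pointer type (int -> int), the kind of thing ctypes can describe.
IntCallback = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int)

print(ctypes.sizeof(Grid), ctypes.sizeof(Record))
```

Each description is itself a Python type object, which is the property Travis considered awkward to work with from C. The structure's size may include trailing padding added for alignment.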

The discussion continued on into the next fortnight.

.. _numpy:

Contributing threads:


The "lazy strings" patch

Discussion continued on Larry Hastings' `lazy strings patch`_, which would have delayed the evaluation of some string operations, like concatenation and slicing, until their results were actually needed. With his patch, repeated string concatenation could be used instead of the standard ``.join()`` idiom, and slices which were never used would never be rendered. Discussions of the patch showed that people were concerned about increased memory use when a small slice of a very large string kept the large string alive in memory. People also felt that a stronger motivation was necessary to justify complicating the string representation so much. Larry was pointed to some `code that his patch would break`_, which was using ``ob_sval`` directly instead of calling ``PyString_AS_STRING()`` like it was supposed to. He was also referred to the `Python 3000 list`_, where the recent discussions of `string views`_ would be relevant and his proposal might have a better chance of acceptance.
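
The general idea of deferring concatenation can be sketched in pure Python. This is a toy illustration only: ``LazyConcat`` is a hypothetical class invented here, whereas the actual patch changed the C-level string object and is not reproduced.

```python
class LazyConcat:
    """Hypothetical illustration: defer concatenation until rendering."""

    def __init__(self, *pieces):
        self._pieces = list(pieces)
        self._rendered = None

    def __add__(self, other):
        # Record the new piece instead of copying characters now.
        return LazyConcat(*self._pieces, other)

    def __str__(self):
        # Render once, on first use, with the usual join idiom.
        if self._rendered is None:
            self._rendered = "".join(str(p) for p in self._pieces)
        return self._rendered


text = LazyConcat("a") + "b" + "c"   # no string data copied yet
eager = "".join(["a", "b", "c"])     # the standard idiom
print(str(text) == eager)
```

The memory concern raised in the thread shows up here too: every piece stays referenced until the object is rendered or discarded.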

.. _lazy strings patch: http://bugs.python.org/1569040
.. _code that his patch would break: http://www.google.com/codesearch?hl=en&lr=&q=ob_sval+-stringobject.%5Bhc%5D&btnG=Search
.. _Python 3000 list: http://mail.python.org/mailman/listinfo/python-3000
.. _string views: http://mail.python.org/pipermail/python-3000/2006-August/003280.html

Contributing threads:


PEP 355 status

Björn Lindqvist wanted to wrap up the loose ends of `PEP 355`_ and asked whether the problem was the specific path object of `PEP 355`_ or path objects in general. A number of people felt that some reorganization of the path-related functions could be helpful, but that trying to put everything into a single object was a mistake. Some important requirements for a reorganization of the path-related functions:

There were a few suggestions of possible new APIs, but no concrete implementations. People seemed hopeful that the issue could be resurrected for Python 3K, but no one appeared to be taking the lead.

.. _PEP 355: http://www.python.org/dev/peps/pep-0355/

Contributing thread:


Buildbots, configure changes and extension modules

Grig Gheorghiu, who's been taking care of the `Python Community Buildbots`_, noticed that the buildbots started failing after a checkin that made changes to ``configure``. Martin v. Löwis explained that even though a plain ``make`` will trigger a re-run of ``configure`` if it has changed, there is an issue with distutils not rebuilding when header files change, and so extension modules are sometimes not rebuilt. Contributions to fix that deficiency in distutils are welcome.

Martin also pointed out a handy way of forcing a buildbot to start with a clean build: ask the buildbot to build a non-existing branch. This causes the checkouts to be deleted and the build to fail. The next regular build will then start from scratch.

.. _Python Community Buildbots: http://www.pybots.org/

Contributing thread:


Sqlite versions

Skip Montanaro ran into some problems running ``test_sqlite`` on OS X, where he was getting a bunch of ``ProgrammingError: library routine called out of sequence`` errors. These errors appeared reliably when ``test_sqlite`` was run immediately after ctypes' ``test_find``. When he started linking to sqlite 3.1.3 instead of sqlite 3.3.8, the problems went away. Barry Warsaw mentioned that he had run into similar troubles when he tried to upgrade from 3.2.1 to 3.2.8.
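
When chasing a version-dependent problem like this one, it can help to confirm which SQLite library the interpreter is actually linked against. A minimal check, assuming the standard ``sqlite3`` module is available in the build:

```python
import sqlite3

# Version of the SQLite C library that the sqlite3 module is linked against
# (not the version of the Python wrapper module).
linked_version = sqlite3.sqlite_version
version_tuple = sqlite3.sqlite_version_info
print(linked_version, version_tuple)
```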

Contributing thread:


Threads, generators, exceptions and segfaults

Mike Klaas managed to `provoke a segfault`_ in Python 2.5 using threads, generators and exceptions. Tim Peters was able to whittle Mike's problem down to a relatively simple test case, where a generator was created within a thread, and then the thread vanished before the generator had exited. The segfault was a result of Python's attempt to clean up the abandoned generator, during which it tried to access the generator's already free()'d thread state. No clear solution to this problem had been decided on at the time of this summary.
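
The general shape of the scenario can be sketched as follows. This is a rough reconstruction written for this summary, not Tim's actual test case, and all names in it are hypothetical. On current interpreters it is expected to run cleanly.

```python
import threading

finalized = []
holder = []


def numbers():
    try:
        yield 1
        yield 2
    finally:
        # Runs when the suspended generator is closed or collected.
        finalized.append(True)


def worker():
    gen = numbers()
    next(gen)           # leave the generator suspended at its first yield
    holder.append(gen)  # keep it alive past the end of the thread


thread = threading.Thread(target=worker)
thread.start()
thread.join()           # the thread that created the generator is gone now

holder.pop().close()    # clean up the abandoned generator from another thread
print(finalized)
```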

.. _provoke a segfault: http://bugs.python.org/1579370

Contributing thread:


ctypes and win64

Previously, Thomas Heller had asked that ctypes be removed from the Python 2.5 win64 MSI installers since it did not work for that platform at the time. Since then, Thomas integrated some patches in the trunk so that _ctypes could be built for win64/AMD64. Backporting these fixes to Python 2.5 would have meant that, while the MSI installer would still not include it, _ctypes could be built from a source distribution on win64/AMD64. It was unclear whether this would constitute a bugfix (in which case the backport would be okay) or a feature (in which case it wouldn't).

Contributing thread:


Python 2.3.X and 2.4.X retired

Anthony Baxter pushed out a Python 2.4.4 release and was pushing out the Python 2.3.6 source release as well. He indicated that once 2.3.6 was out, both of these branches could be officially retired.

Contributing thread:


Producing bytecode from Python 2.5 ASTs

Michael Spencer offered up his compiler2_ module, a rewrite of the compiler module which allows bytecode to be produced from _ast.AST objects. Currently, it produces almost identical output to __builtin__.compile for all the stdlib modules and their tests. He asked for feedback on what would be necessary to get it stdlib ready, but had no responses.
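
For a sense of what going from an AST to bytecode looks like, here is a small sketch. It does not use ``compiler2``; it relies on the built-in ``compile()``, which in later Python versions accepts AST objects directly, and is written for Python 3.

```python
import ast

source = "answer = 6 * 7"
tree = ast.parse(source)                   # source text -> AST
code = compile(tree, "<example>", "exec")  # AST -> code object (bytecode)

namespace = {}
exec(code, namespace)
print(namespace["answer"])
```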

.. _compiler2: http://svn.brownspencer.com/pycompiler/branches/new_ast/

Contributing thread:

==================
Previous Summaries
==================

===============
Skipped Threads
===============


