[Python-Dev] DRAFT: python-dev summary for 2006-11-01 to 2006-11-15

Steven Bethard steven.bethard at gmail.com
Thu Nov 23 07:48:44 CET 2006


Here's the summary for the first half of November. Try not to spend it all in one place! ;-)

As always, corrections and comments are greatly appreciated.

=============
Announcements
=============


Python 2.5 malloc families

Just a reminder that if you find your extension module is crashing with Python 2.5 in malloc/free, there is a high chance that you have a mismatch in malloc "families". Unlike previous versions, Python 2.5 no longer allows sloppiness here -- if you allocate with the PyMem_* functions, you must free with the PyMem_* functions, and similarly, if you allocate with the PyObject_* functions, you must free with the PyObject_* functions.

Contributing thread:

=========
Summaries
=========


Mike Orr started work on a replacement for `PEP 355`_ that would better group the path-related functions currently in os, os.path, shutil and other modules. He proposed to start with a `directory-tuple Path class`_ that would have allowed code like::

    # equivalent to
    # os.path.join(os.path.dirname(os.path.dirname(__file__)), "lib")
    os.path.Path(__file__)[:-2] + "lib"

where a Path object would act like a tuple of directories, and could be easily sliced and reordered as such.
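To make the directory-tuple idea concrete, here is a minimal sketch. The ``TuplePath`` class is hypothetical and is not the class from Mike Orr's proposal; it assumes relative paths only and splits on ``os.sep``, ignoring drives and leading separators.

```python
import os


class TuplePath(tuple):
    """Hypothetical directory-tuple path: one tuple item per path component."""

    def __new__(cls, path):
        # Accept either a string path or an iterable of components.
        if isinstance(path, str):
            path = [part for part in path.split(os.sep) if part]
        return super().__new__(cls, path)

    def __getitem__(self, index):
        result = super().__getitem__(index)
        # Slices stay TuplePath objects; single indexes return plain strings.
        return TuplePath(result) if isinstance(index, slice) else result

    def __add__(self, other):
        # Adding a string appends a single component.
        if isinstance(other, str):
            other = (other,)
        return TuplePath(tuple(self) + tuple(other))

    def __str__(self):
        return os.sep.join(self)


module = TuplePath(os.path.join("project", "src", "pkg", "module.py"))
lib_dir = module[:-2] + "lib"   # drop the last two components, then add "lib"
```

Slicing off the last two components and appending ``"lib"`` gives the same result as the nested ``os.path.dirname`` calls in the comment above, at least for simple relative paths.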

As an alternative, glyph proposed using `Twisted's filepath module`_, which was already being used in a large body of code. He showed some common pitfalls, such as the fact that on Windows the names "CON" and "NUL" exist in every directory and can make paths invalid, and indicated how FilePath solved these problems.

Fredrik Lundh suggested a reorganization where functions that manipulate path names would reside in os.path, and functions that manipulate objects identified by a path would reside in os. The os.path module would gain a path wrapper object, which would allow "path algebra" manipulations, e.g. path1 + path2. The os module would gain some of the os.path and shutil functions that were manipulating real filesystem objects and not just the path names. Most people seemed to like this approach, because it correctly targeted the "algebraic" features at the areas where chained operations were most common: path name operations, not filesystem operations.

Some of the conversation moved on to the `Python 3000 list`_.

.. _PEP 355: http://www.python.org/dev/peps/pep-0355/
.. _directory-tuple Path class: http://wiki.python.org/moin/AlternativePathClass
.. _Twisted's filepath module: http://twistedmatrix.com/trac/browser/trunk/twisted/python/filepath.py
.. _Python 3000 list: http://mail.python.org/mailman/listinfo/python-3000

Contributing threads:


Replacing urlparse

A few more bugs in urlparse were turned up, and `earlier discussions about replacing urlparse`_ were briefly revisited. Paul Jimenez asked about his `uriparse module`_ and was told that, because of the constant problems with urlparse, people were wary of including another "incorrect" library, so the requirements for a replacement were fairly stringent. Martin v. Löwis gave him some guidance on a few specific points, and Nick Coghlan promised to try to post his `urischemes module`_ (a derivative of Paul's uriparse module) to the `Python Package Index`_.

.. _earlier discussions about replacing urlparse: http://www.python.org/dev/summary/2006-06-01_2006-06-15/#rfc-3986-uniform-resource-identifiers-uris
.. _uriparse module: http://bugs.python.org/1462525
.. _urischemes module: http://bugs.python.org/1500504
.. _Python Package Index: http://www.python.org/pypi

Contributing threads:


Importing .py, .pyc and .pyo files

Martin v. Löwis brought up `Osvaldo Santana's patch`_ which would have made Python search for both .pyc and .pyo files regardless of whether or not the optimize flag, "-OO", was set (like zipimporter does). Without this patch, when "-OO" was given, Python never looked for .pyc files. Some people thought that an extra stat() call or directory listing to check for the other file would be too expensive, but no one profiled the various versions of the code so the cost was unclear. People were leaning towards removing the extra functionality from zipimporter so that at least it was consistent with the rest of Python.

Giovanni Bajo suggested that .pyo file support should be dropped completely, with .pyc files being compiled at various levels of optimization depending on the command line flags. To make sure all your .pyc files were compiled at the same level of optimization, you'd use a new "-I" flag to indicate that all files should be recompiled, e.g. python -I -OO app.py.

Armin Rigo suggested only loading files with a .py extension. Python would still generate .pyc files as a means of caching bytecode for speed reasons, but it would never import them without a corresponding .py file around. For people wanting to ship just bytecode, the cached .pyc files could be renamed to .py files and then those could be shipped and imported.

There was some support for Armin's solution, but it was not overwhelming.

.. _Osvaldo Santana's patch: http://bugs.python.org/1346572

Contributing thread:


The buffer protocol and communicating binary format information

The discussion of extending the buffer protocol to more binary formats continued this fortnight. Though the PIL_ had been used as an example of a library that could benefit from an extended buffer protocol, Fredrik Lundh indicated that future versions of the PIL_ would make the binary data model completely opaque, and instead provide a view-style API like::

    view = object.acquire_view(region, supported formats)
    ... access data in view ...
    view.release()

Along these lines, the discussion turned away from the particular C formats used in ctypes, numpy, array, etc. and more towards the best way to communicate format information between these modules. Though it seemed like people were not completely happy with the proposed API of the new buffer protocol, the discussion seemed to skirt around any concrete suggestions for better APIs.
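As a rough illustration of the acquire/access/release pattern, the sketch below uses the ``memoryview`` type found in later Python versions. This is an analogy only: it is neither the PIL API that Fredrik described nor any of the APIs proposed in the thread, and the variable names are made up for the example.

```python
# A mutable byte buffer standing in for some object's binary data.
data = bytearray(8)

# "Acquire" a view on the buffer; no bytes are copied.
view = memoryview(data)

# The view carries format information alongside the data itself.
item_format = view.format      # struct-style format code for one item
item_size = view.itemsize      # size of one item in bytes

# Access a region through the view; writes land in the underlying buffer.
region = view[2:6]
region[:] = b"\xff" * 4

# "Release" the views once the access is finished.
region.release()
view.release()
```

The point of the pattern is that the consumer learns the element format from the view rather than from knowledge of the producer's internals.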

In the end, the only thing that seemed certain was that a new buffer protocol could only be successful if it were implemented on all of the appropriate stdlib modules: ctypes, array, struct, etc.

.. _PIL: http://www.pythonware.com/products/pil/

Contributing threads:


dir, part 2

Tomer Filiba continued his `previous investigations`_ into adding a __dir__() method to allow customization of the dir() builtin. He moved most of the current dir() logic into object.__dir__(), with some additional logic necessary for modules and types being moved to ModuleType.__dir__() and type.__dir__() respectively. He posted a `patch for his implementation`_ and it got approval for Python 2.6.

There was a brief discussion about whether or not it was okay for an object to lie about its members, with Fredrik Lundh suggesting that you should only be allowed to add to the result that dir() produces. Nick Coghlan pointed out that when a class overrides __getattribute__(), attributes that the default dir() implementation sees can be blocked, in which case removing members from the result of dir() might be quite appropriate.
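Both sides of that discussion can be sketched with the __dir__() hook as it exists in current Python 3. The ``Proxy`` and ``Restricted`` classes below are hypothetical examples, not code from the thread or from the patch.

```python
class Proxy:
    """Hypothetical wrapper that forwards unknown attributes to a target."""

    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # Only called when normal lookup fails, so _target is found normally.
        return getattr(self._target, name)

    def __dir__(self):
        # Add the forwarded attributes to the default listing.
        names = set(object.__dir__(self))
        names.update(dir(self._target))
        return sorted(names)


class Restricted:
    """Hypothetical class whose __getattribute__ blocks some attributes."""

    hidden = ("secret",)
    secret = 1
    public = 2

    def __getattribute__(self, name):
        if name in type(self).hidden:
            raise AttributeError(name)
        return object.__getattribute__(self, name)

    def __dir__(self):
        # Remove names that attribute access would refuse anyway.
        return [n for n in object.__dir__(self) if n not in type(self).hidden]
```

``Proxy`` only adds names, which is the behaviour Fredrik argued for; ``Restricted`` removes names that can no longer be reached, which is the case Nick described.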

.. _previous investigations: http://www.python.org/dev/summary/2006-07-01_2006-07-15/#adding-a-dir-magic-method
.. _patch for his implementation: http://bugs.python.org/1591665

Contributing thread:


Invalid read errors and valgrind

Using valgrind, Herman Geza found that he was getting some "Invalid read" errors in PyObject_Free which weren't identified as acceptable in Misc/README.valgrind. Tim Peters and Martin v. Löwis explained that these are okay if they are reads from Py_ADDRESS_IN_RANGE. If the address given is in Python's own memory, a valid arena index is read. Otherwise, garbage is read (though this read will never fail since Python always reads from the page where the about-to-be-freed block is located). The arenas are then checked to see whether the result was garbage or not.

Neal Norwitz promised to try to update Misc/README.valgrind with this information.

Contributing thread:


SCons and cross-compilation

Martin v. Löwis reviewed a `patch for cross-compilation`_ which proposed to use SCons_ instead of distutils because updating distutils to work for cross-compilation would have involved some fairly major changes. Distutils had certain notions of where to look for header files and how to invoke the compiler which were incorrect for cross-compilation, and which were difficult to change. While accepting the patch would not have required SCons_ to be added to Python proper (which a number of people opposed), people didn't like the idea of having to update SCons configuration in addition to already having to update setup.py, Modules/Setup and the PCbuild area. The patch was therefore rejected.

.. _patch for cross-compilation: http://bugs.python.org/841454
.. _SCons: http://www.scons.org/

Contributing thread:


Individual interpreter locks

Robert asked about having a separate lock for each interpreter instance instead of the global interpreter lock (GIL). Brett Cannon and Martin v. Löwis explained that a variety of objects are shared between interpreters, so a single lock for each interpreter would not be sufficient for handling access to such shared objects.

Contributing thread:


Passing floats to file.seek

Python's implementation of file.seek was converting floats to ints. `Robert Church suggested a patch`_ that would convert floats to long longs and thus support files larger than 2GiB. Martin v. Löwis proposed instead to use the __index__() API to support the large files and to raise an exception for float arguments. Martin's approach was approved, with a warning instead of an exception for Python 2.6.
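The effect of going through __index__() can be sketched in pure Python with ``operator.index()``. The ``BlockOffset`` class and the ``checked_seek`` helper are hypothetical names used only for illustration; the real change was made in the C implementation of file.seek.

```python
import io
import operator


class BlockOffset:
    """Hypothetical integer-like offset measured in fixed-size blocks."""

    def __init__(self, blocks, block_size=512):
        self.blocks = blocks
        self.block_size = block_size

    def __index__(self):
        # Objects that define __index__ can be used wherever an exact
        # integer is required.
        return self.blocks * self.block_size


def checked_seek(stream, offset, whence=0):
    # operator.index() accepts ints and __index__ objects, and raises
    # TypeError for floats instead of silently truncating them.
    return stream.seek(operator.index(offset), whence)


stream = io.BytesIO(bytes(2048))
position = checked_seek(stream, BlockOffset(2))

try:
    checked_seek(stream, 1.5)
    float_rejected = False
except TypeError:
    float_rejected = True
```

Because Python integers are unbounded, offsets beyond 2GiB need no special handling at this level; only the float case has to be rejected.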

.. _Robert Church suggested a patch: http://bugs.python.org/1067760

Contributing thread:


The datetime module and timezone objects

Fredrik Lundh asked about including a tzinfo object implementation for the datetime module, along the lines of the UTC, FixedOffset and LocalTimezone classes from the `library reference`_. A number of people reported having copied those classes into their own code repeatedly, and so Fredrik got the go-ahead to put them into Python 2.6.
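For readers who have not seen them, the classes in question look roughly like the sketch below. It is written from memory in the style of the library reference examples and is not copied from them; the constructor arguments of ``FixedOffset`` are an assumption, and LocalTimezone is left out.

```python
from datetime import datetime, timedelta, tzinfo

ZERO = timedelta(0)


class UTC(tzinfo):
    """UTC: zero offset, no daylight saving time."""

    def utcoffset(self, dt):
        return ZERO

    def tzname(self, dt):
        return "UTC"

    def dst(self, dt):
        return ZERO


class FixedOffset(tzinfo):
    """Fixed offset east of UTC, given in minutes, with a display name."""

    def __init__(self, offset_minutes, name):
        self._offset = timedelta(minutes=offset_minutes)
        self._name = name

    def utcoffset(self, dt):
        return self._offset

    def tzname(self, dt):
        return self._name

    def dst(self, dt):
        return ZERO


noon_utc = datetime(2006, 11, 15, 12, 0, tzinfo=UTC())
noon_cet = noon_utc.astimezone(FixedOffset(60, "CET"))
```

The boilerplate is short but easy to get subtly wrong, which is why so many people reported copying it from the documentation.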

.. _library reference: http://docs.python.org/lib/datetime-tzinfo.html

Contributing thread:

================
Deferred Threads
================

==================
Previous Summaries
==================

===============
Skipped Threads
===============


