[Python-Dev] PEP 402: Simplified Package Layout and Partitioning
Éric Araujo merwok at netwok.org
Thu Aug 11 16:39:34 CEST 2011
Hi,
I’ve read PEP 402 and would like to offer comments.
I know a bit about the import system, but not down to the nitty-gritty details of PEP 302 and path computations and all this fun stuff (by which I mean, not fun at all). As such, I can’t find nasty issues in dark corners, but I can offer feedback as a user. I think it’s a very well-written explanation of a very useful feature: +1 from me. If it is accepted, the docs will certainly be much more concise, but the PEP as a thought process is a useful document to read.
> When new users come to Python from other languages, they are often confused by Python's packaging semantics.
Minor: I would reserve “packaging” for packaging/distribution/installation/deployment matters, not Python modules. I suggest “Python package semantics”.
> On the negative side, however, it is non-intuitive for beginners, and requires a more complex step to turn a module into a package. If Foo begins its life as Foo.py, then it must be moved and renamed to Foo/__init__.py.
Minor: In the UNIX world, or with version control tools, moving and renaming are one and the same thing (hg mv spam.py spam/__init__.py, for example). Also, if you turn a module into a package, you may want to move code around, change imports, etc., so I’m not sure the renaming part is such a big step. Anyway, if the import-sig people say that users think it’s a complex or costly operation, I can believe it.
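To make the "one and the same thing" point concrete, here is a minimal sketch of the conversion done from Python rather than from hg. The helper name `module_to_package` and the file `spam.py` are made up for illustration; nothing here comes from the PEP.

```python
import tempfile
from pathlib import Path


def module_to_package(module_file: Path) -> Path:
    """Turn spam.py into spam/__init__.py and return the new file's path."""
    package_dir = module_file.with_suffix("")  # spam.py -> spam
    package_dir.mkdir()
    target = package_dir / "__init__.py"
    module_file.rename(target)  # a single rename both moves and renames
    return target


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "spam.py"
    source.write_text("ANSWER = 42\n")
    init_file = module_to_package(source)
    moved_ok = init_file.is_file() and not source.exists()
    init_name = init_file.name
```

The rename itself is one call; the costly part of such a conversion is more likely the follow-up work of moving code and fixing imports.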
> (By the way, both of these additions to the import protocol (i.e. the dynamically-added __path__, and dynamically-created modules) apply recursively to child packages, using the parent package's __path__ in place of sys.path as a basis for generating a child __path__. This means that self-contained and virtual packages can contain each other without limitation, with the caveat that if you put a virtual package inside a self-contained one, it's gonna have a really short __path__!)
I don’t understand the caveat or its implications.
> In other words, we don't allow pure virtual packages to be imported directly, only modules and self-contained packages. (This is an acceptable limitation, because there is no functional value to importing such a package by itself. After all, the module object will have no contents until you import at least one of its subpackages or submodules!)
>
> Once zc.buildout has been successfully imported, though, there will be a zc module in sys.modules, and trying to import it will of course succeed. We are only preventing an initial import from succeeding, in order to prevent false-positive import successes when clashing subdirectories are present on sys.path.
I find that limitation acceptable. After all, there is no zc project, and no zc module, just a zc namespace. My only regret is that it’s not possible to provide a module docstring explaining that this is a namespace package used for X and Y.
> The resulting list (whether empty or not) is then stored in a sys.virtual_package_paths dictionary, keyed by module name.
This was probably said on import-sig, but here I go: yet another import artifact in the sys module! I hope we get ImportEngine in 3.3 to clean up all this.
> * A new extend_virtual_paths(path_entry) function, to extend existing, already-imported virtual packages' __path__ attributes to include any portions found in a new sys.path entry. This function should be called by applications extending sys.path at runtime, e.g. when adding a plugin directory or an egg to the path.
Let’s imagine my application Spam has a namespace spam.ext for plugins. To use a custom directory where plugins are stored, or a zip file with plugins (I don’t use eggs, so let me talk about zip files here), I’d have to call both sys.path.append and pkgutil.extend_virtual_paths?
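Spelled out, the two-step dance I am asking about would look roughly like this. It is only a sketch: `add_plugin_location` is a name I made up, and `pkgutil.extend_virtual_paths` exists only as a proposal in the PEP, so the code looks it up defensively and skips the call when it is absent.

```python
import pkgutil
import sys


def add_plugin_location(path_entry):
    """Append a directory or zip file to sys.path, then refresh virtual packages.

    Returns True if the (proposed) refresh hook was available and called.
    """
    if path_entry not in sys.path:
        sys.path.append(path_entry)
    # Assumption: extend_virtual_paths is only proposed by PEP 402, so it
    # may well not exist in the running interpreter.
    extend = getattr(pkgutil, "extend_virtual_paths", None)
    if extend is None:
        return False
    extend(path_entry)
    return True
```

An application would call `add_plugin_location("/path/to/plugins.zip")` once per plugin location instead of touching sys.path directly.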
> * ImpImporter.iter_modules() should be changed to also detect and yield the names of modules found in virtual packages.
Is there any value in providing an argument to get the pre-PEP behavior? Or, to look at it from a different angle, how can Python code know that some module is a virtual or pure virtual package, if that is even a useful thing to know?
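One conceivable answer to the detection question is to look for a package that has a __path__ but no file behind it. The sketch below is not PEP 402 behavior: it uses the namespace packages of Python 3.3 and later (PEP 420) as a stand-in, and `is_namespace_like` and `spam_ns_demo` are hypothetical names.

```python
import importlib
import json
import sys
import tempfile
from pathlib import Path


def is_namespace_like(module):
    """Return True for a package that has a __path__ but no source file."""
    has_path = hasattr(module, "__path__")
    has_file = getattr(module, "__file__", None) is not None
    return has_path and not has_file


# A directory without __init__.py imports as a namespace package (3.3+).
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "spam_ns_demo").mkdir()
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        namespace_pkg = importlib.import_module("spam_ns_demo")
        namespace_result = is_namespace_like(namespace_pkg)
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("spam_ns_demo", None)

# An ordinary package such as json has a real __file__.
regular_result = is_namespace_like(json)
```

Whether such a check is worth exposing as an official API is a separate question.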
> Last, but not least, the imp module (or importlib, if appropriate) should expose the algorithm described in the virtual paths section above, as a get_virtual_path(modulename, parent_path=None) function, so that creators of __import__ replacements can use it.
If I’m not mistaken, the rule of thumb these days is that imp is edited when it’s absolutely necessary, otherwise code goes into importlib (more easily written, read and maintained).
I wonder if importlib.import_module could implement the new import semantics all by itself, so that we can benefit from this PEP in older Pythons (importlib is on PyPI).
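To give an idea of how small such a helper could be in pure Python, here is a rough approximation for plain directories. It is a sketch under stated assumptions and not the PEP's algorithm: `sketch_virtual_path` is a hypothetical name, only filesystem directories are handled, and PEP 302 importers are ignored entirely.

```python
import os
import sys
import tempfile


def sketch_virtual_path(modulename, parent_path=None):
    """Collect the subdirectories named after the module from each path entry."""
    search_path = sys.path if parent_path is None else parent_path
    leaf = modulename.rpartition(".")[2]  # "zc.buildout" -> "buildout"
    found = []
    for entry in search_path:
        candidate = os.path.join(entry or os.curdir, leaf)
        if os.path.isdir(candidate):
            found.append(candidate)
    return found


# Two of three path entries contain a "zc" directory.
with tempfile.TemporaryDirectory() as first, \
        tempfile.TemporaryDirectory() as second, \
        tempfile.TemporaryDirectory() as third:
    os.mkdir(os.path.join(first, "zc"))
    os.mkdir(os.path.join(second, "zc"))
    portions = sketch_virtual_path("zc", [first, second, third])
    expected = [os.path.join(first, "zc"), os.path.join(second, "zc")]
```

For a child package, the parent's list would be passed as parent_path, for example `sketch_virtual_path("zc.buildout", portions)`.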
> * If you are changing a currently self-contained package into a virtual one, it's important to note that you can no longer use its __file__ attribute to locate data files stored in a package directory. Instead, you must search __path__ or use the __file__ of a submodule adjacent to the desired files, or of a self-contained subpackage that contains the desired files.
Wouldn’t pkgutil.get_data help here?
Besides, putting data files in a Python package is viewed very poorly by some (mostly people following the Filesystem Hierarchy Standard), and in distutils2/packaging, we (will) have a resources system that’s as convenient for users and more flexible for OS packagers. Using __file__ for more than information on the module is frowned upon for other reasons anyway (I talked with a Debian developer about this one day but forgot the details), so I think the limitation is okay.
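For reference, this is how pkgutil.get_data reads a data file today from an ordinary, self-contained package, without the caller touching __file__. The package name `spam_data_demo` and the file `greeting.txt` are invented for the demonstration; whether the same call could work for a virtual package is exactly the open question.

```python
import importlib
import pkgutil
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    package_dir = Path(tmp) / "spam_data_demo"
    package_dir.mkdir()
    (package_dir / "__init__.py").write_bytes(b"")
    (package_dir / "greeting.txt").write_bytes(b"hello from spam\n")

    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        # get_data takes a package name and a resource path relative to it,
        # and returns the file contents as bytes.
        payload = pkgutil.get_data("spam_data_demo", "greeting.txt")
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("spam_data_demo", None)
```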
> * XXX what is the __file__ of a "pure virtual" package? None? Some arbitrary string? The path of the first directory with a trailing separator? No matter what we put, some code is going to break, but the last choice might allow some code to accidentally work. Is that good or bad?
Since a pure virtual package has no source file, I think it should have no __file__ at all. I don’t know if that would break more code than using an empty string, for example, but it feels righter.
> For those implementing PEP \302 importer objects:
Minor: Here I think a link would not be a nuisance (IOW remove the backslash).
Regards