[Python-Dev] Distribution tools: What I would like to see
Talin talin at acm.org
Sun Nov 26 21:24:03 CET 2006
I've been looking once again over the docs for distutils and setuptools, and thinking to myself "this seems a lot more complicated than it ought to be".
Before I get into detail, however, I want to explain carefully the scope of my critique - in particular, why I am talking about setuptools on the python-dev list. You see, in my mind, the process of assembling, distributing, and downloading a package is, or at least ought to be, a unified process. It ought to be a fundamental part of the system, and not split into separate tools with separate docs that have to be mentally assembled in order to understand it.
Moreover, setuptools is the de facto standard these days - a novice programmer who googles for 'python install tools' will encounter setuptools long before they learn about distutils; and if you read the various mailing lists and blogs, you'll sense a subtle aura of deprecation and decay that surrounds distutils.
I would claim, then, that regardless of whether setuptools is officially blessed or not, it is an intrinsic part of the "Python experience".
(I'd also like to put forward the disclaimer that there are probably factual errors in this post, or errors of misunderstanding; all I can claim as an excuse is that it's not for lack of trying, and corrections are welcome as always.)
Think about the idea of module distribution from a pedagogical standpoint - when does a newbie Python programmer start learning about module distribution and what do they learn first? A novice Python user will begin by writing scripts for themselves, and not thinking about distribution at all. However, once they reach the point where they begin to think about packaging up their module, the Python documentation ought to be able to lead them, step by step, towards a goal of making a distributable package:
-- It should teach them how to organize their code into packages and modules.
-- It should show them how to write the proper setup scripts.
-- If there is C code involved, it should explain how that fits into the picture.
-- It should explain how to write unit tests and where they should go.
So how does the current system fail in this regard? The docs for each component - distutils, setuptools, unit test frameworks, and so on - only talk about that specific component, not how it all fits together.
For example, the docs for distutils start by telling you how to build a setup script. They never explain why you need a setup script, or why Python programs need to be "installed" in the first place. [1]
The distutils docs never describe how your directory structure ought to look. In fact, they never tell you how to write a distributable package; rather, they seem more oriented towards taking an already-working package and modifying it to be distributable.
The setuptools docs are even worse in this regard. If you look carefully at the docs for setuptools, you'll notice that each subsection is effectively a 'diff', describing how setuptools is different from distutils. One section talks about the "new and changed keywords", without explaining what the old keywords were or how to find them.
Thus, for the novice programmer, learning how to write a setup script ends up being a process of flipping back and forth between the distutils and setuptools docs, trying to hold in their minds enough of each to be able to achieve some sort of understanding.
What we have now does a good job of explaining how the individual tools work, but it doesn't do a good job of answering the question "Starting from an empty directory, how do I create a distributable Python package?" A novice programmer wants to know what to create first, what to create next, and so on.
This is especially true if the novice programmer is creating an extension module. Suppose I have a C library that I need to wrap. In order to even compile and test it, I'm going to need a setup script. That means I need to understand distutils before I even think about distribution, before I even begin writing the code!
(Sure, I could write a Makefile, but I'd only end up throwing it away later -- so why not cut to the chase and start with a setup script? Ans: Because it's too hard!)
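To make the "starting from an empty directory" question concrete, here is a minimal sketch of the kind of skeleton a step-by-step guide might have the novice create first. Everything named here is an assumption chosen for illustration: the project name "example-project", the package name "examplepkg", the file layout, and the helper create_skeleton() are all hypothetical, and the layout is one common convention rather than anything the distutils or setuptools docs prescribe. The generated setup.py only uses setup() and find_packages() from setuptools.

```python
import tempfile
from pathlib import Path

# Text of a minimal setup script. The project name and version are
# hypothetical placeholders, not taken from any real project.
SETUP_PY = '''\
from setuptools import setup, find_packages

setup(
    name="example-project",
    version="0.1",
    packages=find_packages(),
)
'''

# One possible layout (an assumption, not an official standard):
# a setup script, a README, one package, and a directory of unit tests.
SKELETON = {
    "setup.py": SETUP_PY,
    "README.txt": "Describe the project here.\n",
    "examplepkg/__init__.py": "",
    "examplepkg/core.py": "def greet(name):\n    return 'Hello, ' + name\n",
    "tests/test_core.py": (
        "import unittest\n"
        "from examplepkg.core import greet\n"
        "\n"
        "class GreetTest(unittest.TestCase):\n"
        "    def test_greet(self):\n"
        "        self.assertEqual(greet('world'), 'Hello, world')\n"
    ),
}


def create_skeleton(root):
    """Write the skeleton files under root and return their relative paths."""
    root = Path(root)
    created = []
    for relative, text in SKELETON.items():
        target = root / relative
        # Create intermediate directories such as examplepkg/ and tests/.
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(text)
        created.append(relative)
    return sorted(created)


if __name__ == "__main__":
    # Start from an empty (temporary) directory and list what gets created.
    with tempfile.TemporaryDirectory() as empty_dir:
        for path in create_skeleton(empty_dir):
            print(path)
```

The order of the entries loosely mirrors the "what to create first, what to create next" question: setup script, package, then tests. A C extension would add source files and an ext_modules entry to the setup script, which is not shown here.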
But it isn't just the docs that are at fault here - otherwise, I'd be posting this on a different mailing list. It seems like the whole architecture is 'diff'-based, a series of patches on top of patches, which are in need of some serious refactoring.
Except that nobody can do this refactoring, because there's no formal list of requirements. I look at distutils, and while some parts are obvious, there are other parts where I go "what problem were they trying to solve here?" In my experience, you don't go mucking with someone's code and trying to fix it unless you understand what problem they were trying to solve - otherwise you'll botch it and make a mess. Since few people ever bother to write down what problem they were trying to solve (although they tend to be better at describing their clever solution), usually this ends up being done through a process of reverse engineering the requirements from the code, unless you are lucky enough to have someone around who knows the history of the thing.
Admittedly, I'm somewhat in ignorance here. My perspective is that of an 'end-user developer', someone who uses these tools but does not write them. I don't know the internals of these tools, nor do I particularly want to - I've got bigger fish to fry.
I'm posting this here because what I'd like folks to think about is the whole process of Python development, not just the documentation. What is the smoothest path from empty directory to a finished package on PyPI? What can be changed about the current standard libraries that will ease this process?
[1] The answer, AFAICT, is that 'setup' is really a Makefile - in other words, it's a platform-independent way of describing how to construct a compiled module from sources, and making it available to all programs on that system. This gets confusing, though, when we start talking about "pure python" modules that have no C component - because we have all this language that talks about compiling and installing and such, when all that is really going on underneath is a plain old file copy.
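As a rough illustration of the "plain old file copy" point in the footnote, the sketch below "installs" a pure-Python package by copying its directory into a folder on the import path and then importing it from there. This is a deliberately simplified picture and not how distutils or setuptools are implemented; real installers also do things such as byte-compiling and recording metadata. The names install_by_copy(), demo(), and copydemo_pkg are hypothetical.

```python
import importlib
import shutil
import sys
import tempfile
from pathlib import Path


def install_by_copy(package_dir, site_dir):
    """Copy a pure-Python package directory into site_dir; return the new path."""
    package_dir = Path(package_dir)
    destination = Path(site_dir) / package_dir.name
    shutil.copytree(package_dir, destination)
    return destination


def demo():
    """Build a tiny package, 'install' it by copying, and import the copy."""
    with tempfile.TemporaryDirectory() as work:
        # A throwaway source tree containing one pure-Python package.
        source = Path(work) / "src" / "copydemo_pkg"
        source.mkdir(parents=True)
        (source / "__init__.py").write_text("ANSWER = 42\n")

        # A stand-in for site-packages: just a directory on sys.path.
        site_dir = Path(work) / "site"
        site_dir.mkdir()
        install_by_copy(source, site_dir)

        sys.path.insert(0, str(site_dir))
        try:
            importlib.invalidate_caches()
            module = importlib.import_module("copydemo_pkg")
            return module.ANSWER
        finally:
            # Undo the path change and forget the demo module.
            sys.path.remove(str(site_dir))
            sys.modules.pop("copydemo_pkg", None)


if __name__ == "__main__":
    print(demo())
```

Note that the source tree itself is never put on sys.path, so the import only succeeds because of the copy.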
-- Talin