[Python-Dev] Argument Clinic: what to do with builtins with non-standard signatures?
Larry Hastings larry at hastings.org
Fri Jan 24 16:07:47 CET 2014
BACKGROUND (skippable if you're a know-it-all)
Argument parsing for Python functions follows some very strict rules. Unless the function implements its own parsing like so:
def black_box(*args, **kwargs):
there are some semantics that are always true. For example:
* Any parameter that has a default value is optional, and vice-versa.
* It doesn't matter whether you pass in a parameter by name or by
position, it behaves the same.
* You can see the default values by examining its inspect.Signature.
* Calling a function and passing in the default value for a parameter
is identical to calling the function without that parameter.
e.g. (assuming foo is a pure function):
def foo(a=value): ...
foo() == foo(value) == foo(a=value)
With that signature, foo() literally can't tell the difference
between those three calls. And it doesn't matter what the type
of value is or where you got it.
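To make these rules concrete, here is a minimal sketch. The names DEFAULT and foo are illustrative only, and foo is assumed to be pure.

```python
import inspect

DEFAULT = ("any", "value")  # illustrative default; its type does not matter


def foo(a=DEFAULT):
    # A pure function: the result depends only on the argument.
    return ("got", a)


# The published default is visible through introspection.
signature = inspect.signature(foo)
published_default = signature.parameters["a"].default

# Omitting the argument, passing it by position, and passing it by
# keyword are indistinguishable from inside foo().
results = [foo(), foo(DEFAULT), foo(a=DEFAULT)]
```

All three entries in results compare equal, and published_default is the very object used as the default.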
Python builtins are a little less regular. They effectively do their own parsing. So they could do any crazy thing they want. 99.9% of the time they do one of four standard things:
- They parse their arguments with a single call to PyArg_ParseTuple().
- They parse their arguments with a single call to PyArg_ParseTupleAndKeywords().
- They take a single argument of type "object" (METH_O).
- They take no arguments (METH_NOARGS).
PyArg_ParseTupleAndKeywords() behaves almost exactly like a Python function. PyArg_ParseTuple() is a little less like a Python function, because it doesn't support keyword arguments. (Surely this behavior is familiar to you!)
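A rough pure-Python analogue of the PyArg_ParseTuple() restriction, assuming Python 3.8 or later, where the "/" marker (which did not exist when this was written) declares positional-only parameters. tuple_style is a hypothetical name, not a real builtin.

```python
def tuple_style(a, b=0, /):
    # Parameters before "/" can only be passed by position, much like
    # a builtin that parses its arguments without keyword support.
    return a + b


by_position = tuple_style(1, 2)

try:
    tuple_style(1, b=2)
except TypeError:
    # Passing a positional-only parameter by keyword is rejected.
    keyword_rejected = True
else:
    keyword_rejected = False
```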
But then there's that funny 0.1%, the builtins that came up with their own unique approach for parsing arguments--giving them funny semantics. Argument Clinic tries to accommodate these as best it can. (That's why it supports "optional groups", for example.) But it can only do so much.
THE PROBLEM
Argument Clinic's original goal was to provide an introspection signature for every builtin in Python.
But a small percentage of builtins have funny semantics that aren't expressible in a valid Python signature. This makes them hard to convert to Argument Clinic, and makes their signatures inaccurate.
If we want these functions to have an accurate Python introspection signature, their argument parsing will have to change.
THE QUESTION
What should someone converting functions to Argument Clinic do when faced with one of these functions?
Of course, the simplest answer is "nothing"--don't convert the function to Argument Clinic. We're in beta, and any change that isn't a bugfix is impermissible. We can try again for 3.5.
But if "any change" is impermissible, then we wouldn't have the community support to convert to Argument Clinic right now. The community wants proper signatures for builtins badly enough that we're doing it now, even though we're already in beta for Python 3.4. Converting to Argument Clinic is, in the vast majority of cases, a straightforward and low-risk change--but it is a change.
Therefore perhaps the answer isn't an automatic "no". Perhaps additional straightforward, low-risk changes are permissible. The trick is, what constitutes a straightforward, low-risk change? Where should we draw the line? Let's discuss it. Perhaps a consensus will form around an answer besides a flat "no".
THE SPECIFICS
I'm sorting the problems we see into four rough categories.
a) Functions where there's a static Python value that behaves identically to not passing in that parameter (aka "the NULL problem")
Example:
_sha1.sha1(). Its optional parameter has a default value in C of
NULL. We can't express NULL in a Python signature. However, it just so happens that _sha1.sha1(b'') is exactly equivalent to _sha1.sha1(). b'' makes for a fine replacement default value.
The same holds for list.__init__(). Its optional "sequence" parameter has
a default value in C of NULL. But this signature:
list.__init__(sequence=())
works fine.
The way Clinic works, we can actually still use the NULL as the
default value in C. Clinic will let you use completely different values as the published default value in Python and the real default value in C. (Consenting adults rule and all that.) So we could lie to Python and everything works just the way we want it to.
Possible Solutions:
0) Do nothing, don't convert the function.
1) Use that clever static value as the default.
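The equivalence behind solution 1 can be checked from Python. This sketch goes through the public hashlib.sha1() wrapper rather than calling _sha1.sha1() directly, on the assumption that the two behave alike for this purpose.

```python
import hashlib

# Omitting the argument and passing empty bytes give the same digest.
digest_no_arg = hashlib.sha1().hexdigest()
digest_empty = hashlib.sha1(b"").hexdigest()

# Likewise, an empty tuple behaves like an omitted "sequence" argument.
list_no_arg = list()
list_empty = list(())
```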
b) Functions where there's no static Python value that behaves identically to not passing in that parameter (aka "the dynamic default problem")
There are functions with parameters whose defaults are mildly dynamic,
responding to other parameters.
Example:
I forget its name, but someone recently showed me a builtin that took
a list as its first parameter, and its optional second parameter
defaulted to the length of the list. As I recall this function didn't
allow negative numbers, so -1 wasn't a good fit.
Possible solutions:
0) Do nothing, don't convert the function.
1) Use a magic value as the default. Preferably one of the same type
the function accepts, but failing that use None. If they pass in
the magic value, use the previous default value. Guido himself
suggested this in
2) Use an Argument Clinic "optional group". This only works for
functions that don't support keyword arguments. Also, I hate
this, because "optional groups" are not expressible in Python
syntax, so these functions automatically have invalid signatures.
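Solution 1 might look like this in pure Python. take_prefix is a hypothetical stand-in for the builtin described above (whose name isn't given), and None is used as the magic value on the assumption that no integer is free to serve as one.

```python
def take_prefix(items, count=None):
    # None is the published default. It stands in for the dynamic
    # default "the length of the list", which depends on items.
    if count is None:
        count = len(items)
    if count < 0:
        raise ValueError("count must be non-negative")
    return items[:count]


data = [1, 2, 3]
omitted = take_prefix(data)
explicit_default = take_prefix(data, None)
explicit_count = take_prefix(data, 2)
```

With this shape, omitting count and passing the published default are the same call, which is what an accurate signature requires.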
c) Functions that accept an 'int' when they mean 'boolean' (aka the "ints instead of bools" problem)
This is specific but surprisingly common.
Before Python 3.3 there was no PyArg_ParseTuple format unit that meant
"boolean value". Functions generally used "i" (int). Even older
functions accepted an object and called PyLong_AsLong() on it.
Passing in True or False for "i" (or PyLong_AsLong()) works, because
bool is a subclass of int. But anything other than ints and bools
raises an exception.
In Python 3.3 I added the "p" format unit for boolean arguments.
This calls PyObject_IsTrue() which accepts nearly any Python value.
I assert that Python has a crystal clear definition of what
constitutes "true" and "false". These parameters are clearly
intended as booleans but they don't conform to the boolean
protocol. So I suggest every instance of this is a (very mild!)
bug. But changing these parameters to use "p" is a change: they'll
accept many more values than before.
Right now people convert these using 'int' because that's an exact
match. But sometimes they are optional, and the person doing the
conversion wants to use True or False as a default value, and it
doesn't work: Argument Clinic's type enforcement complains and
they have to work around it. (Argument Clinic has to enforce some
type-safety here because the values are used as defaults for C
variables.) I've been asked to allow True and False as defaults
for "int" parameters specifically because of this.
Example:
str.splitlines(keepends)
Possible solutions:
1) Use "bool".
2) Use "int", and I'll go relax Argument Clinic so they
can use bool values as defaults for int parameters.
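The difference between the two choices can be emulated in pure Python. This is only a sketch of the behavior described above, not how the format units are implemented in C; parse_as_int, parse_as_bool and accepts are hypothetical helpers.

```python
def parse_as_int(value):
    # Emulates the strictness described for "i": ints (and therefore
    # bools) are accepted, anything else raises.
    if not isinstance(value, int):
        raise TypeError("an integer is required")
    return int(value)


def parse_as_bool(value):
    # Emulates "p": nearly any object is accepted and judged by its
    # truthiness.
    return bool(value)


def accepts(parser, value):
    # Report whether the parser takes the value without a TypeError.
    try:
        parser(value)
    except TypeError:
        return False
    return True
```

Under this emulation, accepts(parse_as_int, True) holds but accepts(parse_as_int, "yes") does not, while parse_as_bool takes both. That widening is the behavior change that switching to "bool" would introduce.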
d) Functions with behavior that deliberately defies being expressed as a Python signature (aka the "untranslatable signature" problem)
Example:
itertools.repeat(), which behaves differently depending on whether
"times" is supplied as a positional or keyword argument. (If
"times" is <0, and was supplied via position, the function yields
0 times. If "times" is <0, and was supplied via keyword, the
function yields infinitely many times.)
Possible solutions:
0) Do nothing, don't convert the function.
1) Change the signature until it is Python compatible. This new
signature *must* accept a superset of the arguments accepted
by the existing signature. (This is being discussed right
now in issue #19145.)
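In pure Python, the behavior in the example can only be reproduced by parsing the arguments by hand, in the style of black_box() from the background section. This sketch emulates the behavior as described in this message, not the actual C code; quirky_repeat is a hypothetical name.

```python
import itertools


def quirky_repeat(obj, *args, **kwargs):
    # Hand-rolled parsing: the only way to tell a positional "times"
    # from a keyword "times".
    if len(args) > 1 or (args and kwargs) or set(kwargs) - {"times"}:
        raise TypeError("expected obj and an optional times")
    if args:
        # Positional: a negative count is clamped to zero repetitions.
        return itertools.repeat(obj, max(args[0], 0))
    if "times" in kwargs and kwargs["times"] >= 0:
        return itertools.repeat(obj, kwargs["times"])
    # Omitted, or keyword and negative: repeat forever.
    return itertools.repeat(obj)


positional_negative = list(quirky_repeat("x", -1))
keyword_negative = list(itertools.islice(quirky_repeat("x", times=-1), 3))
```

No ordinary signature such as (obj, times=...) can describe this function, because a normal Python function cannot observe how an argument was passed.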
//arry/