[Python-Dev] bpo-34595: How to format a type name?

Victor Stinner vstinner at redhat.com
Tue Sep 11 18:23:45 EDT 2018


Hi,

Last week, I opened an issue proposing a new %T formatter for PyUnicode_FromFormatV(), and so indirectly for PyUnicode_FromFormat() and PyErr_Format():

https://bugs.python.org/issue34595

I merged my change, but then Serhiy Storchaka asked if we can add something to get the "fully qualified name" (FQN) of a type, e.g. "datetime.timedelta" (FQN) vs "timedelta" (what I call the "short" name). I proposed a second pull request to add %t (short) in addition to %T (FQN).

But then Petr Viktorin asked me to open a thread on python-dev to get a wider discussion. So here I am.

The rationale for this change is to fix multiple issues:

The main issue is that at the C level, Py_TYPE(obj)->tp_name is "usually" the fully qualified name for types defined in C, but it's only the "short" name for types defined in Python.

For example, if you have the C accelerator "_datetime", Py_TYPE(obj)->tp_name of a datetime.timedelta object gives you "datetime.timedelta", but if you don't have the accelerator, tp_name is just "timedelta".

Another example: this script displays "mytimedelta(0)" if you have the C accelerator, but "__main__.mytimedelta(0)" if you use the Python implementation:

import sys
#sys.modules['_datetime'] = None
import datetime

class mytimedelta(datetime.timedelta):
    pass

print(repr(mytimedelta()))

So I would like to fix this kind of issue.

Type names are mainly used for two purposes:

It's unclear to me if we should use the "short" or the "fully qualified" name. Maybe it should be decided on a case-by-case basis.

There is also a third use: implementing __reduce__, where backward compatibility matters.

Note: The discussion has evolved since my first implementation of %T, which just used the ill-defined Py_TYPE(obj)->tp_name.

--

Petr asked me why not expose functions to get these names. For example, with my second PR (not merged), there are 3 (private) functions:

/* type.__name__ */
const char* _PyType_Name(PyTypeObject *type);
/* type.__qualname__ */
PyObject* _PyType_QualName(PyTypeObject *type);
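For readers less familiar with the two attributes named in the comments above, a small Python illustration may help. It is only a sketch: the Outer and Inner classes are made up for the example and are not part of the pull request.

```python
# Hypothetical classes, used only to show how the two attributes differ.
class Outer:
    class Inner:
        pass


# __name__ is the bare class name; __qualname__ also includes the
# enclosing class (or function) scopes, but never the module.
short_name = Outer.Inner.__name__
qualified_name = Outer.Inner.__qualname__

print(short_name)
print(qualified_name)
```

Neither attribute includes the module name, which is why a separate "fully qualified" variant comes up in this discussion.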

My concern here is that each caller has to handle errors:

PyErr_Format(PyExc_TypeError, "must be str, not %.100s", Py_TYPE(obj)->tp_name);

would become:

PyObject *type_name = _PyType_FullName(Py_TYPE(obj));
if (type_name == NULL) { /* do something with this error ... */ }
PyErr_Format(PyExc_TypeError, "must be str, not %U", type_name);
Py_DECREF(type_name);

When I report an error, I dislike having to handle new errors... I prefer that the error handling is done inside PyErr_Format() for me, to reduce the risk of additional bugs.

--

Serhiy also asked if we could expose the same feature at the Python level: provide something to get the fully qualified name of a type. It's not just f"{type(obj).__module__}.{type(obj).__name__}", because you have to skip the module for builtin types like "str" (not return "builtins.str").
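To make the builtin special case concrete, here is a minimal sketch of such a helper in pure Python. The function name fully_qualified_name is hypothetical, and using __qualname__ rather than __name__ is an assumption on my part; nothing here is an existing or proposed API.

```python
import fractions


def fully_qualified_name(tp):
    """Return "module.qualname" for a type, omitting the module for builtins.

    Hypothetical helper: the name and the exact rules are assumptions.
    """
    module = tp.__module__
    qualname = tp.__qualname__
    if module == "builtins":
        # Builtin types such as str are shown without "builtins.".
        return qualname
    return f"{module}.{qualname}"


print(fully_qualified_name(str))
print(fully_qualified_name(fractions.Fraction))
```

A real implementation would also have to decide what to do with unusual types, for example ones whose __module__ is not a string.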

Maybe we can have "name: {0:t}, FQN: {0:T}".format(type(obj)): "t" for the name and "T" for the fully qualified name. We would only have to modify type.__format__().
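The idea can be tried out today without touching type itself, because format() on a class looks up __format__ on its metaclass. The sketch below emulates the two format codes with a hypothetical metaclass, NameFormatMeta, applied to a made-up Widget class; the handling of "T" (module plus __qualname__, skipping "builtins") is an assumption about how it might behave, not a settled design.

```python
class NameFormatMeta(type):
    """Hypothetical metaclass emulating the suggested "t" and "T" codes."""

    def __format__(cls, spec):
        if spec == "t":
            # Short name only.
            return cls.__name__
        if spec == "T":
            # Fully qualified name, without "builtins." for builtin types.
            if cls.__module__ == "builtins":
                return cls.__qualname__
            return f"{cls.__module__}.{cls.__qualname__}"
        # Any other spec falls back to the default object.__format__().
        return super().__format__(spec)


class Widget(metaclass=NameFormatMeta):
    pass


message = "name: {0:t}, FQN: {0:T}".format(Widget)
print(message)
```

The module part of the output depends on where Widget is defined, for example "__main__" when the code is run as a script.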

I'm not sure if we need to add new formatters to str % args.

Example of Python code:

raise TypeError("must be str, not %s" % type(fmt).__name__)

I'm not sure about Python changes. My first concern was just to avoid Py_TYPE(obj)->tp_name at the C level. But again, we should keep C and Python consistent. If the behavior of C extensions changes, Python modules should be adapted as well, to get the same behavior.

Note: I reverted my change which added the %T formatter from PyUnicode_FromFormatV() to clarify the status of this issue.

Victor


