[Python-Dev] bpo-34595: How to format a type name?
MRAB python at mrabarnett.plus.com
Tue Sep 11 19:06:42 EDT 2018
On 2018-09-11 23:23, Victor Stinner wrote:
Hi,
Last week, I opened an issue proposing a new %T formatter for PyUnicode_FromFormatV(), and so indirectly for PyUnicode_FromFormat() and PyErr_Format():

https://bugs.python.org/issue34595

I merged my change, but then Serhiy Storchaka asked if we could add something to get the "fully qualified name" (FQN) of a type, e.g. "datetime.timedelta" (FQN) vs "timedelta" (what I call the "short" name). I proposed a second pull request to add %t (short) in addition to %T (FQN).

But then Petr Viktorin asked me to open a thread on python-dev to get a wider discussion. So here I am.

The rationale for this change is to fix multiple issues:

* C extensions use Py_TYPE(obj)->tp_name, which returns a fully qualified name for C types, but only the name (without the module) for Python types. Python modules use type(obj).__name__, which always returns the short name.

* Currently, many C extensions truncate the type name: they use "%.80s" instead of "%s" to format a type name.

* "%s" with Py_TYPE(obj)->tp_name is used more than 200 times in the C code, and I dislike this complex pattern. IMHO "%t" with obj would be simpler to read, write and maintain.

* I want C extensions and Python modules to have the same behavior: respect PEP 399. Petr considers that error messages are not part of PEP 399, but the issue is wider than only error messages.

The main issue is that at the C level, Py_TYPE(obj)->tp_name is "usually" the fully qualified name for types defined in C, but it's only the "short" name for types defined in Python.

For example, if you have the C accelerator "_datetime", Py_TYPE(obj)->tp_name of a datetime.timedelta object gives you "datetime.timedelta", but if you don't have the accelerator, tp_name is just "timedelta".

Another example: this script displays "mytimedelta(0)" if you have the C accelerator, but "__main__.mytimedelta(0)" if you use the Python implementation:
---
import sys
#sys.modules['_datetime'] = None
import datetime

class mytimedelta(datetime.timedelta):
    pass

print(repr(mytimedelta()))
---

So I would like to fix this kind of issue.

Type names are mainly used for two purposes:

* format an error message
* obj.__repr__()

It's unclear to me if we should use the "short" or the "fully qualified" name. It should maybe be decided on a case by case basis.

There is also a 3rd usage: to implement __reduce__, where backward compatibility matters.

Note: The discussion evolved since my first implementation of %T, which just used the not well defined Py_TYPE(obj)->tp_name.

--

Petr asked me why not expose functions to get these names.
For example, with my second PR (not merged), there are 3 (private) functions:

    /* type.__name__ */
    const char* _PyType_Name(PyTypeObject *type);
    /* type.__qualname__ */
    PyObject* _PyType_QualName(PyTypeObject *type);
    /* type.__module__ "." type.__qualname__
       (but type.__qualname__ for builtin types) */
    PyObject * _PyType_FullName(PyTypeObject *type);

My concern here is that each caller has to handle errors:

    PyErr_Format(PyExc_TypeError, "must be str, not %.100s",
                 Py_TYPE(obj)->tp_name);

would become:

    PyObject *type_name = _PyType_FullName(Py_TYPE(obj));
    if (type_name == NULL) {
        /* do something with this error ... */
    }
    PyErr_Format(PyExc_TypeError, "must be str, not %U", type_name);
    Py_DECREF(type_name);

When I report an error, I dislike having to handle new errors... I prefer that the error handling is done inside PyErr_Format() for me, to reduce the risk of additional bugs.

--

Serhiy also asked if we could expose the same feature at the Python level: provide something to get the fully qualified name of a type. It's not just f"{type(obj).__module__}.{type(obj).__name__}": you have to skip the module for builtin types like "str" (not return "builtins.str").

Maybe we can have "name: {0:t}, FQN: {0:T}".format(type(obj)), with "t" for the name and "T" for the fully qualified name. We would only have to modify type.__format__().

I'm not sure if we need to add new formatters to str % args.

Example of Python code:

    raise TypeError("must be str, not %s" % type(fmt).__name__)

I'm not sure about Python changes. My first concern was just to avoid Py_TYPE(obj)->tp_name at the C level. But again, we should keep C and Python consistent. If the behavior of C extensions changes, Python modules should be adapted as well, to get the same behavior.

Note: I reverted my change which added the %T formatter to PyUnicode_FromFormatV() to clarify the status of this issue.

I'm not sure about having 2 different, though similar, format codes for 2 similar, though slightly different, cases.
(And, for all we know, we might want to use "%t" at some later date for something else.)
Perhaps we could have a single format code plus an optional '#' for the "alternate form":

    %T   for short form
    %#T  for fully qualified name
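
To make the %T / %#T idea concrete at the Python level, here is a minimal sketch. Everything in it is an assumption for illustration only: short_name(), full_name() and the TypeName wrapper are hypothetical names, not CPython APIs, and the "T" / "#T" format specs are handled by the wrapper's own __format__ rather than by type.__format__(). The full-name rule follows the one described in the thread: module plus qualified name, with the module skipped for builtin types. fractions.Fraction is used because it is implemented in pure Python, so the result does not depend on a C accelerator.

```python
import fractions


def short_name(tp):
    """Return the unqualified type name, like type(obj).__name__."""
    return tp.__name__


def full_name(tp):
    """Return "module.qualname", omitting the module for builtin types."""
    if tp.__module__ == "builtins":
        return tp.__qualname__
    return f"{tp.__module__}.{tp.__qualname__}"


class TypeName:
    """Hypothetical wrapper (not a CPython API) emulating %T and %#T."""

    def __init__(self, obj):
        self._type = type(obj)

    def __format__(self, spec):
        # "T" selects the short form, "#T" the alternate (fully qualified) form.
        if spec == "T":
            return short_name(self._type)
        if spec == "#T":
            return full_name(self._type)
        raise ValueError(f"unsupported format spec: {spec!r}")


value = fractions.Fraction(1, 2)
print("must be str, not {:T}".format(TypeName(value)))   # must be str, not Fraction
print("must be str, not {:#T}".format(TypeName(value)))  # must be str, not fractions.Fraction
print("must be str, not {:#T}".format(TypeName(b"x")))   # must be str, not bytes
```

With a single code letter, the short form stays the default and the '#' flag opts in to the longer name, which leaves "t" free for some other use later.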