[Python-Dev] Avoid formatting an error message on attribute error

Brett Cannon brett at python.org
Thu Nov 7 19:44:39 CET 2013


On Thu, Nov 7, 2013 at 7:41 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:

> On 7 Nov 2013 21:34, "Victor Stinner" <victor.stinner at gmail.com> wrote:
> >
> > 2013/11/7 Steven D'Aprano <steve at pearwood.info>:
> > > My initial instinct here was to say that sounded like premature
> > > optimization, but to my surprise the overhead of generating the error
> > > message is actually significant -- at least from pure Python 3.3 code.
> >
> > I ran a quick and dirty benchmark by replacing the error message with
> > None.
> >
> > Original:
> >
> > $ ./python -m timeit 'hasattr(1, "y")'
> > 1000000 loops, best of 3: 0.354 usec per loop
> > $ ./python -m timeit -s 's=1' 'try: s.y' 'except AttributeError: pass'
> > 1000000 loops, best of 3: 0.471 usec per loop
> >
> > Patched:
> >
> > $ ./python -m timeit 'hasattr(1, "y")'
> > 10000000 loops, best of 3: 0.106 usec per loop
> > $ ./python -m timeit -s 's=1' 'try: s.y' 'except AttributeError: pass'
> > 10000000 loops, best of 3: 0.191 usec per loop
> >
> > hasattr() is 3.3x faster and try/except is 2.4x faster on this
> > microbenchmark.
> >
> > > Given that, I wonder whether it would be worth coming up with a more
> > > general solution to the question of lazily generating error messages
> > > rather than changing AttributeError specifically.
> >
> > My first question is about keeping strong references to objects (type
> > object for AttributeError). Is it an issue? If it is an issue, it's
> > maybe better to not modify the code :-)
> >
> > Yes, the lazy formatting idea can be applied to various other
> > exceptions. For example, the TypeError message is usually built using
> > PyErr_Format() to mention the name of the invalid type. Example:
> >
> > PyErr_Format(PyExc_TypeError, "exec() arg 2 must be a dict, not %.100s",
> >              globals->ob_type->tp_name);
> >
> > But it's not easy to store arbitrary C types for PyUnicode_FromFormat()
> > parameters. Argument types can be char*, Py_ssize_t, PyObject*, int,
> > etc.
> >
> > I proposed to modify (first/only) AttributeError, because it is more
> > common to ignore the AttributeError than other errors like TypeError.
> > (TypeError or UnicodeDecodeError are usually unexpected.)
> >
> > >> It would be nice to only format the message on demand. The
> > >> AttributeError would keep a reference to the type.
> > >
> > > Since only the type name is used, why keep a reference to the type
> > > instead of just type.__name__?
> >
> > In the C language, type.__name__ does not exist, it's a char* object.
> > If the type object is destroyed, type->tp_name becomes an invalid
> > pointer. So AttributeError should keep a reference on the type object.
> >
> > >> AttributeError.args would be (type, attr) instead of (message,).
> > >> ImportError was also modified to add a new "name" attribute.
>
> The existing signature continued to be supported, though.
>
> > > I don't like changing the signature of AttributeError. I've got code
> > > that raises AttributeError explicitly.
> >
> > The constructor may support different signatures for backward
> > compatibility: AttributeError(message: str) and AttributeError(type:
> > type, attr: str).
> >
> > I'm asking if anyone relies on the AttributeError.args attribute.
>
> The bigger problem is you can't change the constructor signature in a
> backwards incompatible way. You would need a new class method as an
> alternative constructor instead, or else use optional parameters.

The optional parameter approach is the one ImportError took for introducing its name and path attributes, so there is precedent. Currently it doesn't use a default exception message for backwards-compatibility, but adding one wouldn't be difficult technically.
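For readers who have not used it, here is a small illustration of the ImportError precedent: name and path are optional keyword-only arguments, the positional message still goes into args, and no default message is produced when the message is omitted. The module name and path below are made-up values.

```python
# ImportError accepts optional keyword-only "name" and "path" arguments
# alongside the traditional positional message.
try:
    raise ImportError("cannot load plugin", name="plugin", path="/tmp/plugin.py")
except ImportError as exc:
    caught = exc

print(caught.args)               # only the positional message ends up in args
print(caught.name, caught.path)  # the extra info lives in dedicated attributes

# The old single-argument signature keeps working; the new attributes
# simply default to None.
legacy = ImportError("old style")
print(legacy.name, legacy.path)

# No default message is generated when none is given.
print(repr(str(ImportError(name="plugin"))))
```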

In the case of AttributeError, what you could do is follow the suggestion I made in http://bugs.python.org/issue18156 and add an attr keyword-only argument and corresponding attribute to AttributeError. If you then add whatever other keyword arguments are needed to generate a good error message (object so you have the target, or at least the string name to prevent gc issues for large objects?), you could construct the message lazily in the __str__ method very easily, while also doing away with inconsistencies in the message, which should make doctest users happy. Lazy message creation through __str__ does leave the message out of args, though.
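A rough pure-Python sketch of that idea, using a subclass so it can be tried without patching the interpreter. The class name LazyAttributeError and the type_name keyword are hypothetical, chosen only for illustration; only attr comes from the issue mentioned above. The message wording imitates the usual built-in one.

```python
class LazyAttributeError(AttributeError):
    """Hypothetical sketch: carry structured data, format the message late."""

    def __init__(self, *args, attr=None, type_name=None):
        # Positional arguments keep their traditional meaning (the message),
        # so existing raise AttributeError("...") style code is unaffected.
        super().__init__(*args)
        self.attr = attr
        # Store only the type's name (a str), not the object itself, so the
        # exception does not keep a potentially large object alive.
        self.type_name = type_name

    def __str__(self):
        if self.args:
            # An explicit message was supplied; behave as before.
            return super().__str__()
        if self.attr is not None and self.type_name is not None:
            # The message is only built when somebody asks for it.
            return f"{self.type_name!r} object has no attribute {self.attr!r}"
        return ""


lazy = LazyAttributeError(attr="y", type_name="int")
print(lazy.args)   # empty: the message was never stored in args
print(str(lazy))   # formatted on demand

explicit = LazyAttributeError("custom message")
print(str(explicit))
```

Code that catches the exception and ignores it, as hasattr() does, never pays for the string formatting; the cost is that args no longer contains the message.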

In a perfect world (Python 4 maybe?) BaseException would take a single argument which would be an optional message, args wouldn't exist, and people would call str(exc) to get the message for the exception. That would allow subclasses to expand the API with keyword-only arguments to carry extra info and have reasonable default messages that were built on demand when __str__ was called. It would also keep args from just being a dumping ground of stuff that has no structure except by calling convention (which is not how to do an API; explicit > implicit and all). IOW the original dream of PEP 352 (http://python.org/dev/peps/pep-0352/#retracted-ideas).
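To make that shape concrete, here is a minimal sketch on top of today's Exception. Everything in it is hypothetical (MessageError, MissingOptionError, the _default_message hook); args cannot actually be removed, so the sketch just leaves it empty.

```python
class MessageError(Exception):
    """Hypothetical base class: one optional message, nothing else."""

    def __init__(self, message=None):
        super().__init__()        # deliberately leave args empty
        self._message = message

    def _default_message(self):
        # Subclasses override this hook to build a message on demand.
        return ""

    def __str__(self):
        if self._message is not None:
            return self._message
        return self._default_message()


class MissingOptionError(MessageError):
    """Hypothetical subclass: extra info travels as keyword-only arguments."""

    def __init__(self, message=None, *, section, option):
        super().__init__(message)
        self.section = section
        self.option = option

    def _default_message(self):
        return f"no option {self.option!r} in section {self.section!r}"


err = MissingOptionError(section="server", option="port")
print(str(err))                    # default message, built when requested
print(err.section, err.option)     # structured access, no args indexing

custom = MissingOptionError("port is required", section="server", option="port")
print(str(custom))                 # an explicit message wins
```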


