[Python-Dev] Exception.__unicode__ and tp_unicode

Simon Cross hodgestar+pythondev at gmail.com
Tue Jun 10 18:31:13 CEST 2008


Originally Python exceptions had no __unicode__ method. In Python 2.5 __unicode__ was added. This led to "unicode(Exception)" failing and so the addition of __unicode__ was reverted [1].

This leaves Python 2.6 in a position where calls to "unicode(Exception(u'\xe1'))" fail, as they are equivalent to "unicode(str(Exception(u'\xe1')))", which cannot convert the non-ASCII character to ASCII (or whatever the default encoding is) [2].
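The failing step can be looked at in isolation. The sketch below mimics the implicit unicode-to-str conversion by encoding the message explicitly, assuming ASCII is the default encoding. The helper name to_default_encoding is made up for illustration, and the snippet is written so that it also runs on Python 3, where the implicit step no longer exists.

```python
def to_default_encoding(message, encoding="ascii"):
    """Mimic the implicit unicode -> str step that str(Exception(u'...'))
    performs in Python 2 (hypothetical helper, for illustration only)."""
    return message.encode(encoding)


# A non-ASCII message cannot be encoded with the ASCII codec.
try:
    to_default_encoding(u"\xe1")
    failed = False
except UnicodeEncodeError:
    failed = True

# A pure-ASCII message goes through without trouble.
ascii_ok = to_default_encoding(u"plain ASCII message")
```

This is why the problem only shows up once an exception message contains a character outside the default encoding.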

From here there are 3 options:

  1. Leave things as they are.
  2. Add back __unicode__ and have "unicode(Exception)" fail.
  3. Add a tp_unicode slot to Python objects and have everything work (at the cost of adding the slot).

Each option has its drawbacks.

Ideally I'd like to see 3) implemented (there are already two volunteers and some initial stabs at an implementation) but a change to Object is going to need an okay from someone quite high up. Also, if you know of any code this would break, now is the time to let me know.

If we can't have 3) I'd like to see us fall back to option 2). Passing unicode exceptions back is useful in a number of common situations (non-English exception messages, database errors, pretty much anywhere that something goes wrong while dealing with potentially non-ASCII text) and encoding to some specific format is usually not an option since there is no way to know where the exception will eventually be caught. Also, unicode(ClassA) already fails for any class that implements __unicode__, so even without this affecting Exception it's already not safe to do u"%s" % SomeClass. There is also a readily available workaround: u"%s" % str(SomeClass).
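The difference between options 2) and 3) comes down to where the method is looked up. The following is a rough model only, written to run on Python 3: lookup_on_object and lookup_on_type are hypothetical stand-ins for the two strategies, not real CPython functions, and the type-level variant is just an approximation of what a slot on the type would do.

```python
class ClassA(object):
    def __unicode__(self):
        return u"instance of ClassA"


def lookup_on_object(obj):
    # Ordinary attribute lookup on obj itself. When obj is the class,
    # this finds the method without an instance bound to it, so the
    # call below is missing its 'self' argument.
    return getattr(obj, "__unicode__")()


def lookup_on_type(obj):
    # Look the method up on type(obj), the way slot-based special
    # methods behave (assumption: a tp_unicode slot would act similarly).
    method = getattr(type(obj), "__unicode__", None)
    if method is None:
        # Fall back to str(), as in the workaround mentioned above.
        return str(obj)
    return method(obj)


instance_result = lookup_on_type(ClassA())

try:
    lookup_on_object(ClassA)
    class_lookup_failed = False
except TypeError:
    class_lookup_failed = True

class_result = lookup_on_type(ClassA)
```

With the object-level lookup the class itself cannot be converted, while the type-level lookup handles both the instance and the class.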

I'm opposed to 1) because a full workaround means doing something like:

    def unicode_exception(e):
        if len(e.args) == 0:
            return u""
        elif len(e.args) == 1:
            return unicode(e.args[0])
        else:
            return unicode(e.args)

and then using unicode_exception(...) instead of unicode(...) whenever one needs to get a unicode value for an exception.
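For illustration, here is how that helper might be used at a call site. This is a sketch under a few assumptions: risky_operation and its message are invented for the example, and the small shim at the top exists only so the snippet also runs on Python 3, where the unicode builtin is gone.

```python
try:
    unicode  # Python 2
except NameError:
    unicode = str  # assumption: on Python 3, str plays the role of unicode


def unicode_exception(e):
    # Same logic as the helper above: choose the conversion based on
    # how many arguments the exception was constructed with.
    if len(e.args) == 0:
        return u""
    elif len(e.args) == 1:
        return unicode(e.args[0])
    else:
        return unicode(e.args)


def risky_operation():
    # Hypothetical operation that fails with a non-ASCII message.
    raise ValueError(u"valor inv\xe1lido")


try:
    risky_operation()
except ValueError as exc:
    # unicode(exc) would fail here under Python 2.6; the helper does not.
    message = unicode_exception(exc)
```

Every call site that might see a non-ASCII message has to remember to do this, which is the main cost of option 1).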

The issue doesn't affect Python 3.0 where unicode(...) is replaced by str(...).

[1] http://bugs.python.org/issue1551432
[2] http://bugs.python.org/issue2517

Schiavo Simon


