[Python-Dev] Are undocumented exceptions considered bugs?

Nick Coghlan ncoghlan at gmail.com
Sat Mar 23 16:21:53 CET 2013


On Sat, Mar 23, 2013 at 4:05 AM, Stefan Bucur <stefan.bucur at gmail.com> wrote:

Hi,

I'm not sure this is the right place to ask this question, but I thought I'd give it a shot since it also concerns the Python standard library.

It's the right place to ask :)

I'm writing an automated test case generation tool for Python programs that explores all possible execution paths through a program. When I applied this tool to Python 2.7.3's urllib package, it discovered input strings for which the urllib.urlopen(url) call would raise a TypeError.

That sounds like a really interesting tool.

For instance:

urllib.urlopen('\x00\x00\x00')
[...]
  File "/home/bucur/onion/python-bin/lib/python2.7/urllib.py", line 86, in urlopen
    return opener.open(url)
  File "/home/bucur/onion/python-bin/lib/python2.7/urllib.py", line 207, in open
    return getattr(self, name)(url)
  File "/home/bucur/onion/python-bin/lib/python2.7/urllib.py", line 462, in open_file
    return self.open_local_file(url)
  File "/home/bucur/onion/python-bin/lib/python2.7/urllib.py", line 474, in open_local_file
    stats = os.stat(localname)
TypeError: must be encoded string without NULL bytes, not str

In the urllib documentation it is only mentioned that the IOError is raised when the connection cannot be established. Since the input passed is a string (and not some other type), is the TypeError considered a bug (either in the documentation, or in the implementation)?

The general answer is that there are certain exceptions that usually aren't documented because almost all code can trigger them if you pass the right kind of invalid argument. For example, almost any API can emit TypeError or AttributeError if you pass an instance of the wrong type, and many can emit ValueError, IndexError or KeyError if you pass an incorrect value. Other errors like SyntaxError, ImportError, NameError and UnboundLocalError usually indicate bugs or environmental configuration issues, so are also typically omitted when documenting the possible exceptions for particular APIs.
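
As a rough illustration of the point above, the sketch below feeds deliberately invalid arguments to a few builtins and records which exception each one raises. The helper name `trigger` and the `results` dict are made up for this example and are not part of any library.

```python
def trigger(call):
    """Run call() and return the name of the exception it raises, or None."""
    try:
        call()
    except Exception as exc:
        return type(exc).__name__
    return None


# Each entry passes an argument of the wrong type or with an invalid value.
results = {
    "len(5)": trigger(lambda: len(5)),                # wrong type
    "None.upper()": trigger(lambda: None.upper()),    # missing attribute
    "int('abc')": trigger(lambda: int("abc")),        # right type, bad value
    "[1, 2][5]": trigger(lambda: [1, 2][5]),          # index out of range
    "{}['missing']": trigger(lambda: {}["missing"]),  # absent key
}

for expr, name in results.items():
    print(expr, "->", name)
```

Documentation for these builtins does not generally list every such exception, which is the sense in which they usually go undocumented.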

In this specific case, the error message is confusing-but-not-really-wrong, due to the "two-types-in-one" nature of Python 2.x strings - 8-bit strings are used as both text sequences (generally not containing NUL characters) and also as arbitrary binary data, including encoded text (quite likely to contain NUL bytes).

I think a bug report for this would be appropriate, with the aim of making that error message less confusing (it's a fairly obscure case, though).
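
Until the message is improved, a caller who wants to guard against this input could do something like the sketch below. `safe_stat` is a hypothetical helper, not something urllib or os provides, and it assumes text (str) paths. The exception type for embedded NUL characters has differed between Python versions (TypeError in the traceback above; I believe later versions raise ValueError), so the sketch checks for NUL up front and also catches both.

```python
import os


def safe_stat(path):
    """Hypothetical helper: return os.stat(path), or None if path is unusable.

    Assumes path is a text string, not bytes.
    """
    # Reject embedded NUL characters before they reach the OS-level call.
    if "\x00" in path:
        return None
    try:
        return os.stat(path)
    except (TypeError, ValueError, OSError):
        # TypeError/ValueError: the path was rejected before the system call.
        # OSError: the path was well-formed but could not be stat'ed.
        return None


print(safe_stat("\x00\x00\x00"))      # the input from the report above
print(safe_stat(".") is not None)     # the current directory should exist
```

Whether returning None is the right response depends on the application; re-raising a clearer exception would be just as reasonable.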

Regards, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia


