Issue 28850: Regression in Python 3: Subclassing PrettyPrinter.format doesn't work anymore

This issue was previously addressed and fixed here:

http://bugs.python.org/issue1351692

When subclassing PrettyPrinter, overriding the format() method should allow users to define custom pretty-printers.

However, for objects whose repr is short, format() is not called for the individual members.

Example code that reproduces the issue is as follows:

    import pprint
    import sys

    pprint.pprint(sys.version_info)

    class MyPrettyPrinter(pprint.PrettyPrinter):
        def format(self, object, context, maxlevels, level):
            if isinstance(object, int):
                return hex(object), True, False
            else:
                return pprint.PrettyPrinter.format(self, object, context, maxlevels, level)

    MyPrettyPrinter().pprint(10)
    MyPrettyPrinter().pprint([10])

When run with different versions of Python:

    sys.version_info(major=2, minor=7, micro=11, releaselevel='final', serial=0)
    0xa
    [0xa]

    (3, 0, 1, 'final', 0)
    0xa
    [10]

    sys.version_info(major=3, minor=2, micro=5, releaselevel='final', serial=0)
    0xa
    [10]

    sys.version_info(major=3, minor=4, micro=4, releaselevel='final', serial=0)
    0xa
    [10]

    sys.version_info(major=3, minor=6, micro=0, releaselevel='beta', serial=4)
    0xa
    [10]

You'll notice that the regression exists in all versions >= 3.0, even though the commit that fixed the issue is still in the log (2008-01-20): https://hg.python.org/cpython/log/3.0/Lib/pprint.py

I have not had the time to look at the source of the issue or provide a fix; I might do so tonight.

This issue only impacts container objects where len(repr(o)) is less than width. If the length is greater than width, containers are handled by a different code path, which is covered by a unit test and works correctly.
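The width dependence can be checked with a small sketch. The class name HexPrettyPrinter and the sample list are illustrative assumptions; the format() override is the one from the report. With a narrow width the list is split across lines and each member goes through format(), while with the default width the result depends on whether the interpreter is affected by this issue.

```python
import pprint


class HexPrettyPrinter(pprint.PrettyPrinter):
    """Illustrative subclass: render every int in hexadecimal."""

    def format(self, obj, context, maxlevels, level):
        if isinstance(obj, int):
            return hex(obj), True, False
        return super().format(obj, context, maxlevels, level)


values = [10, 20, 30]

# repr(values) is longer than width=5, so the multi-line code path is used
# and format() is called for each member.
narrow = HexPrettyPrinter(width=5).pformat(values)

# repr(values) fits within width=80, so the short-repr code path is used.
# On affected interpreters the members are not passed to format() here.
wide = HexPrettyPrinter(width=80).pformat(values)

print(narrow)
print(wide)
```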

Indeed, this case works in Python 2 but not in Python 3. The PrettyPrinter._format code has been refactored quite a lot in how it handles containers.

In both Python 2 and 3 this happens:

    PrettyPrinter.pprint([10]) calls
        PrettyPrinter._format([10]) which calls
            PrettyPrinter._repr([10]) which calls
                MyPrettyPrinter.format([10]) which calls
                    PrettyPrinter.format([10]) which calls
                        _safe_repr([10]) which calls
                            _safe_repr(10)
                            returns 10
                        returns [10]
                    returns [10]
                returns [10]
            returns [10]

But then they diverge - in Python 3 the [10] is returned as the result, but in Python 2 there is another piece of code (starting here https://github.com/python/cpython/blob/fdda200195f9747e411d3491aae0806bc1fcd919/Lib/pprint.py#L179) which overrides this result and recalculates the representation for containers. In our case it does:

    since issubclass(type([10]), list):
        # ignore the [10] calculated above, now call
        self._format(10) which calls
            PrettyPrinter._repr(10) which calls
                MyPrettyPrinter.format(10)
                returns 0xa
            returns 0xa
        returns 0xa
    returns [0xa]

This explains the difference between Python 2 and 3.

As to why the first calculation returns [10] and not [0xa]: This is because _safe_repr is defined at module scope, so once it is called we are no longer on the PrettyPrinter instance, and the format() override is no longer accessible. When _safe_repr needs to recurse on container contents, it calls itself.
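One way to observe which objects actually reach the override is to record every argument passed to format(). This is a sketch only; TracingPrettyPrinter and its seen attribute are hypothetical names introduced for illustration. On an affected interpreter the list should contain only the outer [10]; where the members are routed through format(), the inner 10 should appear as well.

```python
import pprint


class TracingPrettyPrinter(pprint.PrettyPrinter):
    """Hypothetical helper that logs every object handed to format()."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.seen = []

    def format(self, obj, context, maxlevels, level):
        # Record the object, then defer to the default implementation.
        self.seen.append(obj)
        return super().format(obj, context, maxlevels, level)


printer = TracingPrettyPrinter()
printer.pformat([10])

# The outer container always reaches format(); whether the member 10
# does depends on the interpreter version.
print(printer.seen)
```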

I think the solution is to make _safe_repr a method of PrettyPrinter, and make it call self.format() when it needs to recurse. The default self.format() just calls _safe_repr, so when there is no override the result is the same.
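The idea can be sketched without touching pprint itself. The following is not the PR's diff: it is a simplified illustration that handles only non-empty lists, and the class names RecursingPrettyPrinter and HexPrinter are hypothetical. It moves the recursion into a method that calls self.format() for each member, so a subclass override stays reachable.

```python
import pprint


class RecursingPrettyPrinter(pprint.PrettyPrinter):
    """Sketch: format() recurses through self.format() for list members."""

    def format(self, obj, context, maxlevels, level):
        if type(obj) is list and obj:
            objid = id(obj)
            # Guard against self-referencing lists.
            if objid in context:
                return "<Recursion on list with id=%d>" % objid, False, True
            # Respect the depth limit, if one was given.
            if maxlevels and level >= maxlevels:
                return "[...]", False, False
            context[objid] = 1
            readable, recursive = True, False
            parts = []
            for item in obj:
                # Key step: dispatch through self.format(), not a module function.
                rep, item_readable, item_recursive = self.format(
                    item, context, maxlevels, level + 1)
                parts.append(rep)
                readable = readable and item_readable
                recursive = recursive or item_recursive
            del context[objid]
            return "[%s]" % ", ".join(parts), readable and not recursive, recursive
        # Everything else falls back to the default behaviour.
        return super().format(obj, context, maxlevels, level)


class HexPrinter(RecursingPrettyPrinter):
    """Same override as in the report, on top of the recursing base class."""

    def format(self, obj, context, maxlevels, level):
        if isinstance(obj, int):
            return hex(obj), True, False
        return super().format(obj, context, maxlevels, level)


print(HexPrinter().pformat([10]))
print(HexPrinter().pformat([[10, 11], 12]))
```

When no subclass overrides format(), each member falls through to the default implementation, so the output for plain lists should match what the stock PrettyPrinter produces.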

The PR's diff looks substantial, but that's mostly due to the indentation of _safe_repr code. To make it easier to review, I split it into several commits, where the indentation is done in one commit as a noop change. The other commits have much smaller diffs.