Issue 28850: Regression in Python 3: Subclassing PrettyPrinter.format doesn't work anymore
This issue was previously addressed and fixed here:
http://bugs.python.org/issue1351692
When subclassing PrettyPrinter, overriding the format() method should allow users to define custom pretty-printers.
However, for objects whose repr is short, format() is not called for the individual members.
Example code that reproduces the issue is as follows:
import pprint
import sys

pprint.pprint(sys.version_info)

class MyPrettyPrinter(pprint.PrettyPrinter):
    def format(self, object, context, maxlevels, level):
        if isinstance(object, int):
            return hex(object), True, False
        else:
            return pprint.PrettyPrinter.format(self, object, context, maxlevels, level)

MyPrettyPrinter().pprint(10)
MyPrettyPrinter().pprint([10])
When run with different versions of Python:
sys.version_info(major=2, minor=7, micro=11, releaselevel='final', serial=0)
0xa
[0xa]

(3, 0, 1, 'final', 0)
0xa
[10]

sys.version_info(major=3, minor=2, micro=5, releaselevel='final', serial=0)
0xa
[10]

sys.version_info(major=3, minor=4, micro=4, releaselevel='final', serial=0)
0xa
[10]

sys.version_info(major=3, minor=6, micro=0, releaselevel='beta', serial=4)
0xa
[10]
You'll notice that the regression exists in all versions >= 3.0, even though the commit that fixed the issue is still in the log (2008-01-20): https://hg.python.org/cpython/log/3.0/Lib/pprint.py
I have not had the time to look at the source of the issue or provide a fix; I might do so tonight.
This issue only impacts container objects where the len(repr(o)) is less than width. If the length is greater than width, containers are handled by a different code path which is covered by a unit test and works correctly.
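To make the two code paths concrete, here is a minimal sketch that formats the same list with a narrow and a wide width. The class name HexPrettyPrinter and the chosen widths are illustrative assumptions, not part of the original report. With the narrow width the list is split across lines and each item should pass through format(); the result for the wide width depends on whether the interpreter includes a fix for this issue, so no expected output is given for it.

```python
import pprint


class HexPrettyPrinter(pprint.PrettyPrinter):
    """Hypothetical subclass that renders every int in hexadecimal."""

    def format(self, object, context, maxlevels, level):
        if isinstance(object, int):
            # (representation, is_readable, is_recursive)
            return hex(object), True, False
        return super().format(object, context, maxlevels, level)


data = [10, 11, 12]  # repr(data) is 12 characters long

# Width smaller than len(repr(data)): the container is laid out item by
# item, and each item is formatted through the overridden format().
wide_path = HexPrettyPrinter(width=5).pformat(data)

# Width larger than len(repr(data)): the short-repr path described above.
short_path = HexPrettyPrinter(width=80).pformat(data)

print(wide_path)
print(short_path)
```

Comparing the two printed results on a given interpreter shows whether the short-repr path honours the override there.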
For this case, indeed it works in Python 2 but not in Python 3. The PrettyPrinter._format code has been refactored quite a lot with respect to handling of containers.
In both Python 2 and 3 this happens:
PrettyPrinter.pprint([10]) calls
PrettyPrinter._format([10]) which calls
PrettyPrinter._repr([10]) which calls
MyPrettyPrinter.format([10]) which calls
PrettyPrinter.format([10]) which calls
_safe_repr([10]) which calls
_safe_repr(10)
returns 10
returns [10]
returns [10]
returns [10]
returns [10]
But then they diverge - in Python 3 the [10] is returned as the result, but in Python 2 there is another piece of code (starting here https://github.com/python/cpython/blob/fdda200195f9747e411d3491aae0806bc1fcd919/Lib/pprint.py#L179) which overrides this result and recalculates the representation for containers. In our case it does:
since issubclass(type([10]), list):
# ignore the [10] calculated above, now call
self._format(10) which calls
PrettyPrinter._repr(10) which calls
MyPrettyPrinter.format(10)
returns 0xa
returns 0xa
returns 0xa
returns [0xa]
This explains the difference between Python 2 and 3.
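One way to observe this on a given interpreter is to record every object that reaches format(). The sketch below is only an illustration: RecordingPrettyPrinter and its seen attribute are hypothetical names introduced here. If format() is only ever called with the list itself, the members are being rendered by code that bypasses the override.

```python
import pprint


class RecordingPrettyPrinter(pprint.PrettyPrinter):
    """Hypothetical subclass that logs each object passed to format()."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.seen = []

    def format(self, object, context, maxlevels, level):
        self.seen.append(object)
        # Delegate unchanged, so the output is the default one.
        return super().format(object, context, maxlevels, level)


printer = RecordingPrettyPrinter()
result = printer.pformat([10])

print(result)
# The first entry is the list itself; whether 10 also appears depends on
# whether the interpreter routes members through format().
print(printer.seen)
```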
As to why the first calculation returns [10] and not [0xa]: This is because _safe_repr is defined at module scope, so once it is called we are no longer on the PrettyPrinter instance, and the format() override is no longer accessible. When _safe_repr needs to recurse on container contents, it calls itself.
I think the solution is to make _safe_repr a method of PrettyPrinter, and make it call self.format() when it needs to recurse. The default self.format() just calls _safe_repr, so when there is no override the result is the same.
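The idea can be sketched with a toy model that is deliberately much simpler than pprint itself. All names below (the _toy_safe_repr function, the ToyPrinter classes, HexMixin) are hypothetical and the format() signature is reduced to a single argument; this is not the code from the PR, only an illustration of why recursing through self.format() makes the override visible to container members.

```python
def _toy_safe_repr(obj):
    # Module-level helper: it recurses by calling itself, so it never
    # sees a format() override defined on a printer subclass.
    if isinstance(obj, list):
        return "[" + ", ".join(_toy_safe_repr(x) for x in obj) + "]"
    return repr(obj)


class ToyPrinterBefore:
    def format(self, obj):
        return _toy_safe_repr(obj)


class ToyPrinterAfter:
    def format(self, obj):
        return self._safe_repr(obj)

    def _safe_repr(self, obj):
        # Method version: it recurses through self.format(), so a
        # subclass override is applied to every member.
        if isinstance(obj, list):
            return "[" + ", ".join(self.format(x) for x in obj) + "]"
        return repr(obj)


class HexMixin:
    def format(self, obj):
        if isinstance(obj, int):
            return hex(obj)
        return super().format(obj)


class HexBefore(HexMixin, ToyPrinterBefore):
    pass


class HexAfter(HexMixin, ToyPrinterAfter):
    pass


before = HexBefore().format([10])       # override not applied to members
after = HexAfter().format([10])         # override applied to members
default = ToyPrinterAfter().format([10])  # no override: same as before

print(before, after, default)  # prints: [10] [0xa] [10]
```

Without an override, both toy printers produce the same text, which mirrors the point that the default behaviour is unchanged.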
The PR's diff looks substantial, but that's mostly due to the indentation of _safe_repr code. To make it easier to review, I split it into several commits, where the indentation is done in one commit as a noop change. The other commits have much smaller diffs.