[Python-Dev] transitioning from % to {} formatting
Vinay Sajip vinay_sajip at yahoo.co.uk
Thu Oct 1 01:03:16 CEST 2009
Steven Bethard <steven.bethard gmail.com> writes:
> There's a lot of code already out there (in the standard library and other places) that uses %-style formatting, when in Python 3.0 we should be encouraging {}-style formatting. We should really provide some sort of transition plan. Consider an example from the logging docs:
>
> logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
>
> We'd like to support both this style as well as the following style:
>
> logging.Formatter("{asctime} - {name} - {levelname} - {message}")
In logging at least, there are two different places where the formatting issue crops up.
The first is creating the "message" part of the logging event, which is made up of a format string and arguments.
The second is the one Steven's mentioned: formatting the message along with other event data such as time of occurrence, level, logger name etc. into the final text which is output.
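To make the two places concrete, here is a minimal sketch using the existing %-style behaviour. The record values ("demo", "demo.py", line 1) are made up purely for illustration.

```python
import logging

# Build a record by hand so both stages can be seen in isolation.
# The name, pathname and lineno values are illustrative only.
record = logging.LogRecord(
    name="demo", level=logging.DEBUG, pathname="demo.py", lineno=1,
    msg="The %s is %d", args=("answer", 42), exc_info=None,
)

# Place 1: the "message" is created from the format string and arguments.
message = record.getMessage()

# Place 2: the Formatter merges that message with other event data.
formatter = logging.Formatter("%(name)s - %(levelname)s - %(message)s")
line = formatter.format(record)

print(message)
print(line)
```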
Support for both % and {} forms in logging would need to be considered in these two places. I sort of liked Martin's proposal about using different keyword arguments, but apart from the ugliness of "dicttemplate" and the fact that "fmt" is already used in Formatter.__init__ as a keyword argument, it's possible that two different keyword arguments "fmt" and "format" both referring to format strings might be confusing to some users.
Benjamin's suggestion of providing a flag to Formatter seems slightly better, as it doesn't change what existing positional or keyword parameters do, and just adds an additional, optional parameter which can start off with a default of False and transition to a default of True.
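A rough sketch of what such a flag might look like, written as a subclass so that nothing in logging itself changes. The class name FlagFormatter and the parameter name use_format are assumptions made for illustration, not an agreed API, and exception text is ignored for brevity.

```python
import logging

class FlagFormatter(logging.Formatter):
    """Hypothetical Formatter with an optional flag selecting {}-style templates."""

    def __init__(self, fmt=None, datefmt=None, use_format=False):
        # Existing positional/keyword parameters keep their meaning;
        # the parent only sees the template when it is %-style.
        super().__init__(None if use_format else fmt, datefmt)
        self._brace_fmt = fmt if use_format else None

    def format(self, record):
        if self._brace_fmt is None:
            # Flag off: unchanged %-style behaviour.
            return super().format(record)
        # Flag on: fill in the same record attributes via str.format().
        record.message = record.getMessage()
        record.asctime = self.formatTime(record, self.datefmt)
        return self._brace_fmt.format(**record.__dict__)

# Illustrative record; the values are made up.
record = logging.LogRecord("demo", logging.DEBUG, "demo.py", 1,
                           "The %s is %d", ("answer", 42), None)
old_style = FlagFormatter("%(name)s - %(message)s").format(record)
new_style = FlagFormatter("{name} - {levelname} - {message}",
                          use_format=True).format(record)
```

Note that the record's own message is still built with %-formatting here, which is exactly the gap described next.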
However, AFAICT these approaches only cover the second area where formatting options are chosen - not the creation of the message from the parameters passed to the logging call itself.
Of course one can pass arbitrary objects as messages which contain their own formatting logic. This has been possible since the very first release but I'm not sure that it's widely used, as it's usually easier to pass strings. So instead of passing a string and arguments such as
logger.debug("The %s is %d", "answer", 42)
one can currently pass, for a fictitious class PercentMessage,
logger.debug(PercentMessage("The %s is %d", "answer", 42))
and when the time comes to obtain the formatted message, LogRecord.getMessage calls str() on the PercentMessage instance, whose __str__ will use %-formatting to get the actual message.
Of course, one can also do for example
logger.debug(BraceMessage("The {} is {}", "answer", 42))
where the __str__() method on the BraceMessage will do {} formatting.
Of course, I'm not suggesting we actually use the names PercentMessage and BraceMessage, I've just used them there for clarity.
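For concreteness, the two fictitious classes could be as small as the following sketch. The constructor signatures are assumptions; the only thing logging relies on is that str() of the object yields the final message.

```python
import logging

class PercentMessage:
    """Fictitious message class: defers %-formatting until str() is called."""

    def __init__(self, fmt, *args):
        self.fmt = fmt
        self.args = args

    def __str__(self):
        return self.fmt % self.args

class BraceMessage:
    """Fictitious message class: defers {}-formatting until str() is called."""

    def __init__(self, fmt, *args, **kwargs):
        self.fmt = fmt
        self.args = args
        self.kwargs = kwargs

    def __str__(self):
        return self.fmt.format(*self.args, **self.kwargs)

# No logging arguments are passed, so getMessage() just calls str() on the object.
# The record values are illustrative only.
record = logging.LogRecord("demo", logging.DEBUG, "demo.py", 1,
                           BraceMessage("The {} is {}", "answer", 42), (), None)
brace_text = record.getMessage()
percent_text = str(PercentMessage("The %s is %d", "answer", 42))
```

Because formatting is deferred to __str__, the cost is only paid if a handler actually emits the event, just as with the existing string-plus-arguments form.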
Also, although Raymond has pointed out that it seems likely that no one ever needs both types of format string, what about the case where application A depends on libraries B and C, and they don't all share the same preferences regarding which format style to use? No one has brought this up yet, but ISTM it is a real issue. It would certainly appear to preclude any approach that configured a logging-wide or logger-wide flag to determine how to interpret the format string.
Another potential issue is where logging events are pickled and sent over sockets to be finally formatted and output on different machines. What if a sending machine has a recent version of Python, which supports {} formatting, but a receiving machine doesn't? It seems that at the very least, it would require a change to SocketHandler and DatagramHandler to format the "message" part into the LogRecord before pickling and sending. While making this change is simple, it represents a potential backwards-incompatible problem for users who have defined their own handlers for doing something similar.
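The kind of change meant here can be sketched as follows: resolve the message on the sending side, so the receiver never has to know which format style was used. The helper name make_portable_pickle is hypothetical, and the length-prefix framing a socket handler would add is left out.

```python
import logging
import pickle

def make_portable_pickle(record):
    """Hypothetical helper: format the message before pickling the record."""
    data = dict(record.__dict__)
    data["msg"] = record.getMessage()  # formatted on the sending machine
    data["args"] = None                # nothing left for the receiver to interpolate
    data["exc_info"] = None            # traceback objects cannot be pickled
    return pickle.dumps(data, protocol=2)

# Sending side (record values are illustrative only).
record = logging.LogRecord("demo", logging.DEBUG, "demo.py", 1,
                           "The %s is %d", ("answer", 42), None)
payload = make_portable_pickle(record)

# Receiving side: rebuild a record from the plain dictionary.
received = logging.makeLogRecord(pickle.loads(payload))
received_text = received.getMessage()
```

The trade-off is that the receiver loses access to the original format string and arguments, which is where user-defined handlers that expect them could break.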
Apart from thinking through the above issues, the actual formatting only happens in two locations - LogRecord.getMessage and Formatter.format - so making the code do either %- or {} formatting would be simple, as long as it knows which of % and {} to pick.
Does it seem too onerous to expect people to pass an additional "use_format" keyword argument with every logging call to indicate how to interpret the message format string? Or does the PercentMessage/BraceMessage type approach have any mileage? What do y'all think?
Regards,
Vinay Sajip