%-style formatting method

Extracted as a side-topic from Deprecate str % format operator

While %-style formatting has become much less used in favor of {}-style formatting, particularly boosted through f-strings, there are still reasonable use cases where % placeholders are better suited compared to {} placeholders (e.g. formatting TeX, dynamically creating jinja templates or regexes, where curly braces are often used literally and would need to be escaped).
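To make the escaping cost concrete, here is a small sketch that builds the same regex quantifier both ways. The pattern and the bounds are made up for illustration and are not taken from the post.

```python
import re

low, high = 2, 4  # illustrative bounds (assumption)

# %-style: the literal braces of the regex quantifier need no escaping.
percent_pattern = r"\d{%d,%d}" % (low, high)

# {}-style: every literal brace has to be doubled.
brace_pattern = r"\d{{{},{}}}".format(low, high)

# Both spellings produce the same pattern text.
assert percent_pattern == brace_pattern == r"\d{2,4}"
print(re.fullmatch(percent_pattern, "123") is not None)
```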

The % formatting operator has some quirks in how it interprets its operand: a single value can be passed directly, as in "%s" % var, unless that value is a tuple, in which case the tuple is unpacked and the formatting is applied to its elements instead.

I propose to add a new str method working on %-placeholders. Naming could be either .pformat() (“percent format”) to rhyme with .format() or .sprintf() to rhyme with other languages. Like .format() the method would support positional or keyword arguments. These would be equivalent:

"spam and {}".format("ham")
"spam and %s".pformat("ham")

"spam and {food}".format(food="ham")
"spam and %(food)s".pformat(food="ham")

Why is this reasonable:

One can still use %-placeholders
One does not have to work around the quirks of the %-formatting operator
Strong analogy to the .format() function, which makes it easier to learn/read.
The % formatting operator can be hard to read (is pattern % count a math operation or a string formatting operation?)
The % formatting operator may be less known among newer Python users as they will primarily learn f-strings. A .pformat() or .sprintf() method is easier to look up for them.
Implementation is simple - the method is only a handful of lines of code when building it on top of the operator.
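A minimal sketch of what such a method might do, written as a free function because str cannot be extended from Python code. The name pformat is the one suggested above; the rule that positional and keyword arguments cannot be mixed is an assumption of this sketch, not something the proposal specifies.

```python
def pformat(fmt, /, *args, **kwargs):
    """Hypothetical free-function stand-in for the proposed str.pformat()."""
    if args and kwargs:
        # Assumption: %-formatting takes either a tuple or a mapping, never both.
        raise TypeError("pass either positional or keyword arguments, not both")
    if kwargs:
        return fmt % kwargs  # named placeholders such as %(food)s
    return fmt % args  # args is always a tuple, so a tuple value is not unpacked


print(pformat("spam and %s", "ham"))
print(pformat("spam and %(food)s", food="ham"))
print(pformat("var: %s", (1, 2, 3)))
```

Because the positional values always arrive as a tuple, a single tuple argument is formatted as one value rather than unpacked.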

Possible counter arguments:

Its use is rare enough that we shouldn’t bother with it. - Yes it’s a tradeoff, but it has its benefits and the needed effort is limited.

jsbueno (Joao S. O. Bueno) April 30, 2025, 2:44pm 2

Things to take in consideration there:
- there are already 46 str methods (for good or for bad; it is reasonable to argue "if there are already 46, what difference would a 47th make?")
- A callable to perform the action already exists, although it may not be obvious to find, but that could be fixed with recipes in the documentation: operator.mod(str, other) will already work.

Also, instead of a str method, this could be placed somewhere else - string or operator modules, for example - and then it could be defined to work with strings and other types which lack the native % functionality, like template strings.

Personally, I’d only think about something like this in a scenario where the string transform would be passed programmatically instead of hard-coded, since one of the main advantages of % over, say, .format is its compactness, and in that case I’d naturally go to operator.mod.
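As a rough illustration of the programmatic case, the format string can be bound once and the resulting callable handed to other code. The format strings and values here are invented for the example.

```python
import operator
from functools import partial

# Bind the format string once, then pass the callable around.
zero_pad = partial(operator.mod, "%03d")
padded = list(map(zero_pad, [7, 42]))
print(padded)  # ['007', '042']

# Multiple values still have to be packed into a tuple by the caller.
pair = operator.mod("%s and %s", ("spam", "ham"))
print(pair)  # spam and ham
```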

jamestwebber (James Webber) April 30, 2025, 3:06pm 3

Or str.__mod__ for that matter. This would be a non-dunder alias for that method.

timhoffm (Tim Hoffmann) April 30, 2025, 4:05pm 4

The point is to have an easily accessible and readable alternative to the %-operator. I think neither operator.mod(str, other) nor str.__mod__(str, other) qualifies here.

Rosuav (Chris Angelico) April 30, 2025, 5:51pm 5

def sprintf(fmt, *args):
   return fmt % args

This solves most of the same problems and has a cost of just two lines of code. I’ve used this a number of times.

cjdrake (Chris Drake) April 30, 2025, 6:03pm 6

Agree.

operator.mod("%s %s", ("foo", "bar"))

is not as good as:

"%s %s".pformat("foo", "bar")

timhoffm (Tim Hoffmann) April 30, 2025, 6:25pm 7

Though

sprintf("%s %s", "foo", "bar")

Is not as good as

"%s %s".sprintf("foo", "bar")

And the question is how we weigh 6 lines in CPython against two lines each for (an estimated) few thousand users.

Rosuav (Chris Angelico) April 30, 2025, 6:35pm 8

I dunno, I’m used to sprintf being a function not a method (since it is one in C, and other derivatives). Not everything has to be a method. Python has things like map and filter as globals rather than as methods, and people are fine with that. The same sprintf function could apply equivalently to str/bytes/bytearray, which would otherwise need to be three copies of the method.
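A quick sketch of that point, reusing the two-line helper from earlier in the thread: the same function works unchanged for all three types, since each of them implements the % operator. The sample strings are made up.

```python
def sprintf(fmt, *args):
    # Works for any type that implements %, including str, bytes and bytearray.
    return fmt % args


text = sprintf("%s has %d items", "cart", 3)
raw = sprintf(b"%s has %d items", b"cart", 3)
buf = sprintf(bytearray(b"%d items"), 3)

print(text)  # cart has 3 items
print(raw)   # b'cart has 3 items'
print(buf)   # bytearray(b'3 items')
```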

timhoffm (Tim Hoffmann) April 30, 2025, 7:48pm 9

I’d be reluctant to give up the structural similarity to .format(). IMHO that’s one of the selling points.

Rosuav (Chris Angelico) April 30, 2025, 9:11pm 10

That’s fair. It does mean that it has to be a language proposal, though, and can’t simply be a couple of lines at the top of your script.

NeilGirdhar (Neil Girdhar) May 1, 2025, 8:15am 11

I think % should still be avoided in those cases. I wrote a library to format complex TeX diagrams and I just used my own delimiters for the cases in which I needed to do substitutions. In general, if you’re formatting TeX, you’re going to want to do more than just do substitutions, so you’ll need a function anyway.

Also, instead of trying to make % more convenient (and thereby more popular), I think it would be better to progressively act as if it doesn’t exist and find reasonable alternatives.

pekkaklarck (Pekka Klärck) May 1, 2025, 10:40am 12

This would be nice. I just did a large reformatting of our project and as part of that converted %-formatted strings to f-strings. In some cases the strings themselves contained so many curly braces that it was better to keep %s placeholders, though. That was itself fine, but using the %-operator felt somewhat ugly and using tmpl.pformat(...) would have been nicer. A bigger problem is that future maintainers and contributors may not know about the footguns the %-operator has.

Another benefit of a dedicated method is that it would make it possible to recommend against using the footgun operator in the docs. Possibly it could even be deprecated in the distant future.

xitop (Xitop) May 1, 2025, 11:16am 13

Both the .pformat method and the sprintf function have an advantage over the plain % formatting. The operator.mod doesn’t have this advantage. I’m talking about passing a single value. It works fine all the time except when that value happens to be a tuple:

import operator

var = (1, 2, 3)

msg = "var: %s" % var  # TypeError: not all arguments converted during string formatting
msg = "var: %s" % (var,)  # this is the correct form

msg = operator.mod("var: %s", var)   # same TypeError

def sprintf(fmt, *args):
    return fmt % args

msg = sprintf("var: %s", var)  # OK

The sprintf as shown does not support mappings. It would be easy to add it, but then it would take more than just two lines:

data = {'amount': 25}

msg = "var = %(amount)s" % data   # OK

# sprintf as defined above
msg = sprintf("var = %(amount)s", data)  # TypeError: format requires a mapping

I don’t have much to say about which form (a function or a method) the wrapper around the % operator should have, but I don’t doubt its usefulness, even if the sane handling of a single tuple argument were the only real feature.

rrolls (Rob Rolls) May 4, 2025, 11:36am 14

+1 to introducing str.pformat exactly as described in the OP.

Calling it sprintf would be OK if people prefer that, though I’m more keen on the pformat spelling.

If pformat were introduced, I’d even be happy for str.__mod__ to be deprecated or removed, but only once all supported Python versions contain pformat. (I expect this won’t happen due to the sheer amount of existing code relying on it - I’m just saying I personally wouldn’t mind.)

I’d say there is a subtle reason why this is better as a method than a function: it clearly separates the format from the args.

Compare

# method
'%s and %d and lots more words to make this long'.pformat(foo.abc, bar.xyz)

# free function
pformat('%s and %d and lots more words to make this long', foo.abc, bar.xyz)

Now although you’re unlikely to pass a literal string as one of the arguments (since you’d usually just make it part of the format string instead), it’s still valid to do so. So when you read the “free function” version, if your eye zones in on the ', immediately before foo, it’s not immediately clear that what is to the left of it is the format string, and not secretly an argument. When you read the “method” version, the separation is obvious, because the separator is '.pformat( rather than ', .

I also personally like the bonus that one does not have to worry about the “single tuple argument” case (which has been briefly mentioned above) - that has bitten me many times. I also feel that a .pformat method is simply more “natural” syntax than a % operator because it might need multiple arguments. Compare

'foo: %d bar' % num
'foo: %d bar, %d baz' % (num1, num2)  # suddenly we need parens
'foo: %r bar' % some_tuple   # WRONG
'foo: %r bar' % (some_tuple,)  # correct, but not at all obvious

'foo: %d bar'.pformat(num)
'foo: %d bar, %d baz'.pformat(num1, num2)  # existence of parens is consistent with above
'foo: %r bar'.pformat(*some_tuple)  # wrong, but you'd never do this by accident
'foo: %r bar'.pformat(some_tuple)   # RIGHT, and obvious

The pformat method is simply more consistent and more obvious, and I feel if str.__mod__ had never existed and we were newly implementing this today, str.pformat would be the obvious choice rather than str.__mod__.

Rosuav (Chris Angelico) May 4, 2025, 12:31pm 15

That’s fair. I’m not sure it’s a sufficiently compelling reason to go to the effort of making it a method (which requires the agreement of the Python core devs and will only exist from version X.Y) as opposed to a stand-alone function (which can be trivially backported). Plus, the method has to be created in three places.

But yes, that is definitely a benefit.

NeilGirdhar (Neil Girdhar) May 5, 2025, 7:41am 16

Since the only motivation given for this feature is working with strings that have a lot of curly braces, wouldn’t it be a lot easier to support changing the delimiters of format? E.g.,

txt = r"\begin{equation}x=[price:.2f]\end{equation}!"
print(txt.format(price=49, delimiters='[]'))

This has the advantage of letting developers forget the antiquated %-formatting and just use one format specification everywhere.

This has the added bonus of being able to work with strings that have both curly braces and percent characters.

Rosuav (Chris Angelico) May 5, 2025, 7:54am 17

That’s not the only motivation, there are others too. Percent formatting isn’t “antiquated” just because it’s compatible with older systems.

That said, though, this is a fine idea to discuss. I think it would do better standing on its own two feet than tacked onto this one. Personally, I don’t think I’d use it, but it’s the kind of thing that might well be worth discussing. Note that a simple kwarg wouldn’t work there - that already has meaning - but something else might work. Would be worth considering if it’s possible to do the same with f-strings.

I’d recommend starting a new thread for that, it has some promise.
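To illustrate why a plain keyword argument already has meaning: every keyword passed to str.format() is treated as a replacement-field name, and unused ones are ignored. This is only a sketch of today's behavior, with an invented format string.

```python
# "delimiters" is looked up as a field name, not interpreted as an option.
as_field = "{delimiters}".format(delimiters="[]")
print(as_field)  # []

# With no matching field, the keyword is silently ignored and the
# square brackets stay literal text.
txt = r"x=[price:.2f]"
unchanged = txt.format(price=49, delimiters="[]")
print(unchanged)  # x=[price:.2f]
```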

rrolls (Rob Rolls) May 5, 2025, 10:07am 18

No it isn’t, and no it doesn’t.

str.format has a big downside compared to %-formatting. Which is why people happily using %-formatting today will keep using it no matter how much you try to push them to switch.

From the docs:

The string on which this method is called can contain literal text or replacement fields delimited by braces {}. Each replacement field contains either the numeric index of a positional argument, or the name of a keyword argument.

So when using str.format, you either have to think of a name for every substitution and then write it twice to pass it as a kwarg, or you have to number the substitutions and then adjust the numbering every time you alter the message.

Suppose you want to make the following code change involving a %-formatted string:

# old
print('%s: starting %s' % (obj.name, obj.next))
# new
print('%s: %s done, starting %s' % (obj.name, obj.prev, obj.next))

As you can see, %s done, was inserted along with obj.prev, .

With str.format, you either have to additionally renumber ({1} is changed to {2}):

# old
print('{0}: starting {1}'.format(obj.name, obj.next))
# new
print('{0}: {1} done, starting {2}'.format(obj.name, obj.prev, obj.next))

or write a lot of names multiple times - which IMO makes the code less clear due to the additional boilerplate:

# old
print('{name}: starting {next}'.format(name=obj.name, next=obj.next))
# new
print('{name}: {prev} done, starting {next}'.format(
    name=obj.name, prev=obj.prev, next=obj.next))

This was a made-up, minimal example - in real-world usage with longer format strings and more arguments it makes a much bigger difference.

Yes, %-formatting has a downside too: if you need to reorder where the substitutions appear in the format string, you need to reorder the arguments as well, which limits what you can do with dynamic format strings. But that’s a rare case - and guess what? I, despite using %-formatting as my go-to, have used str.format on the odd occasion when this was my exact use case.

They are different tools for different problems.
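A small sketch of the reordering case mentioned above, with invented strings: explicit indices let the format string choose the order, while positional %s placeholders always consume the arguments left to right.

```python
args = ("spam", "ham")

# {}-style: the template decides which argument goes where.
forward = "{0} comes before {1}".format(*args)
reverse = "{1} comes after {0}".format(*args)

# %-style: the same reordered sentence silently swaps the meaning.
percent = "%s comes after %s" % args

print(forward)  # spam comes before ham
print(reverse)  # ham comes after spam
print(percent)  # spam comes after ham
```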

Rosuav (Chris Angelico) May 5, 2025, 10:12am 19

Which, by the way, is a solvable problem. Python could lift this notation: sprintf("%[1]s -- %[0]s", "spam", "ham"). Empty braces in .format() have the same effect and can similarly have a number added.

The minilanguage used by .format() is more verbose, the one used by printf is more compact, which means there are times you’ll want the one, and times you’ll want the other.

MegaIng (Cornelius Krupp) May 5, 2025, 10:51am 20

Or you can just not use either and get the exact same behavior as with %s. '{}: {} done, starting {}' is perfectly valid. (Which yes, isn’t explained in that quote from the docs. You need to follow the link afterwards to the full docs to learn about that.)
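Applied to the earlier example, automatic numbering might look like the sketch below. The values are placeholders; the one restriction shown is that automatic and explicit numbering cannot be mixed within a single format string.

```python
name, prev, nxt = "job", "step1", "step2"  # placeholder values

old = "{}: starting {}".format(name, nxt)
new = "{}: {} done, starting {}".format(name, prev, nxt)
print(old)  # job: starting step2
print(new)  # job: step1 done, starting step2

# Mixing "{}" with "{1}" is rejected with a ValueError.
try:
    "{}: {1}".format(name, nxt)
    mixed_ok = True
except ValueError as exc:
    mixed_ok = False
    print(exc)
```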