Issue 17236: format_spec for sequence joining

The format specification mini-language (format_spec) supported by format() and str.format() allows passing short options to the classes of the values being formatted, to drive their string representation (via their __format__ method).
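To make the mechanism concrete, here is a minimal sketch of how a format_spec reaches the class of the value being formatted. The Celsius class and its " C" suffix are hypothetical, made up purely for illustration; only format(), str.format() and the __format__ hook are existing Python behaviour.

```python
from datetime import date


class Celsius:
    """Hypothetical value type, used only to illustrate the __format__ hook."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __format__(self, format_spec):
        # format() and str.format() pass the text after the colon to this method.
        if format_spec == "":
            return str(self.degrees)
        # Delegate the numeric part to float's own format_spec handling.
        return format(self.degrees, format_spec) + " C"


temperature = Celsius(21.456)
print(format(temperature, ".1f"))          # the spec ".1f" goes to Celsius.__format__
print("{0:.1f}".format(temperature))       # same call, made through str.format()
print("{0:%Y-%m-%d}".format(date(2013, 1, 2)))  # date interprets its own spec
```

Each class decides what its format_spec means, which is why a sequence-specific spec can be added without touching format() or str.format() themselves.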

The most common operation done to sequences (lists, tuples, sets...) during conversion to string is arguably the string join operation, possibly coupled with a "nested" string formatting of the sequence items.

I propose the addition of a custom format_spec for sequences that makes it easy to specify a string for the join operation and, optionally, a nested format_spec to be passed along to format the sequence items.

Here is the proposed addition:

    seq_format_spec  ::= join_string [":" item_format_spec] | format_spec
    join_string      ::= '"' join_string_char* '"' | "'" join_string_char* "'"
    join_string_char ::= <any character except "{", "}", newline, or the quote>
    item_format_spec ::= format_spec

In words, if the format_spec for a sequence starts with a single or double quote, it will be interpreted as a join operation, optionally followed by another colon and the format_spec for the sequence items.

If the format_spec does not start with ' or ", or if the quote is not balanced (does not appear again in the format_spec), then it is assumed to be a generic format string and the implementation would call super(). This ensures backwards compatibility with existing code that may be using object's __format__ implementation on various sequence objects.
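The parsing rule described above can be sketched in pure Python. This is not a patch, only an illustration under stated assumptions: the names split_seq_format_spec and JoinableList are hypothetical, the character restrictions on join_string are not enforced, and a spec with anything other than ":" after the closing quote is treated as a generic format_spec (the grammar implies this, but it is my reading).

```python
def split_seq_format_spec(format_spec):
    """Return (join_string, item_format_spec), or None for a generic format_spec.

    Hypothetical helper; it does not enforce the join_string_char restrictions.
    """
    if not format_spec or format_spec[0] not in "'\"":
        return None  # does not start with a quote
    quote = format_spec[0]
    end = format_spec.find(quote, 1)
    if end == -1:
        return None  # quote is not balanced
    join_string = format_spec[1:end]
    rest = format_spec[end + 1:]
    if rest == "":
        return join_string, ""
    if rest[0] == ":":
        return join_string, rest[1:]
    return None  # assumption: trailing text without ":" is not a join spec


class JoinableList(list):
    """Hypothetical list subclass carrying the proposed __format__ behaviour."""

    def __format__(self, format_spec):
        parsed = split_seq_format_spec(format_spec)
        if parsed is None:
            # Generic spec: defer to the inherited implementation (object's).
            return super().__format__(format_spec)
        join_string, item_format_spec = parsed
        # An empty item_format_spec makes format(item, "") behave like str(item)
        # for ordinary objects.
        return join_string.join(format(item, item_format_spec) for item in self)


print("{0:', '}".format(JoinableList([1, 2, 3])))
print('{0:", ":.1f}'.format(JoinableList([1.0, 2.345, 3.14159])))

# Nested join: the inner lists must also understand the join spec.
grid = JoinableList([JoinableList([1, 2]), JoinableList([3, 4])])
print('{0:"; ":", "}'.format(grid))
```

Note that in this sketch the fallback reaches object.__format__, which in current Python 3 versions returns str(self) for an empty spec and raises TypeError for a non-empty one.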

Please note I'm NOT proposing a change in the language or in the implementation of format() and str.format(). This is just the addition of a __format__ method to lists, tuples, sets and other sequence classes. The choice of whether to do that in all those sequence classes or as an addition to object's __format__ is an implementation detail.

Examples:

Basic usage: either {0:", "} or {0:', '} when used in a format operation will do this: ", ".join(str(x) for x in argument_0) in a more compact, possibly more efficient, and arguably easier to read syntax.

Nested (regular) format_spec: {0:", ":.1f} will join a list of floats using ", " as the separator and .1f as the format_spec for each float.

Nested join format_spec: {0:"\n":", "} will join a list of lists, using "\n" as the outer separator and ", " as the inner separator. This could go on indefinitely (but will rarely need to do so.)
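For comparison, this is roughly what the three examples above have to be written as today, using only str.join() and format(). The sample data (values, rows) is made up for illustration.

```python
values = [1.0, 2.345, 3.14159]
rows = [[1, 2], [3, 4]]

# Basic usage: what {0:", "} would stand for.
basic = ", ".join(str(x) for x in values)

# Nested (regular) format_spec: what {0:", ":.1f} would stand for.
nested_spec = ", ".join(format(x, ".1f") for x in values)

# Nested join format_spec: outer separator "\n", inner separator ", ".
nested_join = "\n".join(", ".join(str(x) for x in row) for row in rows)

print(basic)
print(nested_spec)
print(nested_join)
```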

I do not have a patch ready, but I can work on it and submit it for evaluation, if this enhancement is accepted.