[Python-Dev] PEP 498 (interpolated f-string) tweak

Nick Coghlan ncoghlan at gmail.com
Mon Sep 21 10:31:24 CEST 2015


On 21 September 2015 at 05:22, Eric V. Smith <eric at trueblade.com> wrote:

On Sep 20, 2015, at 11:15 AM, Serhiy Storchaka <storchaka at gmail.com> wrote:

On 20.09.15 16:51, Eric V. Smith wrote:

On 9/20/2015 8:37 AM, Nick Coghlan wrote:

On 19 September 2015 at 21:03, Eric V. Smith <eric at trueblade.com> wrote:

Instead of calling __format__, I've changed the code generator to call format(expr1, spec1). As an optimization, I might add special opcodes to deal with this and string concatenation, but that's for another day (if ever).

Does this mean overriding format at the module level or in builtins will affect the way f-strings are evaluated at runtime? (I don't have a strong preference one way or the other, but I think the PEP should be explicit as to the expected behaviour rather than leaving it as implementation defined).

Yes, in the current implementation, if you mess with format(), str(), repr(), or ascii() you can break f-strings. The latter 3 are used to implement !s, !r, and !a. I have a plan to change this, by adding one or more opcodes to implement the formatting and string joining. I'll defer a decision on updating the PEP until I can establish the feasibility (and desirability) of that approach.

I propose to add an internal builtin formatter type. Instances should be marshallable and callable. The code generated for f-strings should just load a formatter constant and call it with arguments. The formatter builds the resulting string by concatenating literal strings and the results of formatting the arguments with the specified specifications.

Later we could change the compiler (just the peephole optimizer?) to replace literalstring.format(*args) and literalstring % args with calls to a precompiled formatter.

Later we could rewrite str.format, str.__mod__ and re.sub to create a temporary formatter object and call it.

Later we could expose a public API for creating formatter objects. It can be used by third-party template engines.

I think this is InterpolationTemplate from PEP 501.

It's certainly a similar idea, although PEP 501 just proposed storing strings and tuples on the code object, with the interpolation template itself still being a mutable object constructed at runtime. Serhiy's suggestion goes a step further to suggest making the template itself immutable, and passing in all the potentially mutable data as method arguments.
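To make the distinction concrete, here is a rough sketch of the kind of immutable, callable formatter described above. The class name PrecompiledFormatter and its constructor arguments are hypothetical, invented for illustration; nothing like it exists in the standard library, and a real implementation would presumably live in C and be marshallable.

```python
class PrecompiledFormatter:
    """Hypothetical immutable formatter: all constant data is fixed at creation."""

    __slots__ = ("_literals", "_specs")

    def __init__(self, literals, specs):
        # One more literal than specs: text before, between and after the fields.
        if len(literals) != len(specs) + 1:
            raise ValueError("expected len(literals) == len(specs) + 1")
        object.__setattr__(self, "_literals", tuple(literals))
        object.__setattr__(self, "_specs", tuple(specs))

    def __setattr__(self, name, value):
        raise AttributeError("PrecompiledFormatter instances are immutable")

    def __call__(self, *values):
        # All potentially mutable data arrives here, as call arguments.
        if len(values) != len(self._specs):
            raise TypeError("wrong number of field values")
        segments = [self._literals[0]]
        for value, spec, literal in zip(values, self._specs, self._literals[1:]):
            segments.append(format(value, spec))
            segments.append(literal)
        return "".join(segments)


# Roughly what f"Hello {name}, you are {age:03d} years old" might load and call.
fmt = PrecompiledFormatter(("Hello ", ", you are ", " years old"), ("", "03d"))
greeting = fmt("Nick", 7)
```

The object holds only constants, so the same instance could be shared by every evaluation of the f-string.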

I think there's a simpler approach available though, which is to go the way we went in introducing first the __import__ builtin and later the __build_class__ builtin to encapsulate some of the complexity of their respective statements without requiring a raft of new opcodes.
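For anyone unfamiliar with that precedent, the following small illustration shows how both statements are routed through a builtin in CPython. The wrapper function and the list used for recording are illustrative names only, and replacing builtins.__build_class__ like this is purely a demonstration, not a recommendation.

```python
import builtins

# The statement "import json" is roughly sugar for this call.
json_module = __import__("json")

# A class statement is likewise routed through builtins.__build_class__,
# so temporarily wrapping that builtin lets us observe class creation.
created = []
original_build_class = builtins.__build_class__


def recording_build_class(func, name, *bases, **kwds):
    created.append(name)
    return original_build_class(func, name, *bases, **kwds)


builtins.__build_class__ = recording_build_class
try:
    class Example:
        pass
finally:
    # Always restore the real builtin.
    builtins.__build_class__ = original_build_class
```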

The last draft of PEP 501 before I deferred it proposed the following for interpolation templates, since it was able to rely on having f-strings available as a primitive and wanted to offer more flexibility than string formatting needs:

_raw_template = "Substitute {names} and {expressions()} at runtime"
_parsed_template = (
    ("Substitute ", "names"),
    (" and ", "expressions()"),
    (" at runtime", None),
)
_field_values = (names, expressions())
_format_specifiers = (f"", f"")
template = types.InterpolationTemplate(_raw_template,
                                      _parsed_template,
                                      _field_values,
                                      _format_specifiers)

A __format__ builtin (or a dedicated opcode) could use a simpler data model that consisted of the following constant and variable elements:

Compile time constant: tuple of (<leading text>, <specifier constants>) pairs
Runtime variable: tuple of (<field value>, <specifier variables>) pairs

If the format string didn't end with a substitution field, then the runtime variable tuple would be 1 element shorter than the constant tuple.
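As a rough illustration of that data model, the sketch below derives the two tuples from a str.format-style template. It assumes the constant tuple holds (leading text, specifier constants) pairs and the runtime tuple holds (field value, specifier variables) pairs. The helper name split_template is hypothetical, and it leans on string.Formatter.parse, which understands plain field names but not arbitrary f-string expressions; conversions and nested fields are rejected to keep it short.

```python
from string import Formatter


def split_template(template, namespace):
    """Hypothetical helper: build (constant_parts, variable_parts) for a template."""
    constant_parts = []
    variable_parts = []
    for literal, field, spec, conversion in Formatter().parse(template):
        if field is None:
            # Trailing literal text with no substitution field after it.
            constant_parts.append((literal, None))
            continue
        if conversion is not None or "{" in spec:
            raise NotImplementedError("conversions and nested fields are omitted")
        constant_parts.append((literal, (spec,)))
        variable_parts.append((namespace[field], ()))
    return tuple(constant_parts), tuple(variable_parts)


values = {"name": "Nick", "count": 3}

# Ends with literal text: one more constant entry than runtime entries.
constants, variables = split_template("Hello {name}, {count:>4} items!", values)

# Ends with a substitution field: both tuples have the same length.
constants2, variables2 = split_template("Hello {name}", values)
```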

With that approach, then __format__ (or an opcode that popped these two tuples directly off the stack) could be defined as something like:

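def __format__(constant_parts, variable_parts):
    num_fields = len(variable_parts)
    segments = []
    for idx, (leading_text, specifier_constants) in enumerate(constant_parts):
        segments.append(leading_text)
        if idx < num_fields:
            field_value, specifier_variables = variable_parts[idx]
            if specifier_variables:
                # The specifier contains substitution fields of its own
                specifier = __format__(specifier_constants,
                                       specifier_variables)
            else:
                assert len(specifier_constants) == 1
                specifier = specifier_constants[0]
            if specifier.startswith("!"):
                # Handle "!a", "!r", "!s" by modifying field_value and specifier
                # (assumes the conversion is stored as a "!x" prefix,
                # optionally followed by ":" and the rest of the specifier)
                conversion, _, specifier = specifier[1:].partition(":")
                convert = {"a": ascii, "r": repr, "s": str}[conversion]
                field_value = convert(field_value)
            segments.append(format(field_value, specifier))
    return "".join(segments)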
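The sketch can be exercised end to end against a real f-string. The function is restated here under the hypothetical name render_parts so that the snippet runs on its own. The "!r" prefix encoding and the nested layout used for a specifier containing its own field are assumptions made for this example, not something the proposal pins down.

```python
def render_parts(constant_parts, variable_parts):
    """Join literal text with formatted field values (illustrative only)."""
    segments = []
    for idx, (leading_text, specifier_constants) in enumerate(constant_parts):
        segments.append(leading_text)
        if idx < len(variable_parts):
            field_value, specifier_variables = variable_parts[idx]
            if specifier_variables:
                # Build the specifier itself from nested constant/variable parts.
                specifier = render_parts(specifier_constants, specifier_variables)
            else:
                (specifier,) = specifier_constants
            if specifier.startswith("!"):
                # Assumed encoding: "!r" or "!r:<rest of specifier>".
                conversion, _, specifier = specifier[1:].partition(":")
                field_value = {"a": ascii, "r": repr, "s": str}[conversion](field_value)
            segments.append(format(field_value, specifier))
    return "".join(segments)


name, amount, width = "total", 3.14159, 8

# Hand-built equivalent of f"{name!r}: {amount:{width}.2f} units"
constant_parts = (
    ("", ("!r",)),
    (": ", (("", ("",)), (".2f", None))),
    (" units", None),
)
variable_parts = (
    (name, ()),
    (amount, ((width, ()),)),
)
result = render_parts(constant_parts, variable_parts)
```

Note that only the field values and the nested width are evaluated at runtime; everything else is available at compile time.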

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia


