[Python-Dev] Parsing f-strings from PEP 498 -- Literal String Interpolation

Nick Coghlan ncoghlan at gmail.com
Sat Nov 5 08:36:06 EDT 2016


On 5 November 2016 at 04:03, Fabio Zadrozny <fabiofz at gmail.com> wrote:

On Fri, Nov 4, 2016 at 3:15 PM, Eric V. Smith <eric at trueblade.com> wrote:

Using PyParser_ASTFromString is the easiest possible way to do this. Given a string, it returns an AST node. What could be simpler?

I think that for implementation purposes, given the python infrastructure, it's fine, but for specification purposes, probably incorrect... As I don't think f-strings should accept: f"start {import sys; sys.version_info[0];} end" (i.e.: PyParser_ASTFromString doesn't just return an expression, it accepts any valid Python code, even code which can't be used in an f-string).

f-strings use the "eval" parsing mode, which starts from the "eval_input" node in the grammar (which is only a couple of nodes higher than 'test', allowing tuples via 'testlist' as well as trailing newlines and EOF):

>>> ast.parse("import sys; sys.version_info[0];", mode="eval")
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
 File "/usr/lib64/python3.5/ast.py", line 35, in parse
   return compile(source, filename, mode, PyCF_ONLY_AST)
 File "<unknown>", line 1
   import sys; sys.version_info[0];
        ^
SyntaxError: invalid syntax

You have to use "exec" mode to get the parser to allow statements, which is why f-strings don't use that mode:

>>> ast.dump(ast.parse("import sys; sys.version_info[0];", mode="exec"))
"Module(body=[Import(names=[alias(name='sys', asname=None)]), Expr(value=Subscript(value=Attribute(value=Name(id='sys', ctx=Load()), attr='version_info', ctx=Load()), slice=Index(value=Num(n=0)), ctx=Load()))])"
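
The same distinction can be checked programmatically. This is only a sketch using the standard ast module; the helper name parses_as is made up for illustration. It also exercises the point above that "eval" mode accepts a bare tuple and trailing newlines.

```python
import ast

def parses_as(source, mode):
    """Return True if `source` parses in the given mode ("eval" or "exec")."""
    try:
        ast.parse(source, mode=mode)
    except SyntaxError:
        return False
    return True

statement = "import sys; sys.version_info[0];"

# Statements are rejected in "eval" mode but accepted in "exec" mode.
print(parses_as(statement, "eval"))
print(parses_as(statement, "exec"))

# A plain expression, and a tuple followed by a newline, both fit "eval" mode.
print(parses_as("sys.version_info[0]", "eval"))
print(parses_as("1, 2\n", "eval"))
```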

What is unique to f-strings, and the reason they reject some otherwise valid Python expressions, is that the f-string parser first does a pre-tokenisation pass based on:

  1. Look for an opening '{'
  2. Look for a closing '!', ':' or '}' accounting for balanced string quotes, parentheses, brackets and braces

Ignoring the surrounding quotes, and using the atom node from Python's grammar to represent the nesting tracking, and TEXT to stand in for arbitrary text, it's something akin to:

fstring: (TEXT ['{' maybe_pyexpr ('!' | ':' | '}')])+
maybe_pyexpr: (atom | TEXT)+

That isn't quite right, since it doesn't properly account for brace nesting, but it gives the general idea - there's an initial really simple tokenising pass that picks out the potential Python expressions, and then those are run through the AST parser's equivalent of eval().
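
To make the two passes concrete, here is a rough Python sketch of the idea. It is not the CPython implementation: the function names find_expressions and parse_fields are hypothetical, and the scanner deliberately ignores triple-quoted strings, backslash escapes, the '!=' operator and nested format specs. It only assumes that ast.parse in "eval" mode is a fair stand-in for the second pass.

```python
import ast

def find_expressions(text):
    """Yield the candidate expression text of each {...} field in `text`.

    Simplified illustration: handles '{{' escapes, single-character quotes
    and nesting of (), [] and {} only.
    """
    closers = {"(": ")", "[": "]", "{": "}"}
    i, n = 0, len(text)
    while i < n:
        if text[i] != "{":
            i += 1
            continue
        if text.startswith("{{", i):    # escaped literal brace, not a field
            i += 2
            continue
        start = j = i + 1               # step 1: found an opening '{'
        stack, quote = [], None
        while j < n:                    # step 2: look for '!', ':' or '}'
            c = text[j]
            if quote:
                if c == quote:
                    quote = None
            elif c in "'\"":
                quote = c
            elif c in closers:
                stack.append(closers[c])
            elif stack and c == stack[-1]:
                stack.pop()
            elif not stack and c in "!:}":
                break                   # terminator outside any nesting
            j += 1
        else:
            raise SyntaxError("unterminated replacement field")
        yield text[start:j]
        end = text.find("}", j)         # skip any conversion / format spec
        if end == -1:
            raise SyntaxError("unterminated replacement field")
        i = end + 1

def parse_fields(text):
    """Second pass: parse each candidate in "eval" mode (may raise SyntaxError)."""
    # Strip whitespace so leading spaces are not read as indentation.
    return [ast.parse(src.strip(), mode="eval") for src in find_expressions(text)]

sample = "start {sys.version_info[0]} and {d['a:b']!r} and {f(x, {1: 2})[0]:>10} end"
print(list(find_expressions(sample)))
```

With this scanner, "{lambda x: x}" yields the candidate "lambda x", because the ':' is seen outside any brackets, and that candidate then fails to parse. That is one way an otherwise valid expression ends up rejected unless it is wrapped in parentheses.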

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia


