[Python-Dev] Informal educator feedback on PEP 572 (was Re: 2018 Python Language Summit coverage, last part)

Tim Peters tim.peters at gmail.com
Sun Jul 1 00:32:03 EDT 2018


[Nick Coghlan]

"NAME := EXPR" exists on a different level of complexity, since it adds name binding in arbitrary expressions for the sake of minor performance improvement in code written by developers that are exceptionally averse to the use of vertical screen real estate,

>>> ...

[Tim]

>> Note that PEP 572 doesn't contain a single word about "performance" (neither

>> that specific word nor any synonym), and I gave only one thought to it when

>> writing Appendix A: "is this going to slow anything down significantly?".

>> The answer was never "yes", which I thought was self-evident, so I never

>> mentioned it. Neither did Chris or Guido.

>>

>> Best I can recall, nobody has argued for it on the grounds of "performance",

>> except in the indirect sense that sometimes it allows a more compact way of

>> reusing an expensive subexpression by giving it a name. Which they already

>> do by giving it a name in a separate statement, so the possible improvement

>> would be in brevity rather than performance.

[Nick]

> The PEP specifically cites this example as motivation:

The PEP gives many examples. Your original was a strawman mischaracterization of the PEP's motivations (note the plural: you only mentioned "minor performance improvement", and snipped my listing of the major motivations).

> group = re.match(data).group(1) if re.match(data) else None

>

> That code's already perfectly straightforward to read and write as a

> single line,

I disagree. In any case of textual repetition, it's a visual pattern-matching puzzle to identify the common substrings (I have to visually scan that line about 3 times to be sure), and then a potentially difficult conceptual puzzle to figure out whether side effects may result in textually identical substrings evaluating to different objects. That's why "referential transparency" is so highly valued in functional languages ("if subexpressions are spelled the same, they evaluate to the same result, period" - which isn't generally true in Python - to get that enormously helpful (to reasoning) guarantee in Python you have to ensure the subexpression is evaluated exactly once).

And as you of all people should be complaining about, textual repetition is also prone to "oops - forgot one!" and "oops! made a typo when changing the second one!" when code is later modified.
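To make the side-effect concern concrete, here is a minimal sketch. The helper next_token() and its shared counter are hypothetical, invented purely for illustration; nothing like them appears in the PEP or in this thread. It only shows that two identically spelled subexpressions can evaluate to different objects unless the subexpression is evaluated exactly once.

```python
import itertools

# Hypothetical shared state: every call to next_token() advances it.
_counter = itertools.count()


def next_token():
    """Hypothetical helper with a side effect: even counts pass, odd ones give None."""
    n = next(_counter)
    return n if n % 2 == 0 else None


# The two calls are spelled the same but are evaluated separately.
# The test sees 0 (so it passes), yet the value comes from the second call.
repeated = next_token() if next_token() is not None else "missing"

# Evaluating once and naming the result makes the test and the value agree.
token = next_token()
named = token if token is not None else "missing"

print(repeated, named)  # prints: None 2
```

The "repeated" form passes its own "is not None" test and still yields None, which is exactly the kind of surprise that naming the subexpression rules out by construction.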

> so the only reason to quibble about it

I gave you three better reasons to quibble about it just above ;-)

> is because it's slower than the arguably less clear two-line alternative:

>

> m = re.match(data)

> group = m.group(1) if m else None

I find that much clearer than the one-liner above: the visual pattern matching is easier because the repeated substring is shorter and of much simpler syntactic structure; it guarantees by construction that the two instances of m evaluate to the same object, so there's no possible concern about that (it doesn't even matter if you bound re to some "non-standard" object that has nothing to do with Python's re module); and any later changes to the single instance of re.match(data) don't have to be repeated verbatim elsewhere. It's possible that it runs twice as fast too, but that's the least of my concerns.

All of those advantages are retained in the one-liner too if an assignment expression can be used in it.
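For comparison, here is a runnable sketch of the three spellings under discussion. The pattern and sample strings are assumptions made up for illustration (the thread's re.match(data) omits the pattern argument that the real re.match requires). The third form needs Python 3.8 or later, and as the syntax was finalized there, an assignment expression used as the condition of a conditional expression has to be parenthesized.

```python
import re

# Hypothetical pattern, not taken from the thread.
PATTERN = r"id=(\d+)"


def group_repeated(data):
    # Repeats the subexpression: two separate calls to re.match.
    return re.match(PATTERN, data).group(1) if re.match(PATTERN, data) else None


def group_two_statements(data):
    # Names the match once in a separate statement.
    m = re.match(PATTERN, data)
    return m.group(1) if m else None


def group_assignment_expression(data):
    # Names the match once, inline (requires Python 3.8+).
    return m.group(1) if (m := re.match(PATTERN, data)) else None


forms = (group_repeated, group_two_statements, group_assignment_expression)
for sample in ("id=42;name=x", "no id here"):
    # All three spellings agree on both a matching and a non-matching input.
    print(sample, [form(sample) for form in forms])
```

Only the first form evaluates the match twice; the other two evaluate it exactly once and differ only in how many statements they take.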

> Thus the PEP's argument is that it wants to allow the faster version

> to remain a one-liner that preserves the overall structure of the

> version that repeats the subexpression:

>

> group = m.group(1) if m := re.match(data) else None

>

> That's a performance argument, not a readability one (as if you don't

> care about performance, you can just repeat the subexpression).

How does that differ from the part of what I said that you did retain above?

>> sometimes it allows a more compact way of reusing an expensive subexpression by giving it a name. Which they already do by giving it a name in a separate statement, so the possible improvement would be in brevity rather than performance.

You already realized the performance gain could be achieved by using two statements. The additional performance gain by using assignment expressions is at best trivial (it may save a LOAD_FAST opcode to fetch the object bound to m for the if test).
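Anyone curious about the bytecode can inspect it with the standard dis module. The sketch below is only a way to look; the pattern and function names are hypothetical, and the exact opcodes emitted differ between CPython versions, so no particular listing is claimed here.

```python
import dis
import re

# Hypothetical precompiled pattern, used only for this comparison.
PATTERN = re.compile(r"id=(\d+)")


def two_statements(data):
    m = PATTERN.match(data)
    return m.group(1) if m else None


def assignment_expression(data):
    # Requires Python 3.8+; the parentheses are mandatory here.
    return m.group(1) if (m := PATTERN.match(data)) else None


def opnames(func):
    """Return the opcode names of func's bytecode, in order."""
    return [ins.opname for ins in dis.get_instructions(func)]


# Print both opcode sequences so they can be compared side by side.
for func in (two_statements, assignment_expression):
    print(func.__name__, opnames(func))
```

Whatever difference shows up on a given interpreter, both functions call match exactly once, which is where any real cost lies.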

So, no, gaining performance is not the motivation here. You already had a way to make it "run fast". The motivation is the brevity assignment expressions allow while retaining all of the two-statement form's advantages in easier readability, easier reasoning, reduced redundancy, and performance.

As Guido said, in the PEP, of the example you gave here:

Guido found several examples where a programmer repeated a subexpression, slowing down the program, in order to save one line of code

It couldn't possibly be clearer that Guido thought the programmer's motivation was brevity ("in order to save one line of code"). Guido only happened to mention that they were willing to slow down the code to get that brevity, but, as above, they were also willing to make the code harder to read, reason about, and maintain. With the assignment expression, they don't have to give up any of the latter to get the brevity they mistakenly think ;-) they care most about - and, indeed, they can make it even briefer.

I sure don't count it against the PEP that it may trick people overly concerned with brevity into writing code that's clearer and faster too, but that's a tiny indirect part of the PEP's motivation_s_ (note the plural again).


