Issue 2292: Missing *-unpacking generalizations (original) (raw)

Created on 2008-03-15 15:41 by twouters, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (172)

msg63548 - (view)

Author: Thomas Wouters (twouters) * (Python committer)

Date: 2008-03-15 15:41

The attached patch adds the missing *-unpacking generalizations. Specifically:

a, b, *c = range(5)

*a, b, c = a, b, *c a, b, c ([0, 1, 2], 3, 4) [ *a, b, c ] [0, 1, 2, 3, 4] L = [ a, (3, 4), {5}, {6: None}, (i for i in range(7, 10)) ] [ *item for item in L ] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Also, yielding everything from an iterator:

def flatten(iterables): ... for it in iterables: ... yield *it ... L = [ a, (3, 4), {5}, {6: None}, (i for i in range(7, 10)) ] flatten(L) [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

msg63550 - (view)

Author: Alexander Belopolsky (belopolsky) * (Python committer)

Date: 2008-03-15 16:09

This was discussed years ago and never got enough support:

http://mail.python.org/pipermail/python-dev/2005-October/057177.html

msg63551 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2008-03-15 16:12

Didn't you say it does sets too? Does this work? a = [1, 2, 3] {1, *a, 0, 4} # {0, 1, 2, 3, 4}

How about dicts? kwds = {'z': 0, 'w': 12} {'x': 1, 'y': 2, **kwds} # {'x': 1, 'y': 2, 'z': 0, 'w': 12}

Also, now that we support

[*a, b, c]

shouldn't we also support

foo(*a, b, c)

msg63552 - (view)

Author: Thomas Wouters (twouters) * (Python committer)

Date: 2008-03-15 16:14

On Sat, Mar 15, 2008 at 9:12 AM, Guido van Rossum <report@bugs.python.org> wrote:

Guido van Rossum <guido@python.org> added the comment:

Didn't you say it does sets too? Does this work? a = [1, 2, 3] {1, *a, 0, 4} # {0, 1, 2, 3, 4}

Yes.

How about dicts? kwds = {'z': 0, 'w': 12} {'x': 1, 'y': 2, **kwds} # {'x': 1, 'y': 2, 'z': 0, 'w': 12}

Not yet.

Also, now that we support

[*a, b, c]

shouldn't we also support

foo(*a, b, c)

Sure. (And also 'foo(*a, *b, *c)'?) But have you taken a look lately at the function definition grammar? I need some time to sort it out :)

msg63553 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2008-03-15 16:18

Looking at the flatten() example I'm curious -- how come the output of

flatten(L)

is displayed as a list rather than as ?

Also, do I understand correctly that

yield *(1, 2, 3)

is equivalent to

yield 1 yield 2 yield 3

? (That's really cool.)

msg63554 - (view)

Author: Thomas Wouters (twouters) * (Python committer)

Date: 2008-03-15 16:22

On Sat, Mar 15, 2008 at 9:18 AM, Guido van Rossum <report@bugs.python.org> wrote:

Guido van Rossum <guido@python.org> added the comment:

Looking at the flatten() example I'm curious -- how come the output of

flatten(L)

is displayed as a list rather than as ?

It's a typo. It should've been list(flatten(L)) :-) (see the tests included in the patch.)

msg63557 - (view)

Author: Alexander Belopolsky (belopolsky) * (Python committer)

Date: 2008-03-15 16:55

It looks like I completely missed PEP 3132. Sorry for a misleading comment above.

At least I am not alone: "A little new feature in Python 3 that many overviews don't tell you about: extended unpacking." http://pyside.blogspot.com/2007/10/new-in-python-3-extended- unpacking.html

msg63563 - (view)

Author: Georg Brandl (georg.brandl) * (Python committer)

Date: 2008-03-15 20:15

Just a nit: the syntax error message for invalid starred expressions should be changed from "can use starred expression only as assignment target".

msg63859 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2008-03-18 03:02

This is fun, but needs more work (see python-3000 thread).

I'm setting the priority to low, since I won't hold up a release to get this in (if there's even a rough consensus).

msg65012 - (view)

Author: Thomas Wouters (twouters) * (Python committer)

Date: 2008-04-05 23:59

Updated patch: reworked some internals, and added generalization of functioncalls after talking with Guido. *args is now considered just another positional argument, and can occur anywhere in the positional argument section. It can also occur more than once. Keyword arguments now have to appear after *args, and **kwargs can now occur multiple times at any position in the keyword argument list. test_extcall has some examples.

(The opcodes are largely unaffected; just the order of '*args' and keyword arguments is changed. Behind the scenes, anything after the first '*args' argument is collapsed into a single *args, and everything after the first '**kwargs' is likewise collapsed. The common case (meaning any currently valid syntax, barring the 2to3 fix to swap *args and keyword arguments) does not change in meaning or codepath, just the complex cases are handled differently.)

This is still Work In Progress. To do: implement the dict unpacking syntax (the mechanics are already there for keyword arguments to functioncalls), make sure the precendence of * is correct, get more complete test coverage, iron out the cosmetic bugs in the 2to3 fixer.

Bzr branch for this patch is http://code.python.org/python/users/twouters/starunpack . There is also a branch with just the functioncall changes (although the starunpack changes are a small sprinkling on top of that branch, as it uses the same new mechanics): http://code.python.org/python/users/twouters/funcargs .

msg65018 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2008-04-06 04:34

What's dict unpacking? I hope it's not an implementation of this sad idea posted recently:

{'a': x, 'b': y} = {'a': 42, 'b': 'hello'} # Same as x, y = 42, 'hello'

:-)

msg65023 - (view)

Author: Thomas Wouters (twouters) * (Python committer)

Date: 2008-04-06 06:38

No, it's what you asked for in :

How about dicts? kwds = {'z': 0, 'w': 12} {'x': 1, 'y': 2, **kwds} # {'x': 1, 'y': 2, 'z': 0, 'w': 12}

(unpacking of dicts in dicts.)

msg65079 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2008-04-07 17:45

I think we should either get this into the 3.0a5 release planned for May 7, or wait for 3.1. I'd prefer to see some kind of PEP discussion on the python-3000 list, rather than just a BDFL approval in a tracker issue. I think it's a useful feature (especially now we already have PEP 3132) but we're getting close to the release, so I'd like to see some more eyeballs on this code... I expect the PEP discussion will be short and sweet -- either most folks like it, or we should not push through at this point in time.

msg65089 - (view)

Author: Thomas Wouters (twouters) * (Python committer)

Date: 2008-04-07 18:28

Agreed. A PEP was already on my TODO list (although I don't mind if someone else picks it up :-) I implemented the dict-unpacking-in-dict-literal syntax in the mean time; it's pushed to the starunpack bzr branch, but I'll add some actual tests before I upload the patch.

msg65095 - (view)

Author: Alexander Belopolsky (belopolsky) * (Python committer)

Date: 2008-04-07 19:00

Thomas,

Could you add BUILD_*UNPACK opcodes documentation to Doc/library/dis.rst? It would also help if you modify CALL_FUNCTION* opcodes' documentation to explain how they will interact with unpacking opcodes.

Do I understand correctly that non-starred arguments are packed into intermediate tuples/sets in the presence of starred arguments so that {a,b,c,d,e} is equivalent to {{a,b},c,{d,e}}? This should not be a problem for tuples, but with sets, it means that {a,b,c} may behave subtly differently from {a,*(b,c)}.

msg65096 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2008-04-07 19:07

Do I understand correctly that non-starred arguments are packed into intermediate tuples/sets in the presence of starred arguments so that {a,b,c,d,e} is equivalent to {{a,b},c,{d,e}}? This should not be a problem for tuples, but with sets, it means that {a,b,c} may behave subtly differently from {a,*(b,c)}.

Can you show an example where this would be different?

msg65100 - (view)

Author: Thomas Wouters (twouters) * (Python committer)

Date: 2008-04-07 19:28

On Mon, Apr 7, 2008 at 9:00 PM, Alexander Belopolsky <report@bugs.python.org> wrote:

Alexander Belopolsky <belopolsky@users.sourceforge.net> added the comment:

Thomas,

Could you add BUILD_*UNPACK opcodes documentation to Doc/library/dis.rst? It would also help if you modify CALL_FUNCTION* opcodes' documentation to explain how they will interact with unpacking opcodes.

They don't interact. They're separate opcodes. The CALL_FUNCTION_* opcodes are almost untouched, except the _VAR and _VAR_KW versions: previously, they expected, on the stack, positional arguments followed by keyword/argument pairs followed by the *args sequence followed by the *kwargs mapping (for VAR_KW.) I just changed the order so *args is before the keyword/argument pairs. The change is not related to the BUILD*_UNPACK opcodes, but rather to Guido's request that the order of keyword arguments and args in the functioncall syntax changes. For the order of execution to remain sane, the arguments need to be pushed on the stack in that order, and keeping the _VAR opcode order the same would mean a large amount of ROT_ opcodes ;-P

Updating the docs is on the TODO list.

Do I understand correctly that non-starred arguments are packed into intermediate tuples/sets in the presence of starred arguments so that {a,b,c,d,e} is equivalent to {{a,b},c,{d,e}}? This should not be a problem for tuples, but with sets, it means that {a,b,c} may behave subtly differently from {a,*(b,c)}.

Yes, that's what happens, but only in the presence of *args. For functioncalls, it only happens to everything after the first *args (inclusive.) That means '{a, b, c}' does not change, and neither does 'func(a, b, c)' or 'func(a, b, *c)'. As for sets, I don't see why this would be a problem; there is no difference in the set created by {a, b, c} and the set created by {a, *{b, c}} or {a, *(b, c)}. The arguments are all evaluated in the same order (left to right), and c replaces b, b replaces a if they are considered equal by sets.

msg65109 - (view)

Author: Alexander Belopolsky (belopolsky) * (Python committer)

Date: 2008-04-07 19:53

On Mon, Apr 7, 2008 at 3:07 PM, Guido van Rossum <report@bugs.python.org> wrote:

Can you show an example where this would be different?

Admittedly a contrived example, but

... def hash(self): ... print('hash', self) ... return int.hash(self) ...

a,b,c = map(X, range(3)) {a,b,c} hash 2 hash 1 hash 0 {0, 1, 2} {a,*(b,c)} hash 0 hash 1 hash 2 {0, 1, 2}

msg65110 - (view)

Author: Alexander Belopolsky (belopolsky) * (Python committer)

Date: 2008-04-07 19:58

I missed the first line copying the class definition into my previous post. Class 'X' was defined as follows:

class X(int): def hash(self): print('hash', self) return int.hash(self)

msg65111 - (view)

Author: Thomas Wouters (twouters) * (Python committer)

Date: 2008-04-07 20:02

I'm not sure how this matters. The order of evaluation is the same, the BUILD_SET implementation just hashes the evaluated items in a different order. You can't really rely on that particular order as it's tied closely to the stack representation CPython uses. I also see no practical reason -- or even practical way -- to abuse the hashing order. But you have given me an idea on how to improve some of the code in the BUILD_*_UNPACK opcodes, hmm.

msg65116 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2008-04-07 20:14

I agree with Thomas. The order in which hash() is evaluated shouldn't matter to your app; if hash() isn't a pure function (apart from possible caching) you've got worse trouble.

msg65117 - (view)

Author: Alexander Belopolsky (belopolsky) * (Python committer)

Date: 2008-04-07 20:15

On Mon, Apr 7, 2008 at 4:02 PM, Thomas Wouters <report@bugs.python.org> wrote:

.. The order of evaluation is the same, the BUILD_SET implementation just hashes the evaluated items in a different order. You can't really rely on that particular order as it's tied closely to the stack representation CPython uses.

I agree and that's why I said the difference in behavior is "subtle" and my example is "contrived." However, I believe Raymond expressed a similar concern in .

msg65125 - (view)

Author: Thomas Wouters (twouters) * (Python committer)

Date: 2008-04-07 21:33

I don't think the order in which the items are hashed is really what Raymond was worried about. Rather, the size of the stack was, and the effect of having all the items on the stack at the same time. I think Raymond is wrong in this case; while the stack may grow relatively big, we're only talking two pointers here. The items will all have to be created anyway, and in the usual case the number of duplicate keys is low.

My patch actually includes pretty much the same change to BUILD_MAP, because it greatly simplifies the compiler code and gets rid of a lot of extra opcodes -- causing an overal speedup even in the face of large dict literals. But I guess we should take it up with Raymond at some point, perhaps as part of the PEP discussion.

msg67465 - (view)

Author: Georg Brandl (georg.brandl) * (Python committer)

Date: 2008-05-28 21:29

What's the status of this? Is it still under consideration for 3.0 -- if yes, that should get decided before the beta.

msg88533 - (view)

Author: Terry J. Reedy (terry.reedy) * (Python committer)

Date: 2009-05-29 20:39

I was hoping this would make 3.1. Too late, I guess. What about 3.2?

msg88537 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2009-05-29 20:52

I was hoping this would make 3.1. Too late, I guess. What about 3.2?

Here's what I said before:

""" I think we should either get this into the 3.0a5 release planned for May 7, or wait for 3.1. I'd prefer to see some kind of PEP discussion on the python-3000 list, rather than just a BDFL approval in a tracker issue. I think it's a useful feature (especially now we already have PEP 3132) but we're getting close to the release, so I'd like to see some more eyeballs on this code... I expect the PEP discussion will be short and sweet -- either most folks like it, or we should not push through at this point in time. """

I still think this is the way to do it. I'm +0 myself. We might decide not to do it just because py3k needs stability more than it needs more features.

msg100293 - (view)

Author: A.M. Kuchling (akuchling) * (Python committer)

Date: 2010-03-02 14:24

Updating version; does someone want to revive this for 3.2?

msg100295 - (view)

Author: R. David Murray (r.david.murray) * (Python committer)

Date: 2010-03-02 14:31

I believe this would be blocked by the moratorium. Correct me if I'm wrong.

msg100296 - (view)

Author: R. David Murray (r.david.murray) * (Python committer)

Date: 2010-03-02 15:22

fix inadvertent reassignment.

msg100304 - (view)

Author: Terry J. Reedy (terry.reedy) * (Python committer)

Date: 2010-03-02 19:12

I would like to see this revisited when the moratorium is lifted. Added 3.3 as a surrogate for that. As per GvR above, it will need a PEP to pin down details, alternatives, and use cases and discussion on python-ideas list.

msg100305 - (view)

Author: Terry J. Reedy (terry.reedy) * (Python committer)

Date: 2010-03-02 19:15

Whoops, just noticed newish 'after moratorium' keyword. Good idea.

msg116416 - (view)

Author: Éric Araujo (eric.araujo) * (Python committer)

Date: 2010-09-14 18:05

Does

yield *it

mean

yield iter(tuple(it))

for i in it: yield i ?

msg149588 - (view)

Author: Terry J. Reedy (terry.reedy) * (Python committer)

Date: 2011-12-16 01:27

In the python-ideas thread "list / array comprehensions extension" Guido replied in reference to an earlier quote from him: "I think that -0 was contextual (too many moving parts for the original Py3k release). Today I am +1."

There are others in favor, pending a PEP that specifies all the details.

msg151143 - (view)

Author: Zbyszek Jędrzejewski-Szmek (zbysz) *

Date: 2012-01-12 18:03

#11682 will likely be merged. The part of this patch about "yielding everything from an iterator" becomes obsolete:

def flatten(iterables): ... for it in iterables: ... yield from it ... L = [ [0,1,2], (3, 4), {5}, {6: None}, (i for i in range(7, 10)) ] list(flatten(L)) [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

The rest is of course still valid and useful.

msg186099 - (view)

Author: Jeff Kaufman (Jeff.Kaufman)

Date: 2013-04-05 18:29

What would it take to get this moving again?

msg186101 - (view)

Author: R. David Murray (r.david.murray) * (Python committer)

Date: 2013-04-05 19:14

Someone needs to write the PEP.

msg191754 - (view)

Author: Alyssa Coghlan (ncoghlan) * (Python committer)

Date: 2013-06-24 12:11

Since this question just came up on python-ideas again, here's a summary of the current status:

The current patch is known to be outdated due to the inclusion of PEP 380 in Python 3.3 ("yield from itr" eliminates any need for "yield *itr")
Since this is a syntax change, a PEP is needed to elaborate on the exact details of what will be added (see PEP 3132 as an example of a PEP that covered a similar change for assignment targets)

msg192984 - (view)

Author: Joshua Landau (Joshua.Landau) *

Date: 2013-07-13 00:49

http://www.python.org/dev/peps/pep-0448/ is out; see what you think.

See http://mail.python.org/pipermail/python-ideas/2013-July/021872.html for all the juicy discussion so far.

msg203191 - (view)

Author: (fhahn)

Date: 2013-11-17 15:15

I've updated the patch to apply to the current tip.

But this patch breaks *-unpacking, I'll try to take a closer look during the next week.

msg234332 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-20 00:36

Updated the patch for 3.5.

Currently, building fails with

TypeError: init_builtin() takes exactly 1 argument (0 given)

This is probably due to an argument counting bug, but I am not sure how to debug it.

msg234342 - (view)

Author: Chris Angelico (Rosuav) *

Date: 2015-01-20 04:43

The third version of the patch is huge compared to the other two. Is it all important?

I'm seeing a different build failure, and with the size of patch, I'm not sure I'm well placed to figure out what's going on.

-- cut -- Traceback (most recent call last): File "", line 2420, in _install File "", line 2366, in _setup File "", line 2329, in _builtin_from_name File "", line 1144, in _load_unlocked File "", line 1114, in _load_backward_compatible File "", line 551, in _requires_builtin_wrapper File "", line 1247, in load_module File "", line 321, in _call_with_frames_removed TypeError: init_builtin() takes exactly 1 argument (0 given) Fatal Python error: Py_Initialize: importlib install failed

Current thread 0x00002b7f066c6b20 (most recent call first): Aborted generate-posix-vars failed -- cut --

msg234366 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-20 10:33

Hi Chris. It might be hard to notice, but you're seeing the same build failure.

Looking at the patch-to-patch differences, I didn't see anything out of the ordinary. My patch file includes more surrounding lines, dates, and is against a different repository, so it sees a lot of the new matrix multiplication operator stuff, etc. Is there a best practice for creating diff files? I just did hg diff > starunpack3.diff.

Anyway, here's a new patch with the "yield *args" code that has been supplanted by "yield from" removed.

msg234368 - (view)

Author: Chris Angelico (Rosuav) *

Date: 2015-01-20 10:48

facepalm Of course I am. I don't know how I missed that in there, but maybe I was focusing too much on the abort that followed it to actually read the exception text. Duh.

But with the latest version of the patch, I'm seeing something that I'm fairly sure is a different failure:

gcc -pthread -c -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -Werror=declaration-after-statement -I. -IInclude -I./Include -DPy_BUILD_CORE -o Python/ast.o Python/ast.c Python/ast.c: In function ‘ast_for_call’: Python/ast.c:2433:5: error: ‘npositionals’ undeclared (first use in this function) Python/ast.c:2433:5: note: each undeclared identifier is reported only once for each function it appears in make: *** [Python/ast.o] Error 1

msg234370 - (view)

Author: Berker Peksag (berker.peksag) * (Python committer)

Date: 2015-01-20 11:27

Python/ast.c:2433:5: error: ‘npositionals’ undeclared (first use in this function)

Line 2425 should be

int i, nargs, nkeywords, npositionals, ngens;

msg234372 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-20 11:54

Yup, that's it. So two problems down:

It has yet to be updated to the most recent Python version It features a now redundant replacement for "yield from" which should be removed

I'm working on:

It also loses support for calling function with keyword arguments before positional arguments, which is an unnecessary backwards-incompatible change.

I'm waiting on some feedback from python-ideas to make sure I know what should be allowed post PEP448.

msg234377 - (view)

Author: Joshua Landau (Joshua.Landau) *

Date: 2015-01-20 16:28

The problem seems to be that with the removal of

```
   else if (TYPE(ch) == STAR) {
```

       vararg = ast_for_expr(c, CHILD(n, i+1));

```
       if (!vararg)
```
```
           return NULL;
```
```
       i++;
```
```
   }
```
```
   else if (TYPE(ch) == DOUBLESTAR) {
```

       kwarg = ast_for_expr(c, CHILD(n, i+1));

```
       if (!kwarg)
```
```
           return NULL;
```
```
       i++;
```
```
   }
```

the code will ignore any subnodes that aren't of type "argument". However, the grammar still says

arglist: (argument ',')* (argument [','] | '*' test [',' '' test] | '' test)

so this is incorrect.

Here's an example of what you might get

inner( "a", argument comma *"bcd", star test comma "e", argument comma f=6, argument comma **{"g": 7}, doublestar test comma h=8, argument comma **{"i":9} doublestar test )

msg234378 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-20 16:38

Yes, thank you! That explained it. I am almost done fixing this patch. Here's my progress so far if you want to try it out. Just one test left to fix.

msg234380 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-20 17:34

All tests pass for me! Would anyone be kind enough to do a code review?

msg234384 - (view)

Author: Joshua Landau (Joshua.Landau) *

Date: 2015-01-20 19:20

This causes a segmentation fault if any keyword arguments come after a **-unpack. Minimal demo:

f(**x, x=x)

msg234385 - (view)

Author: Joshua Landau (Joshua.Landau) *

Date: 2015-01-20 19:31

Just change

if (!PyUnicode_Compare(tmp, key)) {

when iterating over prior keyword arguments to

if (tmp && !PyUnicode_Compare(tmp, key)) {

since tmp (the argument's name) can now be NULL.

msg234386 - (view)

Author: Joshua Landau (Joshua.Landau) *

Date: 2015-01-20 19:40

I take it back; that just causes

>>> f(**{}, c=2)
XXX lineno: 1, opcode: 105
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
SystemError: unknown opcode

msg234393 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-20 21:03

Thanks. It's probably compile.c under "/* Same dance again for keyword arguments */". nseen remains zero and probably shouldn't. I need to learn more about the opcodes.

msg234394 - (view)

Author: Joshua Landau (Joshua.Landau) *

Date: 2015-01-20 21:39

I think I've got it working; I'm just working out how to make a patch and adding a test or two. I think I'll also need to sign the contributor agreement.

While I'm at it, here are a few other deviations from the PEP:

{*()} and {**{}} aren't supported
[*[0] for i in [0]] gives a SystemError
"return (1, 2, 3)," fails whilst "(1, 2, 3)," succeeds

msg234397 - (view)

Author: Joshua Landau (Joshua.Landau) *

Date: 2015-01-20 22:16

This was a rather minor fix; I basically moved from STORE_SUBSCR to STORE_MAP and fixed a BUILD_MAP opcode.

msg234404 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-20 23:16

Post it? It's just "hg diff > a.diff"

msg234405 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-20 23:19

I think there will still be a problem ceval with the way the dicts are combined unfortunately, but that should be easy to fix.

msg234406 - (view)

Author: Joshua Landau (Joshua.Landau) *

Date: 2015-01-20 23:20

Aye, I'd done so (see starunpack7.diff). It was the fuss to reapply it ontop of your newer diff and making sure I'd read at least some of the devguide before barging on.

Anyhow, here's another small fix to deal with the [*[0] for i in [0]] problem. I'm not nearly confident I can touch the grammar, though, so the other problems are out of my reach.

msg234408 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-20 23:40

Thanks!

I've incorporated your changes to deal with the [*[0] for i in [0]] problem, although I don't understand them yet.

The problem with using STORE_MAP is you create a new dict for each keyword argument in that situation. I optimized that away. Good catch on the BUILD_MAP opcode problem. I could not figure out why that wasn't working!

I added some tests. Did you say you had some tests?

One of the tests that both of our code is failing on still is:

>>> def f(x, y):
...     print(x, y)
...
>>> f(x=5, **{'x': 1}, **{'x': 3}, y=2)

It's just a problem in ceval that I'll work on now.

msg234409 - (view)

Author: Joshua Landau (Joshua.Landau) *

Date: 2015-01-20 23:46

I'm getting

>>> f(x=5, **{'x': 1}, **{'x': 3}, y=2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: f() got multiple values for keyword argument 'x'

Which, as I understand, is the correct result. I'm using starunpack8.diff right now.

msg234411 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-20 23:51

Why is that correct? The PEP mentions overriding. Right now each dict overrides values from the last silently, which I think makes sense. The keyword arguments you pass in override keys from previous dicts (also good I think). The problem is that you can pass multiple duplicate keyword arguments, and the one below, which I think should succeed.

msg234413 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2015-01-20 23:58

Let's tread careful here. In regular dicts, for better or for worse, {'x': 1, 'x': 2} is defined and returns {'x': 2}. But in keyword arg processing, duplicates are always rejected. This may be an area where we need to adjust the PEP to match that expectation.

On Tue, Jan 20, 2015 at 3:51 PM, Neil Girdhar <report@bugs.python.org> wrote:

Neil Girdhar added the comment:

Why is that correct? The PEP mentions overriding. Right now each dict overrides values from the last silently, which I think makes sense. The keyword arguments you pass in override keys from previous dicts (also good I think). The problem is that you can pass multiple duplicate keyword arguments, and the one below, which I think should succeed.

Python tracker <report@bugs.python.org> <http://bugs.python.org/issue2292>

msg234414 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-21 00:08

That makes sense.

If you wanted to override, you could always write:

f(**{**a, **b, 'x': 5})

rather than

f(**a, **b, x=5)

Should I go ahead and fix it so that overriding is always wrong? E.g.,

f(**{'x': 3}, **{'x': 4})

which currently works?

msg234415 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2015-01-21 00:10

SGTM

On Tue, Jan 20, 2015 at 4:08 PM, Neil Girdhar <report@bugs.python.org> wrote:

Neil Girdhar added the comment:

That makes sense.

If you wanted to override, you could always write:

f(**{**a, **b, 'x': 5})

rather than

f(**a, **b, x=5)

Should I go ahead and fix it so that overriding is always wrong? E.g.,

f(**{'x': 3}, **{'x': 4})

which currently works?

Python tracker <report@bugs.python.org> <http://bugs.python.org/issue2292>

msg234416 - (view)

Author: Joshua Landau (Joshua.Landau) *

Date: 2015-01-21 00:17

The problem with using STORE_MAP is you create a new dict for each keyword argument in that situation.

You don't; if you look at the disassembly for producing a built-in dict ("dis.dis('{1:2, 2:3, 3:4}')") you'll see they use STORE_MAP too. STORE_MAP seems to just be the map equivalent of LIST_APPEND.

I've done simple timings that show my version being faster...

Unfortunately, it points out there is definitely a memory leak. This reproduces:

def f(a):
    pass

while True:
    f(**{}, a=1)

This goes for both patches 8 and 9.

msg234418 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-21 00:40

Could you try this and tell me how many BUILD_MAPs you're doing?

dis.dis("def f(w, x, y, z, r): pass\nf(w=1, **{'x': 2}, y=3, z=4, r=5)")

Mine does 2.

msg234419 - (view)

Author: Joshua Landau (Joshua.Landau) *

Date: 2015-01-21 00:44

2 here as well:

15 LOAD_CONST 2 ('w') 18 LOAD_CONST 3 (1) 21 BUILD_MAP 1 24 LOAD_CONST 4 (2) 27 LOAD_CONST 5 ('x') 30 STORE_MAP 31 BUILD_MAP 1 34 LOAD_CONST 6 (3) 37 LOAD_CONST 7 ('y') 40 STORE_MAP 41 LOAD_CONST 8 (4) 44 LOAD_CONST 9 ('z') 47 STORE_MAP 48 LOAD_CONST 10 (5) 51 LOAD_CONST 11 ('r') 54 STORE_MAP 55 BUILD_MAP_UNPACK 2

msg234420 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-21 00:47

If there is a speed issue, the real answer I think is to add an opcode as suggested in the source code that coalesces keyword arguments into dicts rather than "the weird dance" as the previous authors described it, or turning each argument into an individual dict and merging them at the end as you're doing…

msg234421 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-21 00:50

Ah, nice! I didn't realize what STORE_MAP did. I thought it created a map each time. We'll just do it your way.

msg234422 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-21 01:26

Detecting overrides and raising TypeError. E.g.,

>>> def f(x, y):
...     print(x, y)
...
>>> f(x=5, **{'x': 3}, y=2)
Traceback (most recent call last):
  ...
TypeError: f() got multiple values for keyword argument 'x'

msg234432 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-21 10:49

Added many tests, six of which fail. Started work on grammar to fix new tests.

msg234436 - (view)

Author: Joshua Landau (Joshua.Landau) *

Date: 2015-01-21 13:46

Some of the tests seemed to be failing simply because they were incorrect. This fixes that.

msg234445 - (view)

Author: Joshua Landau (Joshua.Landau) *

Date: 2015-01-21 21:07

I think I've fixed the memory leaks (plural).

There were also a host of other problems with the _UNPACK opcodes in ceval. Here are the things I remember fixing, although I think I did slightly more:

Not throwing an error when PyDict_New or PyDict_Update fails.
Not doing Py_DECREF on stack items being popped.
Not checking if intersection is non-NULL.
Not doing Py_DECREF on intersection.

Now the primary problem is giving good errors; I don't know how to make them look like they came from the function call. I'm not sure I want to, either, since these opcodes are used elsewhere.

I do need to check something about this (what requirements are there on how you leave the stack when you "goto error"?), but that's an easy fix if my current guess isn't right.

msg234457 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-22 00:01

Very nice! So what's left besides errors?

Fixing the grammar, ast, and compilation for the list, dict, and set comprehension element unpackings

Now the primary problem is giving good errors; I don't know how to make them look like they came from the function call. I'm not sure I want to, either, since these opcodes are used elsewhere.

The _UNPACK opcodes are new in this changelist. You can do "hg vdiff" to give a side-by-side diff or just check in the patch review.

msg234458 - (view)

Author: Joshua Landau (Joshua.Landau) *

Date: 2015-01-22 00:03

The _UNPACK opcodes are new in this changelist.

Yup, but they're used in the other unpacking syntax too:

(*(1, 2, 3), *(4, 5, 6))

msg234459 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-22 01:34

Oh, I see. For BUILD_MAP_UNPACK we don't want to raise on duplicate dict comprehension element unpackings, right? Maybe we should add a different opcode, or else a flag to the opcodes, or else use the top bit of the length parameter? What do you think?

msg234460 - (view)

Author: Joshua Landau (Joshua.Landau) *

Date: 2015-01-22 01:44

Good catch.

CALL_FUNCTION seems to split its opcode into two to give it a positional-keyword pair so this seems fine. I'd hope we can do the same thing; personally I would do:

BUILD_MAP_UNPACK(
    position_of_function_in_stack_or_0 << 8 |
    number_to_pack
)

This way if building for a function we can do the check and give good errors that match the ones raised from CALL_FUNCTION. When the top 8 bits are 0, we don't do checks. What do you think? Would dual-usage be too confusing?

msg234462 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-22 02:27

I am a huge fan of giving good errors. Looks good to me. Will we need to make sure that the call helper function we worked on produces additional BUILD_MAP_UNPACK opcodes every 256 dictionaries just in case?

msg234463 - (view)

Author: Joshua Landau (Joshua.Landau) *

Date: 2015-01-22 02:38

Functions are already limited to 255 arguments, so I don't think so.

msg234464 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-22 02:39

Another option to consider is to just use a bit on the BUILD_MAP_UNPACK and then have a stack marking opcode at the function call (not sure what to call it, but say FUNCTION_CALL_MARK)

The advantage would be you don't store or calculate relative stack positions. When the interpreter sees the mark, it stores the function call address for use in BUILD_MAP_UNPACK errors.

Although I guess you have 24 bits to store the relative stack position?

msg234465 - (view)

Author: Joshua Landau (Joshua.Landau) *

Date: 2015-01-22 02:43

According to the standard, int can be only 16 bits long so that only leaves 255/255. However, if the offset is on top of the dictionary count, this is easily enough to clear the limits for the maximum function size (worst case is a merge of 255 dicts with an offset of 1 or a merge of 2 dicts with an offset of 254).

msg234466 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-22 02:52

I see your point: if there are 255 dictionaries, there's no room for neither preceding keyword arguments nor positional arguments. Okay, then I like your solution.

msg234467 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-22 02:57

Also maybe not in this changelist, but we should consider replacing STORE_MAP and BUILD_MAP with a single opcode BUILD_MAP(n) that produces a dict out of the top n items on the stack just like BUILD_LIST(n) does. What do you think?

msg234468 - (view)

Author: Joshua Landau (Joshua.Landau) *

Date: 2015-01-22 03:13

We wouldn't want to replace STORE_MAP since that's used in dictionary comprehensions, but replacing BUILD_MAP with BUILD_MAP(n) sounds like a great idea.

msg234469 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-22 03:18

You're right.

msg234483 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-22 08:27

BUILD_MAP(n)

msg234489 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-22 12:14

By the way, Joshua if you wanted to edit the text of the PEP, it might be nice to point out that this replaces itertools.chain.from_iterable. I know you mention one use of itertools.chain, but I think this nicely replaces all uses of both:

itertools.chain(a, b, c) is (*x for x in [a, b, c]) itertools.chain.from_iterable(it) is (*x for x in it)

msg234493 - (view)

Author: Joshua Landau (Joshua.Landau) *

Date: 2015-01-22 14:32

I've looked at BUILD_MAP(n). It seems to work and has speed improvements but:

I was wrong about the 16-bit int thing. It turns out CPython is happily treating them as 32 bit as long as they are prefixed by an EXTENDED_ARG bytecode

https://docs.python.org/3/library/dis.html#opcode-EXTENDED_ARG

This could be used by BUILD_MAP rather than falling back to BUILD_MAP_UNPACK.

It's probably best to not include it here, since it's a disjoint issue. This patch wouldn't really be affected by its absence.

Also, if we limit BUILD_MAP_MERGE to 255 dictionaries, this will also apply to the {**a, **b, **c, ...} syntax, although I really can't see it being a problem.

msg234516 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-22 22:05

In that case, another option would be to use that to send the "number of maps" to CALL_FUNCTION and let it do the BUILD_MAP_UNPACK stuff itself. Would that simplify your ideas regarding error handling?

msg234518 - (view)

Author: Joshua Landau (Joshua.Landau) *

Date: 2015-01-22 22:10

Why would that simplify things?

msg234519 - (view)

Author: Joshua Landau (Joshua.Landau) *

Date: 2015-01-22 22:13

I phrased that badly. Whilst I can see minor simplifications to BUILD_MAP_UNPACK, the only way to add more information to CALL_FUNCTION_XXX would be through EXTENDED_ARG. This seems like it would outweigh any benefits, and the tiny duplication of error checking removed would be far dwarfed by the unpacking code in CALL_FUNCTION_XXX.

msg234520 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-22 22:27

Sorry, I don't know enough about how you were planning on using the stack pointer difference to produce good errors. I thought that if you waited for the CALL_FUNCTION to be happening before reporting errors about duplicate parameters it might simplify that code.

msg234521 - (view)

Author: Joshua Landau (Joshua.Landau) *

Date: 2015-01-22 22:36

I imagine it like (in the map unpacking code)

func_offset = (oparg >> 8) & 0xFF;
num_maps = oparg & 0xFF;

// later
if (func_offset) {
    // do checks
    if (repeated_argument) {
        raise_error_from_function(PEEK(func_offset + num_maps));
    }
}

This code should be relatively quick, in an already-slow opcode, and rather short.

If adding to CALL_FUNCTION_XXX, you would have to add an EXTENDED_ARG opcode (because CALL_FUNCTION_XXX already uses the bottom 16 bits), add checks for the top bits in the opcode, duplicate the (large) dictionary merging function. This doesn't seem like it saves much work.

msg234522 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-22 22:42

Okay, I didn't realize it was so simple to raise the error from somewhere else. Regarding "duplicate the (large) dictionary merging function" — of course we would just factor it out into a function.

msg234528 - (view)

Author: Joshua Landau (Joshua.Landau) *

Date: 2015-01-22 23:41

We wouldn't actually need to raise it "from" somewhere else; the line numbering and frame are already correct. The only difficulty is that the traceback currently says

# func(a=1, **{'a': 1})
TypeError: func() got multiple values for keyword argument 'arg'
           ↑↑↑↑

To do this from the UNPACK opcode would require knowing where the function is in order to print its name. (We also need to know whether to do the check at all, so we'd be hijacking some bits the UNPACK opcode anyway.)

msg234529 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-23 00:18

That's true. But wouldn't the offset always be one (or three or whatever) since if we do BUILD_MAP_UNPACK in a function call it's always right before CALL_FUNCTION?

msg234530 - (view)

Author: Joshua Landau (Joshua.Landau) *

Date: 2015-01-23 00:24

The stack will have the function, then any number of positional arguments, then optionally an *args, then any number (>= 2) of maps to unpack. To get to the function, you need to know the sum count of all of these.

msg234531 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-23 01:14

What do you mean by the stack will "have the function"? At the point that you're doing BUILD_MAP_UNPACK, CALL_FUNCTION hasn't been executed…

msg234537 - (view)

Author: Joshua Landau (Joshua.Landau) *

Date: 2015-01-23 02:59

The function object that's on the stack.

msg234538 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-23 03:01

when does that get pushed on the stack?

msg234539 - (view)

Author: Joshua Landau (Joshua.Landau) *

Date: 2015-01-23 03:08

Just before any arguments to the function.

msg234540 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-23 03:15

Oh, thanks I see, by the LOAD_BUILD_CLASS or VISIT(c, expr, e->v.Call.func) ?

Ok, I see. Why do the errors work now for f(x=1, **{'x': 2}) — these are happening at BUILD_MAP_UNPACK without a stack pointer adjustment. For me, the error message mentions 'f' by name. Is that not the error message you're trying to fix?

msg234541 - (view)

Author: Joshua Landau (Joshua.Landau) *

Date: 2015-01-23 04:03

No, that happens in CALL_FUNCTION_KW:

import dis dis.dis("f(x=1, **{'x': 1})") 1 0 LOAD_NAME 0 (f) 3 LOAD_CONST 0 ('x') 6 LOAD_CONST 1 (1) 9 LOAD_CONST 1 (1) 12 LOAD_CONST 0 ('x') 15 BUILD_MAP 1 18 CALL_FUNCTION_KW 256 (0 positional, 1 keyword pair) 21 RETURN_VALUE

There's no call to BUILD_MAP_UNPACK at all. Namely, it's raised from update_keyword_args (in turn from ext_do_call).

--- Tangential note: ---

In fact, it seems the only reason we keep the mess of unpacking in two places rather than just using BUILD_TUPLE_UNPACK and BUILD_MAP_UNPACK unconditionally is that CALL_FUNCTION_XXX looks to be slightly more efficient by only dealing with the case of a single unpack at the end. I think I see how to make the _UNPACKs fast enough for this case, though, so maybe we could remove it and unify a few things. I'd need to write it up, though.

msg234543 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-23 05:26

Ah, sorry, yes, this is what I meant (and I see what your'e trying to fix now!):

import dis def f(x,y): pass ... dis.dis("f(y=1, **{'x': 1}, x=2)") 1 0 LOAD_NAME 0 (f) 3 LOAD_CONST 0 ('y') 6 LOAD_CONST 1 (1) 9 LOAD_CONST 1 (1) 12 LOAD_CONST 2 ('x') 15 BUILD_MAP 1 18 LOAD_CONST 3 (2) 21 LOAD_CONST 2 ('x') 24 BUILD_MAP 1 27 BUILD_MAP_UNPACK 2 30 CALL_FUNCTION_KW 256 (0 positional, 1 keyword pair) 33 RETURN_VALUE f(y=1, **{'x': 1}, x=2) Traceback (most recent call last): File "", line 1, in TypeError: () got multiple values for keyword argument 'x'

msg234544 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-23 05:28

I was also thinking of unifying it all, but I couldn't find a way to ensure that the most common case (which I assume is f(x, y, z=0, w=0, *args, **kwargs)) produces no additional opcodes. If you have a nice unification, I look forward to it.

msg234663 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-25 11:57

Dict displays work.

msg234668 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-25 13:52

Dict comprehensions work.

msg234669 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-25 14:26

All tests pass, polishing. Joshua, I'm still not sure how to do show the right error for the function call with duplicate arguments…

msg234670 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-25 15:35

Fixed all tests except test_parser. New node numbers are not reflected here for some reason.

msg234681 - (view)

Author: Joshua Landau (Joshua.Landau) *

Date: 2015-01-25 18:42

Amazing, thanks.

I also just uncovered http://bugs.python.org/issue23316; we'll need to support a patch for that. In fact, bad evaluation order is why I haven't yet gotten down my unification strategy. I wouldn't worry about extra opcodes when using *args or **kwargs, though; what matters is mostly avoiding extra copies. I guess a few timeits will show whether I'm right or totally off-base.

Most of what's needed for the error stuff is already implemented; one just needs to set the top bit flag (currently just 1<<8) to "1 + arg_count_on_stack()", which is a trivial change. I'll push a patch for that after I'm done fiddling with the unification idea.

msg234711 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-26 01:36

Sounds good. I'll stop working until I see your patch. I tried to make it easy for you to implement your error idea wrt BUILD_MAP_UNPACK :)

msg234754 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-26 16:42

Is this correct?

f(*i for i in ['abc', 'def']) ['a', 'b', 'c', 'd', 'e', 'f'] [] f(**i for i in ['abc', 'def']) File "", line 1 f(**i for i in ['abc', 'def']) ^ SyntaxError: invalid syntax

Should neither work? Both?

msg234755 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-26 16:42

def f(a, *b): print(list(a), list(b))

msg234757 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2015-01-26 16:44

Both. On Jan 26, 2015 8:42 AM, "Neil Girdhar" <report@bugs.python.org> wrote:

Neil Girdhar added the comment:

Is this correct?

f(*i for i in ['abc', 'def']) ['a', 'b', 'c', 'd', 'e', 'f'] [] f(**i for i in ['abc', 'def']) File "", line 1 f(**i for i in ['abc', 'def']) ^ SyntaxError: invalid syntax

Should neither work? Both?

Python tracker <report@bugs.python.org> <http://bugs.python.org/issue2292>

msg234758 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-26 16:54

Could you help me understand this a bit better?

I always thought of f(x for x in l) as equivalent to f( (x for x in l) ).

So, I can see that f(*x for x in l) should be equivalent to f( (*x for x in l) ).

How should we interpret f(**x for x in l)? Is it then f( {**x for x in l} )?

msg234759 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2015-01-26 16:55

Wait, with that f() definition I'd expect a different result from the former -- each letter as a new arg. Right? On Jan 26, 2015 8:42 AM, "Neil Girdhar" <report@bugs.python.org> wrote:

Neil Girdhar added the comment:

def f(a, *b): print(list(a), list(b))

Python tracker <report@bugs.python.org> <http://bugs.python.org/issue2292>

msg234760 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2015-01-26 16:56

Let's wait until I have a keyboard. In 20 min. On Jan 26, 2015 8:55 AM, "Guido van Rossum" <guido@python.org> wrote:

Wait, with that f() definition I'd expect a different result from the former -- each letter as a new arg. Right? On Jan 26, 2015 8:42 AM, "Neil Girdhar" <report@bugs.python.org> wrote:

Neil Girdhar added the comment:

def f(a, *b): print(list(a), list(b))

Python tracker <report@bugs.python.org> <http://bugs.python.org/issue2292>

msg234764 - (view)

Author: Joshua Landau (Joshua.Landau) *

Date: 2015-01-26 18:28

If we're supporting

f(**x for x in y)

surely we should also support

f(x: y for x, y in z)

I personally don't like this idea.

msg234766 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2015-01-26 18:38

So I think the test function here should be:

def f(*a, **k): print(list(a), list(k))

Then we can try things like:

f(x for x in ['ab', 'cd'])

which prints a generator object, because this is interpreted as an argument that's a generator expression.

But now let's consider:

f(*x for x in ['ab', 'cd'])

I personally expected this to be equivalent to:

f(*'ab', *'cd')

IOW:

f('a', 'b', 'c', 'd')

However, it seems your current patch (#18) interprets it as still passing a single argument which is the generator expression (*x for x in ['ab', 'cd']) which in turn is equivalent to iter(['a', 'b', 'c', 'd']), IOW f() is called with a single argument that is a generator.

The PEP doesn't give clarity on what to do here. The question now is, should we interpret things like *x for x in ... as an extended form of generator expression, or as an extended form of *arg? I somehow think the latter is more useful and also the more logical extension.

My reasoning is that the PEP supports things like f(*a, *b) and it would be fairly logical to interpret f(*x for x in xs) as doing the *x thing for each x in the list xs.

I think this same interpretation works for [*x for x in xs] and {*x for x in xs}, and we can also extend it to ** in {} and in calls (obviously ** has no meaning in list comprehensions or generator expressions).

BTW I think I found another bug in patch #18:

{**1} 1

That should be an error.

An edge case I'm not sure about: should {**x} accept an iterable of (key, value) pairs, like dict(x)?

msg234767 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2015-01-26 18:41

@Joshua: Re: f(x: y for x, y in z)

I don't think that follows at all.

We support f(**d) now, but if d == {'a': 1, 'b': 2}, it is equivalent to f(a=1, b=2), not f(a: 1, b: 2).

msg234771 - (view)

Author: Joshua Landau (Joshua.Landau) *

Date: 2015-01-26 19:18

Quick-fix for Guido's bug attached. I'm not familiar with this part of the code, yet, so take this tentatively. I just changed

while (containers > 1) {

while (containers) {

@Guido

My comments were assuming f(**x for x in y) meant f({**x for x in y}).

I see your reasoning, but I don't like how your version has

 (x for y in z for x in y) ==  (*y for y in z)
f(x for y in z for x in y) != f(*y for y in z)

This seems like a tripping point. I've never wanted to unpack a 2D iterable into an argument list, so personally I'm not convinced by the value-add either.

msg234772 - (view)

Author: Joshua Landau (Joshua.Landau) *

Date: 2015-01-26 19:37

Update for the error messages fix.

I've put aside the idea of unifying things for now because there are a couple of interdependencies I wasn't expecting and I absolutely don't want the fast-path for f(x) to get slower.

msg234785 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-26 22:36

fixed a minor bug with the function address, and made a number of polishing changes.

msg234800 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-26 23:53

Everything seems to work except two tests are still failing: parser and venv. Any ideas Joshua?

Also, we have to decide what to do with f(*x for x in l) and f(**x for x in l).

msg234914 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-28 20:42

Fixed a couple bugs and added a test.

Incremented the magic number.

msg234916 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-28 21:46

Just need to fix the parser now. Minimal example:

parser.sequence2st(parser.expr("{1}").totuple()) Traceback (most recent call last): File "", line 1, in parser.ParserError: Expected node type 12, got 302.

msg235009 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-30 00:32

All tests pass.

@Guido: can we get some clarification on f(*… and f(**…? One option is to make them illegal for now and then open them up in a future PEP when it's more clear what's wanted?

msg235018 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2015-01-30 02:46

Hi Neil,

I think disallowing them is the best approach. There doesn't seem to be a good use case that would be thwarted. Hopefully the syntactic prohibition isn't too hard to implement.

On Thu, Jan 29, 2015 at 4:32 PM, Neil Girdhar <report@bugs.python.org> wrote:

Neil Girdhar added the comment:

All tests pass.

@Guido: can we get some clarification on f(*… and f(**…? One option is to make them illegal for now and then open them up in a future PEP when it's more clear what's wanted?

Added file: http://bugs.python.org/file37911/starunpack25.diff

Python tracker <report@bugs.python.org> <http://bugs.python.org/issue2292>

msg235029 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-30 08:45

Ready for a code review:

Blocked f(*x for x…) as requested.

Polished up parsermodule.c.

msg235030 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-30 11:42

Fixed a bug and added a test.

msg235031 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-30 11:54

Is it possible to edit the PEP to reflect the current design decisions?

Specifically:

Remove: "Because of the new levity for * and ** unpackings, it may be advisable to lift some or all of these restrictions." (in both abstract and specification)
Extend: "Currently duplicate arguments raise TypeError. This remains true for duplicate arguments provided through multiple keyword argument unpackings, e.g. f(**{'x': 2}, **{'x': 3})
Add some examples of dictionary overriding to the list of examples:

{'x': 1, **{'x': 2}} {'x': 2}

{**{'x': 2}, 'x': 1} {'x': 1}

Remove "if the restrictions are kept" (they are)
Remove "If they are removed completely..."
In disadvantages, remove "if the current are kept" (they are). Don't write "* unpackings", write "iterable unpackings"
Remove "if the current restrictions are lifted"
Remove "Implementation" section (it's done!)
Add to specification: "f(*x for x in it) and f(**x for x in it)" remain SyntaxErrors.

msg235038 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-30 16:01

Fixed a bug in ceval.c; added a test to test_unpack_ex.

msg235040 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2015-01-30 16:22

For the PEP update, please check out the PEP repo at hg.python.org and send a patch to peps@python.org. On Jan 30, 2015 3:54 AM, "Neil Girdhar" <report@bugs.python.org> wrote:

Neil Girdhar added the comment:

Is it possible to edit the PEP to reflect the current design decisions?

Specifically:

Remove: "Because of the new levity for * and ** unpackings, it may be advisable to lift some or all of these restrictions." (in both abstract and specification)

Extend: "Currently duplicate arguments raise TypeError. This remains true for duplicate arguments provided through multiple keyword argument unpackings, e.g. f(**{'x': 2}, **{'x': 3})

Add some examples of dictionary overriding to the list of examples:

{'x': 1, **{'x': 2}} {'x': 2}

{**{'x': 2}, 'x': 1} {'x': 1}

Remove "if the restrictions are kept" (they are)

Remove "If they are removed completely..."

In disadvantages, remove "if the current are kept" (they are). Don't write "* unpackings", write "iterable unpackings"

Remove "if the current restrictions are lifted"

Remove "Implementation" section (it's done!)

Add to specification: "f(*x for x in it) and f(**x for x in it)" remain SyntaxErrors.

Python tracker <report@bugs.python.org> <http://bugs.python.org/issue2292>

msg235041 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-01-30 16:36

Another bug, another test.

msg235058 - (view)

Author: Joshua Landau (Joshua.Landau) *

Date: 2015-01-30 21:48

Special-cased (*i for i in x) to use YIELD_FROM instead of looping. Speed improved, albeit still only half as fast as chain.from_iterable.

Fixed error message check in test_syntax and removed semicolons.

msg235281 - (view)

Author: Ethan Furman (ethan.furman) * (Python committer)

Date: 2015-02-02 18:52

Tried running tests, tests that failed with patch:

test_ast
test_collections
test_extcall
test_grammar
test_importlib
test_parser
test_syntax
test_unpack_ex
test_zipfile

Running Ubuntu 13.04 (GNU/Linux 3.8.0-22-generic x86_64).

Should I copy/paste the errors, email them, or what? It's going to be a wall of text.

msg235292 - (view)

Author: Joshua Landau (Joshua.Landau) *

Date: 2015-02-02 22:09

I don't know the etiquette rules for the issue tracker, but I'd really appreciate having something to debug -- it's working for me, you see.

msg235297 - (view)

Author: Ethan Furman (ethan.furman) * (Python committer)

Date: 2015-02-02 22:37

Argh -- I applied the patch, but didn't recompile. Doing that now...

msg235302 - (view)

Author: Terry J. Reedy (terry.reedy) * (Python committer)

Date: 2015-02-03 00:02

Upload a .txt file if there is really too much for a message.

msg235310 - (view)

Author: Ethan Furman (ethan.furman) * (Python committer)

Date: 2015-02-03 00:08

Thanks, Terry! I'll do that next time -- after I make sure and re-compile. :/

msg235368 - (view)

Author: Ethan Furman (ethan.furman) * (Python committer)

Date: 2015-02-04 01:09

All test pass on my system. :)

msg235641 - (view)

Author: Benjamin Peterson (benjamin.peterson) * (Python committer)

Date: 2015-02-09 22:23

For starters, it would be nice if the patch didn't make unrelated style changes (e.g. in compile.c). I also Call arguments should be unified into one list rather than distinguishing between "keywords" and "args" anymore.

msg235646 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-02-09 23:15

Thank you, Benjamin.

It's my nature to keep code consistent/clean, but I realize that I can get carried away. Should I revert all incidental PEP 7 style changes?

Regarding the args/keywords, where do you mean? If you're talking about compile.c, we can't merge them because the call_function operand expects to see the positional arguments below the keyword arguments on the stack.

msg235647 - (view)

Author: Benjamin Peterson (benjamin.peterson) * (Python committer)

Date: 2015-02-09 23:17

On Mon, Feb 9, 2015, at 18:15, Neil Girdhar wrote:

Neil Girdhar added the comment:

Thank you, Benjamin.

It's my nature to keep code consistent/clean, but I realize that I can get carried away. Should I revert all incidental PEP 7 style changes?

Yes, please.

Regarding the args/keywords, where do you mean? If you're talking about compile.c, we can't merge them because the call_function operand expects to see the positional arguments below the keyword arguments on the stack.

You can ignore this suggestion, since it occurred to me after misinterpreting the PEP.

msg235648 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-02-09 23:41

Removed incidental PEP 7 changes and reran tests.

msg236557 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2015-02-25 00:05

The patch no longer applies cleanly. I had to do "hg up -r ac0d6c09457e" to get a checkpoint to which it applies. (But I'm not sure at what point that landed me.)

msg236702 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-02-26 20:04

New changelist for updated patch (before merging changes).

msg236703 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-02-26 20:22

Finished merge.

msg237257 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-03-05 11:00

Removed dead code. Awaiting code review! :)

msg237688 - (view)

Author: Ethan Furman (ethan.furman) * (Python committer)

Date: 2015-03-09 18:29

All tests pass on my ubuntu 13.04 system.

msg238751 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-03-21 02:56

Removed a confusing and outdated comment in Lib/importlib/_bootstrap.py

msg241113 - (view)

Author: Steve Dower (steve.dower) * (Python committer)

Date: 2015-04-15 15:18

I have limited expertise in most of these areas, but I looked at starunpack40.diff and have these comments:

tests look to have good coverage of the feature (can't speak to coverage of the parser/compiler code)
parsermodule.c changes comprehension handling, but I thought we pulled this out?
why was dictobject.c.h added? I don't understand clinic thoroughly, but it seems a lot of new code for what was changed in dictobject.c
can the BUILD_(TUPLE|LIST)_UNPACK code in ceval.c share most of the processing? Or is there an unwanted perf impact to that?
whoever applies the patch should regenerate importlib.h themselves - just a reminder

Otherwise, I didn't see anything that particularly scared me. Since we apparently don't have anyone willing and with the expertise to thoroughly check the patch, I'd vote for checking it in asap so it has more releases to bake before 3.5 final.

msg242210 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-04-29 02:51

Hi Steve:

I have limited expertise in most of these areas, but I looked at starunpack40.diff and have these comments:

tests look to have good coverage of the feature (can't speak to coverage of the parser/compiler code)
parsermodule.c changes comprehension handling, but I thought we pulled this out?

There was some code taken common in various places, but there should be no code for unpacking comprehensions left in. Do you have a question about a specific line?

why was dictobject.c.h added? I don't understand clinic thoroughly, but it seems a lot of new code for what was changed in dictobject.c

They weren't added. They were moved by someone else. The only changes are exposing a method.

can the BUILD_(TUPLE|LIST)_UNPACK code in ceval.c share most of the processing? Or is there an unwanted perf impact to that?

Good idea. I'll make that change.

whoever applies the patch should regenerate importlib.h themselves - just a reminder

Thanks!

msg242212 - (view)

Author: Neil Girdhar (NeilGirdhar) *

Date: 2015-04-29 03:16

All tests pass. All reviewer comments addressed. Please let me know if anything else needs to be done from my end.

msg242438 - (view)

Author: Thomas Wouters (twouters) * (Python committer)

Date: 2015-05-02 22:22

The latest patch looks good to me. No need to do the additional AST refactoring if it's going to make PEP 492's implementor's life harder (but I do read Guido's comment as a reason to check this in sooner rather than later :>) So, unless anyone objects I'll check in the latest patch on Monday.

msg242442 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2015-05-03 00:11

Thanks for the review Thomas! And yes, that's what I meant. :-)

msg242446 - (view)

Author: Benjamin Peterson (benjamin.peterson) * (Python committer)

Date: 2015-05-03 01:52

It certainly would be nice to have documentation.

msg242624 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2015-05-05 23:48

Yeah, but the docs don't need to be committed in time for beta 1. The source code should go in ASAP, especially since the PEP 492 changes will have to be merged in on top of them. @Thomas: which Monday were you shooting for? I had hoped yesterday...

On Sat, May 2, 2015 at 6:52 PM, Benjamin Peterson <report@bugs.python.org> wrote:

Benjamin Peterson added the comment:

It certainly would be nice to have documentation.

Python tracker <report@bugs.python.org> <http://bugs.python.org/issue2292>

msg242625 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2015-05-05 23:48

(To clarify, the PEP itself probably serves as enough documentation in the interim.)

On Tue, May 5, 2015 at 4:47 PM, Guido van Rossum <guido@python.org> wrote:

Yeah, but the docs don't need to be committed in time for beta 1. The source code should go in ASAP, especially since the PEP 492 changes will have to be merged in on top of them. @Thomas: which Monday were you shooting for? I had hoped yesterday...

On Sat, May 2, 2015 at 6:52 PM, Benjamin Peterson <report@bugs.python.org> wrote:

Benjamin Peterson added the comment:

It certainly would be nice to have documentation.

Python tracker <report@bugs.python.org> <http://bugs.python.org/issue2292>

-- --Guido van Rossum (python.org/~guido)

msg242626 - (view)

Author: Benjamin Peterson (benjamin.peterson) * (Python committer)

Date: 2015-05-05 23:50

On Tue, May 5, 2015, at 19:48, Guido van Rossum wrote:

Guido van Rossum added the comment:

Yeah, but the docs don't need to be committed in time for beta 1. The source code should go in ASAP, especially since the PEP 492 changes will have to be merged in on top of them. @Thomas: which Monday were you shooting for? I had hoped yesterday...

I suppose I can just do it.

msg242629 - (view)

Author: Benjamin Peterson (benjamin.peterson) * (Python committer)

Date: 2015-05-06 00:17

a65f685ba8c0

msg242631 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2015-05-06 00:26

Thanks Benjamin! On May 5, 2015 5:17 PM, "Benjamin Peterson" <report@bugs.python.org> wrote:

Benjamin Peterson added the comment:

a65f685ba8c0

resolution: -> fixed status: open -> closed

Python tracker <report@bugs.python.org> <http://bugs.python.org/issue2292>

msg242666 - (view)

Author: Thomas Wouters (twouters) * (Python committer)

Date: 2015-05-06 13:15

FYI, I meant last Monday, but I forgot it was May 4th (Dutch Memorial day) and May 5th (Dutch Liberation day), so that got in the way :P

Should we keep this bug open for docs changes, or is there a separate issue for that?

msg242669 - (view)

Author: Benjamin Peterson (benjamin.peterson) * (Python committer)

Date: 2015-05-06 13:20

On Wed, May 6, 2015, at 09:15, Thomas Wouters wrote:

Thomas Wouters added the comment:

FYI, I meant last Monday, but I forgot it was May 4th (Dutch Memorial day) and May 5th (Dutch Liberation day), so that got in the way :P

Should we keep this bug open for docs changes, or is there a separate issue for that?

#24136

msg242721 - (view)

Author: Stefan Behnel (scoder) * (Python committer)

Date: 2015-05-07 18:34

I get a test failure in Cython's compatibility tests which seems to be attributable to this change:

"""
>>> def sideeffect(x):
...     L.append(x)
...     return x
>>> def unhashable(x):
...     L.append(x)
...     return [x]

>>> L = []
>>> {1:2, sideeffect(2): 3, 3: 4, unhashable(4): 5, sideeffect(5): 6}    # doctest: +ELLIPSIS
Traceback (most recent call last):
TypeError: ...unhashable...
>>> L
[2, 4]
"""

Instead, L ends up being [2, 4, 5]. Is this intended? Or acceptable?

msg242723 - (view)

Author: Joshua Landau (Joshua.Landau) *

Date: 2015-05-07 18:40

There is a change as part of this to make dict building more like list and set building, which both have this behaviour.

The same changes have likely occurred before whenever BUILD_LIST and BUILD_SET were introduced, and this behaviour seems particularly undefined.

That said, I did overlook the difference. Hopefully there's agreement that it doesn't matter.

msg242724 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2015-05-07 18:43

I think it's fine. It collects all the keys and values and then calls BUILD_MAP (a new opcode), rather than calling STORE_MAP for each key/value pair. I think this is a reasonable strategy for compiling a dict display.

On Thu, May 7, 2015 at 11:40 AM, Joshua Landau <report@bugs.python.org> wrote:

Joshua Landau added the comment:

There is a change as part of this to make dict building more like list and set building, which both have this behaviour.

The same changes have likely occurred before whenever BUILD_LIST and BUILD_SET were introduced, and this behaviour seems particularly undefined.

That said, I did overlook the difference. Hopefully there's agreement that it doesn't matter.

Python tracker <report@bugs.python.org> <http://bugs.python.org/issue2292>

msg242739 - (view)

Author: Alyssa Coghlan (ncoghlan) * (Python committer)

Date: 2015-05-07 23:04

This could likely stand to be clarified in the language reference, though (as well as in the 3.5 porting notes)

msg248019 - (view)

Author: Yury Selivanov (yselivanov) * (Python committer)

Date: 2015-08-05 04:51

Do we need to make lib2to3 compatible with the new grammar?

msg248025 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2015-08-05 10:03

Yes we should. I'd consider it a bug if it wasn't supported in 3.5.0 and we could fix that bug in 3.5.1.

msg257139 - (view)

Author: Ezio Melotti (ezio.melotti) * (Python committer)

Date: 2015-12-28 22:17

I created #25969 to track the lib2to3 update.

msg261617 - (view)

Author: Terry J. Reedy (terry.reedy) * (Python committer)

Date: 2016-03-11 23:42

the docs don't need to be committed in time for beta 1.

10 months and 2 releases after the code patch, the doc patches in #24136 are incomplete (I believe), uncommitted, untouched since July, and forgotten about. The issue needs more help.

msg287606 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)

Date: 2017-02-11 15:40

Issue26213 was opened for documenting bytecode changes. But 21 months and 3 releases after the code patch the documentation is still not updated.