Issue 4264: Patch: optimize code to use LIST_APPEND instead of calling list.append

Created on 2008-11-05 19:30 by novalis_dt, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name                         Uploaded                      Description
tlee-ast-optimize-appends.diff    novalis_dt, 2008-11-05 19:30
tlee-ast-optimize-appends-2.diff  novalis_dt, 2008-11-06 22:29  Revised
tlee-ast-optimize-appends-3.diff  novalis_dt, 2008-11-06 23:27
Messages (10)
msg75525 - (view) Author: David Turner (novalis_dt) Date: 2008-11-05 19:30
This is a patch to Tom Lee's AST optimization branch.

Python bytecode includes at least one operation which is not directly accessible from Python code: LIST_APPEND, which pops a value and a list off the stack and appends the value to the list. It is used when generating code for list comprehensions: [x for x in somelist] generates the following code for the body of the loop:

    ...
    LOAD_FAST      1    # a generated local that holds the new list
    LOAD_FAST      2    # the iteration variable, x
    LIST_APPEND
    ...

Whereas if you were to write the equivalent loop by hand, you would get:

    LOAD_FAST      1
    LOAD_ATTR      0    # an index into the names table: "append"
    LOAD_FAST      2
    CALL_FUNCTION  1    # remove x and the append method from the top of
                        # the stack and push the result of the call
    POP_TOP

This is much slower. In part it's the cost of doing the attribute lookup each time, which is why it's common to see code like:

    a = []
    ap = a.append
    for x in ...:
        ap(x + ...)
    return a

But even then the function call is naturally slower than the simpler LIST_APPEND operation.

The attached patch tries to determine statically whether a local in all circumstances holds a list, and if so, generates LIST_APPEND whenever user code would call local.append. It's not perfect: in particular, I could track local types on a more fine-grained level than per-function. But it's a start.
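For illustration only, the same contrast can be inspected today with the dis module; opcode names and output format vary across CPython versions, and somelist is just a placeholder name:

    import dis

    # The comprehension appends through a dedicated opcode
    # (LIST_APPEND on the versions discussed in this issue).
    dis.dis(compile("[x for x in somelist]", "<demo>", "eval"))

    # The hand-written loop looks up "append" and performs a full
    # call for every element.
    dis.dis(compile("r = []\nfor x in somelist:\n    r.append(x)",
                    "<demo>", "exec"))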
msg75547 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2008-11-06 07:45
Interesting approach. I was surprised to see the change to the AST, but I understand why you did it. I think if the AST optimization approach works out well, we will want to have some more general mechanism to communicate these optimization (or other!) hints to the compiler. It would suck to have to modify the AST each time we wanted to add a new optimization. You and Tom might have better ideas for how best to achieve that.

I'll make some comments that would need to be addressed, but you might want to wait making any changes depending on what approach you decide to take.

The most important missing piece is tests to show that the new code works.

There are various whitespace issues: both whitespace-only changes that should be avoided and indentation inconsistencies with the existing code (or so it appears). The latter could be the way I'm viewing the file or existing problems with tabs.

In Python/optimize.c, you need to handle the case where the PyDict_New() calls fail. It looks like currently an undetected error can happen during construction. And on destruction it will crash because the objects will be NULL when calling Py_DECREF. All calls like PyDict_SetItem(), PyInt_FromLong(), etc. need to handle errors.

I'll need to study the code a lot more to see how well it behaves. Tests would help a lot with that.
msg75584 - (view) Author: David Turner (novalis_dt) Date: 2008-11-06 22:29
Neal Norwitz <nnorwitz@gmail.com> added the comment:

> Interesting approach. I was surprised to see the change to the AST,
> but I understand why you did it. I think if the AST optimization
> approach works out well, we will want to have some more general
> mechanism to communicate these optimization (or other!) hints to the
> compiler. It would suck to have to modify the AST each time we
> wanted to add a new optimization. You and Tom might have better
> ideas for how best to achieve that.

I really didn't want to have to change the AST. The problem was that there was a feature of the Python bytecode which was not representable in Python source code. I don't anticipate future optimizations requiring changes to the AST, because I now believe every important feature of the bytecode is representable in the AST. The one exception might be if we have a need for arbitrary jumps. I don't think that would be useful, but it might conceivably be. In that case, we would need Goto and Label AST nodes.

The thing that struck me as ugly was the idea that there's one object (the optimizer) that knows everything about all optimization operations. This seems like a great case for the visitor pattern. The optimizer ought to have a list of functions to call for each type of AST node, each with some private storage space. But I didn't want to make that kind of dramatic change without consulting with Tom.

> I'll make some comments that would need to be addressed, but you might
> want to wait making any changes depending on what approach you decide
> to take.
>
> The most important missing piece is tests to show that the new code
> works.
>
> There are various whitespace issues. Both whitespace only changes
> that should be avoided and indentation inconsistencies with the
> existing code (or so it appears). The latter could be the way I'm
> viewing the file or existing problems with tabs.

Fixed, I think. The original code appears to be somewhat inconsistent between files in its use of spaces/tabs, but I think I have now matched the style of each file.

> In Python/optimize.c, you need to handle the case where the
> PyDict_New() calls fail. It looks like currently an undetected error
> can happen during construction. And on destruction it will crash
> because the objects will be NULL when calling Py_DECREF.

Fixed.

> All calls like PyDict_SetItem(), PyInt_FromLong(), etc need to handle
> errors.

I'm not exactly sure which "etc" need to handle errors. Py*_As*? Anyway, I changed the ones you mentioned.

> I'll need to study the code a lot more to see how well it behaves.
> Tests would help a lot with that.

I've added a few.
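Purely as a sketch of the per-node-type dispatch described above (the names OptimizerRegistry and fold_not_eq are invented for this example, and the real optimizer works on the C-level AST rather than the ast module; ast.unparse needs a reasonably recent Python), the idea might look like:

    import ast

    class OptimizerRegistry(ast.NodeTransformer):
        """Dispatch table mapping AST node types to optimization passes."""

        def __init__(self):
            self._handlers = {}

        def register(self, node_type, handler):
            self._handlers.setdefault(node_type, []).append(handler)

        def generic_visit(self, node):
            # Visit children first, then give every handler registered
            # for this node type a chance to rewrite the node.
            super().generic_visit(node)
            for handler in self._handlers.get(type(node), []):
                node = handler(node)
            return node

    def fold_not_eq(node):
        """Rewrite `not (a == b)` as `a != b`."""
        if (isinstance(node.op, ast.Not)
                and isinstance(node.operand, ast.Compare)
                and len(node.operand.ops) == 1
                and isinstance(node.operand.ops[0], ast.Eq)):
            node.operand.ops[0] = ast.NotEq()
            return node.operand
        return node

    registry = OptimizerRegistry()
    registry.register(ast.UnaryOp, fold_not_eq)
    tree = registry.visit(ast.parse("flag = not (x == y)"))
    print(ast.unparse(ast.fix_missing_locations(tree)))  # flag = x != y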
msg75585 - (view) Author: David Turner (novalis_dt) Date: 2008-11-06 23:27
Actually, I just noticed a case where the code would do the wrong thing. I fixed it and added a test for it.
msg75587 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-11-07 00:28
> I really didn't want to have to change the AST. The problem was that
> there was a feature of the Python bytecode which was not representable
> in Python source code.

FWIW, I see exposing bytecodes as an anti-pattern that locks in a particular implementation in a way that makes it hard to change and hard to port to other Python implementations. The current bound method approach works fine. Don't really see enough payoff to justify the extra code and maintenance.
msg75598 - (view) Author: Thomas Lee (thomaslee) (Python committer) Date: 2008-11-07 07:34
Neal said:
> I was surprised to see the change to the AST, but I understand why you did it.

The problems David ran into here sound like an argument for arbitrary AST annotations -- an idea that I was toying with around the time "Const" was introduced. That way the optimizer could provide "hints" rather than explicitly manipulating the AST in certain cases. The compiler could then look for those hints and generate code accordingly. For example, in place of the Const node, we might have a "const" annotation that's applied to the top-level expression.

I think I went with the Const node because it was less work, but I don't remember the details. I'll try to dig up the appropriate emails if I haven't lost them.

David said:
> Fixed, I think. The original code appears to be somewhat
> inconsistent between files in its uses of spaces/tabs, ...

Yes, they are inconsistent with tabs/spaces. And it sucks. :)

> The thing that struck me as ugly was the idea that there's one
> object (the optimizer) that knows everything about all
> optimization operations.

That's right, but this is no different from the compiler. Conversely, a visitor pattern would probably work for both the optimizer and the compiler. Having said that, I couldn't justify making the AST optimizer a huge departure from the rest of the code base for the sake of design purity. I'm not even really sure that it's "design purity" when you do things inconsistently from one component to the next.

Obviously if there's another sufficiently good argument for the visitor approach, I'm all ears. But again, if we do that I think we should do it for the compiler as well. I'm not sure how much support such a change would have.
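As a rough Python-level illustration of the annotation idea (the attribute name optimizer_hints is made up for this sketch; the actual branch manipulates the C AST):

    import ast

    tree = ast.parse("x = 1 + 2")
    expr = tree.body[0].value

    # Instead of replacing the BinOp with a dedicated Const node, attach
    # a hint that a later compilation pass can look for. The attribute
    # name "optimizer_hints" is purely hypothetical.
    expr.optimizer_hints = {"const": 3}

    for node in ast.walk(tree):
        hints = getattr(node, "optimizer_hints", None)
        if hints and "const" in hints:
            print(ast.dump(node), "->", hints["const"])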
msg75605 - (view) Author: David Turner (novalis_dt) Date: 2008-11-07 16:25
> FWIW, I see exposing bytecodes as an anti-pattern that locks in a
> particular implementation in a way that makes it hard to change and
> hard to port to other Python implementations. The current bound
> method approach works fine. Don't really see enough payoff to
> justify the extra code and maintenance.

Bytecodes aren't exposed at the language level -- just during compilation. As far as I know, other implementations don't use any of the Python compiler mechanics, so there's no portability problem. If another compiler did use the Python AST, it could just compile the Append node kind to a method call with no loss of speed or clarity. Or it could simply not use this optimization, if we refactor the AST optimizer as I propose to use the strategy pattern.

As for the payoff, microbenchmarks show a big win. Also, appending to a list is a fairly common pattern: our Plone-based codebase shows tens of thousands of instances of append.
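A present-day microbenchmark along those lines might look like the following sketch; absolute numbers depend entirely on the interpreter version and hardware:

    import timeit

    setup = "data = list(range(1000))"
    cases = {
        "r.append(x) in a loop": "r = []\nfor x in data:\n    r.append(x)",
        "cached bound method":   "r = []\nap = r.append\nfor x in data:\n    ap(x)",
        "list comprehension":    "r = [x for x in data]",
    }
    for name, stmt in cases.items():
        seconds = timeit.timeit(stmt, setup=setup, number=10000)
        print("%-22s %.3fs" % (name, seconds))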
msg75607 - (view) Author: David Turner (novalis_dt) Date: 2008-11-07 17:25
> Obviously if there's another sufficiently good argument for the visitor
> approach, I'm all ears. But again, if we do that I think we should do
> it for the compiler as well. I'm not sure how much support such a
> change would have.

The argument is that it would be possible to enable or disable individual optimizations this way. For the compiler, there's no need for this, because there's only one thing to do per node type (although I suppose we could just pass that set of things into the node walker). Another argument against is that it would be harder to combine optimizations when that's relevant. I don't think it's worth worrying about until there are a dozen or so AST-level optimizations.
msg224187 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2014-07-28 21:05
Is this ever likely to be implemented, or is it more likely to be rejected owing to the reasons given above?
msg369169 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2020-05-18 03:44
The advent of the LOAD_METHOD opcode addresses this issue.
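For reference, the effect can be observed by disassembling an append loop; the exact opcodes differ between CPython versions:

    import dis

    def append_loop(data):
        r = []
        for x in data:
            r.append(x)
        return r

    # On CPython 3.7-3.11 the r.append(x) call compiles to LOAD_METHOD
    # followed by CALL_METHOD (or CALL on 3.11), avoiding the creation
    # of a temporary bound method object; 3.12 and later fold the same
    # optimization into a specialized LOAD_ATTR.
    dis.dis(append_loop)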
History
Date User Action Args
2022-04-11 14:56:40 admin set github: 48514
2020-05-18 03:44:50 rhettinger set status: open -> closed; resolution: out of date; messages: +; stage: resolved
2020-05-17 12:48:00 Yonatan Goldschmidt set nosy: + Yonatan Goldschmidt
2019-04-26 20:07:36 BreamoreBoy set nosy: - BreamoreBoy
2014-07-28 21:05:31 BreamoreBoy set nosy: + BreamoreBoy; messages: +; versions: + Python 3.5, - Python 3.3
2010-12-22 03:03:15 r.david.murray set nosy: nnorwitz, rhettinger, thomaslee, novalis_dt; type: enhancement; versions: + Python 3.3
2008-11-07 17:25:28 novalis_dt set messages: +
2008-11-07 16:25:10 novalis_dt set messages: +
2008-11-07 07:34:52 thomaslee set nosy: + thomaslee; messages: +
2008-11-07 00:28:42 rhettinger set nosy: + rhettinger; messages: +
2008-11-06 23:27:59 novalis_dt set files: + tlee-ast-optimize-appends-3.diff; messages: +
2008-11-06 22:29:42 novalis_dt set files: + tlee-ast-optimize-appends-2.diff; messages: +
2008-11-06 07:45:51 nnorwitz set nosy: + nnorwitz; messages: +
2008-11-05 19:30:33 novalis_dt create