Issue 33325: Optimize sequences of constants in the compiler (original) (raw)
The following PR makes three optimizations in the compiler.
- A sequence of several LOAD_CONSTs is replaced with a single LOAD_CONST followed by UNPACK_SEQUENCE.
For example, "{'a': 1, 'b': 2, 'c': 3}" is currently compiled to
1 0 LOAD_CONST 0 (1) 2 LOAD_CONST 1 (2) 4 LOAD_CONST 2 (3) 6 LOAD_CONST 3 (('a', 'b', 'c')) 8 BUILD_CONST_KEY_MAP 3 10 POP_TOP 12 LOAD_CONST 4 (None) 14 RETURN_VALUE
With this optimization it will be compiled to:
1 0 LOAD_CONST 5 ((('a', 'b', 'c'), 3, 2, 1)) 2 UNPACK_SEQUENCE 4 4 BUILD_CONST_KEY_MAP 3 6 POP_TOP 8 LOAD_CONST 4 (None) 10 RETURN_VALUE
- Optimized building lists and sets of constants. [1, 2, 3, 4, 5] will be compiled to [*(1, 2, 3, 4, 5)], and {1, 2, 3, 4, 5} will be compiled to {*frozenset(1, 2, 3, 4, 5)}, where (1, 2, 3, 4, 5) and frozenset(1, 2, 3, 4, 5) are just constants.
x = [1, 2, 3, 4, 5] y = {1, 2, 3, 4, 5}
currently is compiled to
1 0 LOAD_CONST 0 (1) 2 LOAD_CONST 1 (2) 4 LOAD_CONST 2 (3) 6 LOAD_CONST 3 (4) 8 LOAD_CONST 4 (5) 10 BUILD_LIST 5 12 STORE_NAME 0 (x)
2 14 LOAD_CONST 0 (1) 16 LOAD_CONST 1 (2) 18 LOAD_CONST 2 (3) 20 LOAD_CONST 3 (4) 22 LOAD_CONST 4 (5) 24 BUILD_SET 5 26 STORE_NAME 1 (y) 28 LOAD_CONST 5 (None) 30 RETURN_VALUE
With optimization 1 it will be compiled to
1 0 LOAD_CONST 6 ((5, 4, 3, 2, 1)) 2 UNPACK_SEQUENCE 5 4 BUILD_LIST 5 6 STORE_NAME 0 (x)
2 8 LOAD_CONST 6 ((5, 4, 3, 2, 1)) 10 UNPACK_SEQUENCE 5 12 BUILD_SET 5 14 STORE_NAME 1 (y) 16 LOAD_CONST 5 (None) 18 RETURN_VALUE
And with optimization 2 it will be compiled to
1 0 LOAD_CONST 0 ((1, 2, 3, 4, 5)) 2 BUILD_LIST_UNPACK 1 4 STORE_NAME 0 (x)
2 6 LOAD_CONST 1 (frozenset({1, 2, 3, 4, 5})) 8 BUILD_SET_UNPACK 1 10 STORE_NAME 1 (y) 12 LOAD_CONST 2 (None) 14 RETURN_VALUE
- Remove unused constants.
After folding tuples of constants created at code generation level, eliminating unreachable code, and after the above two optimizations, unused constants are left in the co_consts tuple. The third optimization removes them and reenumerate constants in the order of occurrence. The above example will be compiled to:
1 0 LOAD_CONST 0 ((1, 2, 3, 4, 5)) 2 BUILD_LIST_UNPACK 1 4 STORE_NAME 0 (x)
2 6 LOAD_CONST 1 (frozenset({1, 2, 3, 4, 5})) 8 BUILD_SET_UNPACK 1 10 STORE_NAME 1 (y) 12 LOAD_CONST 2 (None) 14 RETURN_VALUE
See for the implementation of this optimization on the level of raw bytecode (in peephole.c).
These optimizations are useful not only for initializing collections of constants. Calling function with constant arguments "f(x, a=1, b=2)":
Current: 1 0 LOAD_NAME 0 (f) 2 LOAD_NAME 1 (x) 4 LOAD_CONST 0 (None) 6 LOAD_CONST 1 (1) 8 LOAD_CONST 2 (2) 10 LOAD_CONST 3 (('a', 'b')) 12 CALL_FUNCTION_KW 4 14 POP_TOP 16 LOAD_CONST 0 (None) 18 RETURN_VALUE
Optimized: 1 0 LOAD_NAME 0 (f) 2 LOAD_NAME 1 (x) 4 LOAD_CONST 0 ((('a', 'b'), 2, 1, None)) 6 UNPACK_SEQUENCE 4 8 CALL_FUNCTION_KW 4 10 POP_TOP 12 LOAD_CONST 1 (None) 14 RETURN_VALUE
This issue depends on .