msg265852 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-05-19 12:02 |
According to Linux perf, the unpickle_list benchmark (of the CPython benchmark suite) heavily depends on the performance of load() and _Unpickler_Read() functions. While running benchmarks with PGO+LTO compilation, I noticed a difference around 5% because one build uses constant propation on _Unpickler_Read() (fast), wheras another build doesn't (slow). Depending on the result of the PGO, depending on the OS and the compiler, you may or may not get this nice optimization. I propose to implement it manually to not depend on the compiler. Attached short patch implements manually the optimization. It looks to have a big impact on unpickle_list, but no impact (benchmark is not significant) on fastunpickle and pickle_list: --- $ taskset -c 3 python3 perf.py ../ref_default/rel/python ../default/rel/python -b unpickle_list,fastunpickle,pickle_list --rigorous -v (...) Report on Linux smithers 4.4.9-300.fc23.x86_64 #1 SMP Wed May 4 23:56:27 UTC 2016 x86_64 x86_64 Total CPU cores: 4 ### fastunpickle ### Avg: 0.527359 +/- 0.005932 -> 0.518548 +/- 0.00953: 1.02x faster Not significant ### pickle_list ### Avg: 0.269307 +/- 0.017465 -> 0.266015 +/- 0.00423: 1.01x faster Not significant ### unpickle_list ### Avg: 0.255805 +/- 0.006942 -> 0.231444 +/- 0.00394: 1.11x faster Significant (t=21.58) --- It would be interesting to also evaluate the computed goto optimization for the load() function. (And also try computed goto for the re/_sre module, but that's a different issue.) I tuned my system and patched perf.py (of the CPython benchmark suite) to get stable benchmark results. |
|
|
msg265858 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2016-05-19 13:38 |
+1. Similar optimization is used in marshal. Added comments on Rietveld. |
|
|
msg265920 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-05-20 09:26 |
Updated patch 2 to address Serhiy's comments. |
|
|
msg265922 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2016-05-20 09:37 |
I forgot to note that the comment before _Unpickler_Read() becomes not correct, but you already addressed this in the second patch. unpickle_read-2.patch LGTM. Few months ago I tried to optimize reading in pickle, but didn't see significant benefit. After committing your patch I'm going to revive my patch. |
|
|
msg265925 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2016-05-20 09:44 |
New changeset f9b85b47f9c8 by Victor Stinner in branch 'default': Optimize pickle.load() and pickle.loads() https://hg.python.org/cpython/rev/f9b85b47f9c8 |
|
|
msg265926 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-05-20 09:45 |
Serhiy Storchaka: "I forgot to note that the comment before _Unpickler_Read() becomes not correct, but you already addressed this in the second patch. unpickle_read-2.patch LGTM." Cool, pushed. "Few months ago I tried to optimize reading in pickle, but didn't see significant benefit. After committing your patch I'm going to revive my patch." Cool too, keep me in touch. |
|
|
msg265946 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2016-05-20 14:07 |
I think that integer overflow in _Unpickler_Read() is possible. n is read from file and can be arbitrary (up to PY_SSIZE_T_MAX). This likely cause raising an exception later, but integer overflow itself causes undefined behavior, and we should avoid it. |
|
|
msg265952 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2016-05-20 19:17 |
New changeset 3d7b7aa89437 by Victor Stinner in branch 'default': Issue #27056: Fix _Unpickler_Read() to avoid integer overflow https://hg.python.org/cpython/rev/3d7b7aa89437 |
|
|
msg265953 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-05-20 19:19 |
Serhiy Storchaka: > I think that integer overflow in _Unpickler_Read() is possible. n is read from file and can be arbitrary (up to PY_SSIZE_T_MAX). This likely cause raising an exception later, but integer overflow itself causes undefined behavior, and we should avoid it. Hum, I understood that it's ok since numbers should be signed, but in fact I'm not confident that n is always signed. You are right, it's better to use your code to avoid the integer overflow. I pushed a fix. |
|
|