Issue 19171: pow() improvement on longs (original) (raw)

Created on 2013-10-05 14:07 by arigo, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
pow_speedup.diff arigo,2013-10-05 14:07 review
pow.diff tim.peters,2013-10-05 21:18 review
Messages (13)
msg198987 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2013-10-05 14:07
The attached patch (which can be applied on both trunk and 2.7) gives a huge speed improvement for the case 'pow(huge_number, smallish_number, smallish_number)'. The improvement is unbounded: I get 20x with 'pow(x, y, z)' with the arguments 'x = 3 ** 10000, y = 10 ** 51 - 2, z = 10 ** 51' but increasing x just increases the factor. This is inspired by https://github.com/pyca/ed25519: check out revision 9f3e838d90ded42a86ec74c5e9f5e37dec8122a0, run it with 'time python -u signfast.py < sign.input'. This patch gives around 14% improvement. So it's a case that occurs in practice.
msg198991 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2013-10-05 17:02
Good idea! The patch looks almost ready to me: the comment block before the code block should be updated, since recomputing `base` is no longer being done _just_ to force `base` to a non-negative value.
msg198995 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2013-10-05 17:58
A bit of history: last time I fiddled that code, I didn't worry about this, because for large enough exponents all internal numbers _eventually_ become less than `base`. But the patch can speed up the _startup_ costs by an arbitrary amount (for smaller exponents it's _all_ "startup costs", while for larger exponents there are 31 multiplications by `base` to precompute a 5-bits-a-time table). Of course there's no problem with correctness here: `base` and `base % modulus` are equivalent in this algorithm.
msg198996 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2013-10-05 17:59
Grr: should be: "all internal numbers _eventually_ become less than `modulus`", not "less than `base`".
msg199003 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2013-10-05 21:18
New patch changes the comments to match the new code.
msg199014 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-10-05 22:14
New changeset f34c59494420 by Tim Peters in branch '3.3': Issue #19171: speed some cases of 3-argument long pow(). http://hg.python.org/cpython/rev/f34c59494420 New changeset 6fcdd1657ee3 by Tim Peters in branch 'default': Issue #19171: speed some cases of 3-argument long pow(). http://hg.python.org/cpython/rev/6fcdd1657ee3 New changeset 101bf827611a by Tim Peters in branch '2.7': Issue #19171: speed some cases of 3-argument long pow(). http://hg.python.org/cpython/rev/101bf827611a
msg199175 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2013-10-08 07:48
Hmm. I thought 2.7 (and 3.3, for that matter) was in bugfix mode only?
msg199176 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2013-10-08 07:51
> I thought 2.7 (and 3.3, for that matter) was in bugfix mode only? It would be crazy to not apply this little fix-up.
msg199177 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2013-10-08 08:13
> It would be crazy to not apply this little fix-up. Crazy? How so? Note that this change, while introducing a performance enhancement in some rather unlikely corner cases, also introduces a performance regression in some other unlikely corner cases: Before the patch (2.7): iwasawa:cpython mdickinson$ ./python.exe -m timeit -s "a=7**10000; b=0; c=23" "pow(a, b, c)" 1000000 loops, best of 3: 0.232 usec per loop After the patch: iwasawa:cpython mdickinson$ ./python.exe -m timeit -s "a=7**10000; b=0; c=23" "pow(a, b, c)" 100000 loops, best of 3: 13.8 usec per loop That can be easily fixed with more special-casing and more code, but I don't think this sort of experimentation is appropriate for a bugfix branch.
msg199217 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2013-10-08 16:21
I'll revert the 2.7 change if people agree that's a good thing. I'm fine with it as-is. Armin pulled the idea from timing a Python public-key crypto project (see the original message in this report), where he found a 14% improvement. I don't care if the trivial exponent == 0 case slows down - that's _truly_ unlikely ;-) The time spent special-casing it would marginally slow down other cases without good reason. For any exponent other than 0, reduction by the base must be done.
msg199230 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2013-10-08 18:30
No need to revert. The improvement seems like a good one; I was just a bit surprised to see it land in the maintenance branches as well as the default branch. My understanding was that minor performance improvements aren't normally candidates for inclusion in 2.7. Maybe Benjamin can clarify the policy here. > I don't care if the trivial exponent == 0 case slows down [...] Sure; I guess my point was that even the simplest change can have unexpected / unintended consequences, which is one of the reasons that it makes sense to me to avoid non-bugfix changes in 2.7 / 3.3. (We're not totally without use-cases for special-casing pow(a, 0, b), by the way: such a pow operation occurs any time you do `hash(Decimal(some_integer))`, though admittedly not with an oversized a.)
msg199235 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2013-10-08 18:47
I'm glad you pointed it out, Mark! You're right about unintended consequences, and I confess I didn't think at all about the exponent == 0 case. I didn't remind myself that 2.7 was a bugfix branch either: I read Armin's "(which can be applied on both trunk and 2.7)" and reflexively checked it in. I'd _say_ I'd be more careful about that in the future, but since I still use 2.7.5 for most of my private work I wouldn't believe me either ;-)
msg199240 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2013-10-08 19:45
In general, we like to touch 2.7 as little as possible. I'm not sure it's worth arguing about this (admittely small) change meets the bar.
History
Date User Action Args
2022-04-11 14:57:51 admin set github: 63370
2013-10-08 19:45:42 benjamin.peterson set messages: +
2013-10-08 18:47:28 tim.peters set messages: +
2013-10-08 18:30:39 mark.dickinson set nosy: + benjamin.petersonmessages: +
2013-10-08 16:21:00 tim.peters set messages: +
2013-10-08 08:13:42 mark.dickinson set messages: +
2013-10-08 07:51:35 rhettinger set nosy: + rhettingermessages: +
2013-10-08 07:48:34 mark.dickinson set messages: +
2013-10-05 22:16:15 tim.peters set status: open -> closedresolution: fixedstage: patch review -> resolved
2013-10-05 22:14:43 python-dev set nosy: + python-devmessages: +
2013-10-05 21🔞44 tim.peters set files: + pow.diffmessages: +
2013-10-05 17:59:47 tim.peters set messages: +
2013-10-05 17:58:43 tim.peters set messages: +
2013-10-05 17:25:09 pitrou set type: performanceversions: - Python 2.7
2013-10-05 17:02:03 tim.peters set nosy: + tim.petersmessages: + stage: patch review
2013-10-05 15:58:46 benjamin.peterson set nosy: + mark.dickinson
2013-10-05 14:07:17 arigo create