msg242017 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2015-04-25 15:35 |
Replace all the _sqrt(x) calls with x ** 0.5, improving the visual appearance and providing a modest speed improvement.

    $ python3 -m timeit -s 'from math import sqrt' 'sqrt(3.14)'
    10000000 loops, best of 3: 0.032 usec per loop
    $ python3 -m timeit -s 'from math import sqrt' '3.14 ** 0.5'
    100000000 loops, best of 3: 0.0101 usec per loop
|
|
msg242018 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2015-04-25 15:47 |
sqrt(x) and x ** 0.5 can be different.

    >>> 0.0171889575379941**0.5
    0.13110666473522276
    >>> sqrt(0.0171889575379941)
    0.13110666473522273
|
|
msg242023 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2015-04-25 21:14 |
Hmm, I don't get that same result (Mac OS/X 10.10, Python 3.4.2):

    >>> from math import *
    >>> 0.0171889575379941**0.5
    0.13110666473522273
    >>> sqrt(0.0171889575379941)
    0.13110666473522273

It's odd because your two result as same number, just displayed differently. I thought that wasn't supposed to happen anymore.

    >>> (0.13110666473522276).hex()
    '0x1.0c81a6aa9a74ep-3'
    >>> (0.13110666473522273).hex()
    '0x1.0c81a6aa9a74dp-3'
|
|
msg242025 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2015-04-25 21:51 |
    >>> (0.0171889575379941**0.5).hex()
    '0x1.0c81a6aa9a74ep-3'
    >>> from math import *
    >>> sqrt(0.0171889575379941).hex()
    '0x1.0c81a6aa9a74dp-3'
    >>> 0.0171889575379941**0.5 == sqrt(0.0171889575379941)
    False
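The two hex strings reported in this thread differ only in their last digit. A minimal sketch of how that gap could be measured in units in the last place (ULPs) follows. The helper names `float_to_ordinal` and `ulps_apart` are illustrative and not part of any library, and the two inputs are the hex values quoted in this thread rather than values recomputed on the local platform.

```python
import struct


def float_to_ordinal(x: float) -> int:
    """Map a finite float to an integer so adjacent floats map to adjacent integers.

    Hypothetical helper for illustration only.
    """
    # Reinterpret the 64-bit IEEE 754 pattern as a signed integer.
    (bits,) = struct.unpack("<q", struct.pack("<d", x))
    # Negative floats are stored as sign + magnitude; mirror them so the
    # mapping is monotonic across zero (and -0.0 maps to the same value as 0.0).
    return bits if bits >= 0 else -(bits & 0x7FFFFFFFFFFFFFFF)


def ulps_apart(a: float, b: float) -> int:
    """Number of representable doubles separating a and b (0 if equal)."""
    return abs(float_to_ordinal(a) - float_to_ordinal(b))


# The two results quoted in the thread, taken from their .hex() output.
pow_result = float.fromhex("0x1.0c81a6aa9a74ep-3")
sqrt_result = float.fromhex("0x1.0c81a6aa9a74dp-3")

print(ulps_apart(pow_result, sqrt_result))
```

Starting from the hex strings keeps the example independent of the platform's pow(), which the thread shows can vary between machines.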
|
|
msg242026 - (view) |
Author: Tim Peters (tim.peters) *  |
Date: 2015-04-25 22:04 |
FYI, my results match Serhiy's, on Windows, under Pythons 3.4.2 and 2.7.8.

It's not surprising to me. Since IEEE 754 standardized sqrt, most vendors complied, delivering a square root "as if infinitely precise" with one anally correct rounding. But unless the platform pow() special-cases 0.5, that's going to involve a logarithm, multiplication, and exponentiation under the covers. pow() implementations usually fake some "extra precision" (else the worst-case errors can be horrendous), but it's still not always the same as single-rounding.

Raymond, I didn't understand this part: "It's odd because your two result as same number, just displayed differently." The output immediately following that showed they _are_ different numbers on your box too (the .hex() outputs differ by one in the last place).
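One way to see the single-rounding point is to compare each result against a high-precision reference. The sketch below assumes a platform whose math.sqrt is correctly rounded, as described above. `abs_error` and `naive_pow_half` are hypothetical helpers, and `naive_pow_half` is a deliberately crude log/multiply/exp stand-in, not how any real pow() is implemented.

```python
import math
from decimal import Decimal, localcontext


def abs_error(x: float, approx: float) -> Decimal:
    """Distance between approx and a 60-digit reference value of sqrt(x)."""
    with localcontext() as ctx:
        ctx.prec = 60
        # Decimal(float) converts exactly, so the only rounding here is
        # the 60-digit rounding of the reference square root.
        reference = Decimal(x).sqrt()
        return abs(Decimal(approx) - reference)


def naive_pow_half(x: float) -> float:
    """Crude x ** 0.5 via log, multiply, exp: three roundings, no extra precision."""
    return math.exp(0.5 * math.log(x))


x = 0.0171889575379941
candidates = [
    ("math.sqrt(x)", math.sqrt(x)),
    ("x ** 0.5", x ** 0.5),
    ("exp(0.5 * log(x))", naive_pow_half(x)),
]
for name, value in candidates:
    print(f"{name:18} {value.hex()}  error={abs_error(x, value):.3e}")
```

Whether `x ** 0.5` differs from `math.sqrt(x)` for this input depends on the platform's pow(), as the differing reports in this thread show. What should hold wherever sqrt is correctly rounded is that its error is never the larger of the two.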
|
|
msg242028 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2015-04-25 22:48 |
> The output immediately following that showed they _are_ different numbers
> on your box too

So it is. Seems like I need to increase my font size :-)
|
|
msg242029 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2015-04-25 23:11 |
The next question is whether we care about that 1 ULP in the context of the random number calculations which already involve multi-step chains that aren't favorable to retaining precision:

    s + (1.0 + s * s) ** 0.5

    ainv = (2.0 * alpha - 1.0) ** 0.5
    bbb = alpha - LOG4
    ccc = alpha + ainv

    g2rad = (-2.0 * _log(1.0 - random())) ** 0.5
    z = _cos(x2pi) * g2rad
    self.gauss_next = _sin(x2pi) * g2rad

    stddev = (sqsum/n - avg*avg) ** 0.5
|
|
msg242045 - (view) |
Author: Tim Peters (tim.peters) *  |
Date: 2015-04-26 05:28 |
I don't care about the ULP. I don't care about exactly reproducing floating-point results across releases either, but I'd bet someone else does ;-) |
|
|
msg242057 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2015-04-26 16:47 |
Okay, I give up. |
|
|
msg242063 - (view) |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2015-04-26 17:26 |
Gah! Peephole optimizer! When you do a timeit for '3.14 ** 0.5', you're just evaluating the time to retrieve a constant! In general, `**` is going to be both slower *and* less accurate than math.sqrt. Please don't make this change! |
|
|
msg242065 - (view) |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2015-04-26 17:29 |
Timing results on my machine:

    (Canopy 64bit) taniyama:~ mdickinson$ python3 -m timeit -s "from math import sqrt; x = 3.14" "sqrt(x)"
    10000000 loops, best of 3: 0.0426 usec per loop
    (Canopy 64bit) taniyama:~ mdickinson$ python3 -m timeit -s "from math import sqrt; x = 3.14" "x**0.5"
    10000000 loops, best of 3: 0.0673 usec per loop

And the disassembly showing the peephole optimizer at work:

    >>> def f(): return 3.14**0.5
    ...
    >>> import dis
    >>> dis.dis(f)
      1           0 LOAD_CONST               3 (1.772004514666935)
                  3 RETURN_VALUE
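The same check can be scripted. The sketch below assumes CPython, where constant folding is an implementation detail and opcode names vary between versions, so it only looks for any opcode whose name starts with `BINARY_`. The function names `folded`, `unfolded`, and `has_binary_op` are made up for this example. The timing loop binds the operand to a variable in the setup code so that the power operation is actually executed.

```python
import dis
import timeit


def folded():
    # Both operands are literals, so the compiler can precompute the result.
    return 3.14 ** 0.5


def unfolded(x):
    # The operand is only known at run time, so a real power op is emitted.
    return x ** 0.5


def has_binary_op(func) -> bool:
    """True if the bytecode of func contains any BINARY_* instruction."""
    return any(
        ins.opname.startswith("BINARY_") for ins in dis.get_instructions(func)
    )


print("literal operands, runtime power op:", has_binary_op(folded))
print("variable operand, runtime power op:", has_binary_op(unfolded))

# Time both spellings with the operand bound to a name in the setup code.
setup = "from math import sqrt; x = 3.14"
for stmt in ("sqrt(x)", "x ** 0.5"):
    best = min(timeit.repeat(stmt, setup=setup, number=100_000, repeat=3))
    print(f"{stmt:10} {best / 100_000 * 1e6:.4f} usec per loop")
```

The absolute timings depend on the machine and interpreter version, so no expected numbers are given here.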
|
|
msg242066 - (view) |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2015-04-26 17:31 |
Ah, I missed that the issue was already closed. Apologies for the excitement and the gratuitous exclamation marks in my previous messages. |
|
|
msg242067 - (view) |
Author: Tim Peters (tim.peters) *  |
Date: 2015-04-26 17:33 |
Good catch, Mark! |
|
|