[Python-Dev] 64 bit units in PyLong (original) (raw)
Gregory P. Smith greg at krypto.org
Wed Jul 5 16:18:05 EDT 2017
- Previous message (by thread): [Python-Dev] 64 bit units in PyLong
- Next message (by thread): [Python-Dev] Remove own implementation for thread-local storage
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Wed, Jul 5, 2017 at 12:05 PM Mark Dickinson <dickinsm at gmail.com> wrote:
On Mon, Jul 3, 2017 at 5:52 AM, Siyuan Ren <netheril96 at gmail.com> wrote:

> The current PyLong implementation represents arbitrary precision integers
> in units of 15 or 30 bits. I presume the purpose is to avoid overflow in
> addition, subtraction and multiplication. But compilers these days offer
> intrinsics that allow one to access the overflow flag, and to obtain the
> result of 64-bit multiplication as a 128-bit number. Or at least on
> x86-64, which is the dominant platform. Any reason why it is not done?
Portability matters, so any use of these intrinsics would likely also have to be accompanied by fallback code that doesn't depend on them, as well as some build-system complexity to figure out whether those intrinsics are supported or not. And then Objects/longobject.c would suffer in terms of simplicity and readability, so there would have to be some clear gains to offset that.

Note that the typical Python workload does not involve thousand-digit integers: what would matter is the performance of smaller integers. It seems conceivable that 64-bit limbs would speed up those operations simply because so many more integers would become single-limb, so there would be more opportunities to take fast paths, but there would need to be benchmarks demonstrating that.

Oh, and you'd have to rewrite the power algorithm, which currently depends on the size of a limb in bits being a multiple of 5. :-)

-- Mark
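Mark's "intrinsics plus fallback" point can be sketched in C. This is an illustration only, not CPython source: the function name `add_would_overflow` is made up, and the `__builtin_add_overflow` path assumes a GCC/Clang-family compiler, with a portable check otherwise.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative sketch (not CPython code): overflow-checked addition
 * using the GCC/Clang intrinsic when available, with a portable
 * fallback for other compilers. This is exactly the kind of dual-path
 * code that adds build and maintenance complexity. */
static int add_would_overflow(int64_t a, int64_t b, int64_t *out)
{
#if defined(__GNUC__) || defined(__clang__)
    /* Returns nonzero if the addition overflowed. */
    return __builtin_add_overflow(a, b, out);
#else
    /* Portable fallback: check bounds before adding. */
    if ((b > 0 && a > INT64_MAX - b) || (b < 0 && a < INT64_MIN - b))
        return 1;
    *out = a + b;
    return 0;
#endif
}
```

Both paths must be kept correct and tested, which is part of the readability cost Mark describes.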
When I pushed to get us to adopt 30-bit digits instead of 15-bit digits, I hoped it could also happen on 32-bit x86 builds, as the hardware has fast support for a 32-bit multiply with a 64-bit result. But it made the configuration more difficult and, IIRC, would've increased the memory use of the PyLong type for single-digit numbers on 32-bit platforms, so we settled on moving to 30 bits only for 64-bit platforms (which were obviously going to become the norm).
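The arithmetic behind the 30-bit choice can be sketched as follows: two 30-bit digits multiply into at most 60 bits, which fits in a uint64_t with headroom for carries. The macro and function names here are illustrative, not CPython's actual macros.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative names, not the real CPython macros. */
#define PYLONG_SHIFT 30
#define PYLONG_MASK  ((1U << PYLONG_SHIFT) - 1)

/* One step of schoolbook multiplication on 30-bit digits: the
 * product of two 30-bit values is at most 60 bits, so adding a
 * carry still fits comfortably in a 64-bit accumulator. */
static uint32_t digit_mul_step(uint32_t a, uint32_t b, uint32_t carry_in,
                               uint32_t *low_digit)
{
    uint64_t t = (uint64_t)a * b + carry_in;
    *low_digit = (uint32_t)(t & PYLONG_MASK);   /* low 30 bits */
    return (uint32_t)(t >> PYLONG_SHIFT);       /* carry out */
}
```

On 32-bit x86 this maps onto the hardware's 32x32→64 multiply, which is why extending 30-bit digits to 32-bit builds was tempting despite the configuration cost.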
A reasonable digit size for hardware that supports 128-bit results of 64-bit multiplies would be 60 bits (to keep our multiple-of-5-bits logic working). But I doubt you will see any notable performance gains in practical applications by doing so. Sure, numbers in the 1-4 billion range are slightly less efficient today, but those are not likely to appear in hot loops in a typical application. Microbenchmarks alone should not be used to make this decision.
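A hypothetical 60-bit digit would follow the same pattern, relying on the nonstandard `unsigned __int128` type that GCC and Clang provide on x86-64: two 60-bit digits multiply into at most 120 bits. Again, a hedged sketch with made-up names, not a proposed implementation.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__SIZEOF_INT128__)
/* Hypothetical 60-bit digit layout (illustrative names only). */
#define DIGIT60_SHIFT 60
#define DIGIT60_MASK  ((UINT64_C(1) << DIGIT60_SHIFT) - 1)

/* One multiply step on 60-bit digits: the 120-bit product plus
 * carry fits in the compiler-provided unsigned __int128. */
static uint64_t digit60_mul_step(uint64_t a, uint64_t b, uint64_t carry_in,
                                 uint64_t *low_digit)
{
    unsigned __int128 t = (unsigned __int128)a * b + carry_in;
    *low_digit = (uint64_t)(t & DIGIT60_MASK);
    return (uint64_t)(t >> DIGIT60_SHIFT);
}
#endif
```

The `#if` guard is itself a small instance of the portability cost Mark mentions: `__int128` is not part of standard C, so a real change would still need a fallback path for compilers without it.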
Remember: we had native integer support in Python 1 and 2 via the old PyInt type. In Python 3 we ditched that in favor of PyLong everywhere. This was a performance hit for the sake of proper high-level language simplicity. We've since regained all of that in other areas of the interpreter.
-gps