[Python-Dev] Inplace operations for PyLong objects
Terry Reedy tjreedy at udel.edu
Thu Aug 31 17:24:50 EDT 2017
On 8/31/2017 2:40 PM, Manciu, Catalin Gabriel wrote:
Hi everyone,
While looking over the PyLong source code in Objects/longobject.c I came across the fact that the PyLong object doesn't include implementations for basic inplace operations such as addition or multiplication:

    [...]
    long_long,                  /*nb_int*/
    0,                          /*nb_reserved*/
    long_float,                 /*nb_float*/
    0,                          /* nb_inplace_add */
    0,                          /* nb_inplace_subtract */
    0,                          /* nb_inplace_multiply */
    0,                          /* nb_inplace_remainder */
    [...]

While I understand that the immutable nature of this type of object justifies this approach, I wanted to experiment and see how much performance an inplace add would bring.

My inplace add will revert to calling the default long_add function when:
- the refcount of the first operand indicates that it's being shared
or
- that operand is one of the preallocated 'small ints'
which should mitigate the effects of not conforming to the PyLong immutability specification. It also allocates a new PyLong only in case of a potential overflow.

The workload I used to evaluate this is a simple script that does a lot of inplace adding:

    import time
    import sys

    def write_progress(prev_percentage, value, limit):
        percentage = (100 * value) // limit
        if percentage != prev_percentage:
            sys.stdout.write("%d%%\r" % (percentage))
            sys.stdout.flush()
        return percentage

    progress = -1
    the_value = 0
    the_increment = ((1 << 30) - 1)
    crt_iter = 0
    total_iters = 10 ** 9

    start = time.time()
    while crt_iter < total_iters:
        the_value += the_increment
        crt_iter += 1
        progress = write_progress(progress, crt_iter, total_iters)
    end = time.time()

    print ("\n%.3fs" % (end - start))
    print ("the_value: %d" % (the_value))

Running the baseline version outputs:

    ./python inplace.py
    100%
    356.633s
    the_value: 1073741823000000000

Running the modified version outputs:

    ./python inplace.py
    100%
    308.606s
    the_value: 1073741823000000000

In summary, I got a +13.47% improvement for the modified version.

The CPython revision I'm using is 7f066844a79ea201a28b9555baf4bceded90484f from the master branch, and I'm running on an i7-6700K CPU with Turbo Boost disabled (frequency is pinned at 4 GHz).

Do you think that such an optimization would be a good approach?
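To make the refcount condition above concrete, here is a minimal Python-level sketch of the behaviour any inplace add has to preserve. It does not touch the C implementation; the function name aliased_increment is made up for illustration. When two names refer to the same int object, `+=` on one of them must rebind that name rather than change the shared object.

```python
def aliased_increment():
    # Hypothetical helper, for illustration only.
    a = 10 ** 20   # a large value, well outside the range of cached small ints
    b = a          # second reference: the object is now shared
    a += 1         # must produce a new value for `a` and leave `b` untouched
    return a, b

a, b = aliased_increment()
print(a)  # the incremented value
print(b)  # still the original value
```

If the object were mutated while `b` still referred to it, `b` would change as well, which is why a shared first operand has to fall back to the ordinary add.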
On my machine, the more realistic code, with an implicit C loop,

    the_value = sum(the_increment for i in range(total_iters))

gives the same value twice as fast as your explicit Python loop. (I cut total_iters down to 10**7.)
You might check whether sum uses an in-place accumulator for ints.
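A rough way to repeat this comparison is sketched below. The helper names explicit_loop, implicit_loop and timed are made up for illustration, the iteration count is reduced further so it finishes quickly, and the explicit loop omits the progress reporting of the original script, so the ratio will not match the numbers quoted above. Timings depend on the machine and the Python build.

```python
import time

def explicit_loop(increment, iters):
    # Python-level while loop using += on an int accumulator.
    value = 0
    i = 0
    while i < iters:
        value += increment
        i += 1
    return value

def implicit_loop(increment, iters):
    # The loop and the accumulation happen inside sum(), which is written in C.
    return sum(increment for _ in range(iters))

def timed(func, *args):
    # Return the function result together with the elapsed wall-clock seconds.
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start

the_increment = (1 << 30) - 1
total_iters = 10 ** 5  # assumption: kept small so the sketch runs in well under a second

loop_value, loop_secs = timed(explicit_loop, the_increment, total_iters)
sum_value, sum_secs = timed(implicit_loop, the_increment, total_iters)

print("explicit loop: %.4fs" % loop_secs)
print("sum():         %.4fs" % sum_secs)
print("same value:    %s" % (loop_value == sum_value))
```

Both versions should compute the same total; only the elapsed times are expected to differ.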
-- Terry Jan Reedy