Issue 16001: small ints: cache string representation

Created on 2012-09-22 00:11 by vstinner, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (9)

msg170938

Author: STINNER Victor (vstinner) * (Python committer)

Date: 2012-09-22 00:11

Integers in the range [-5; 256] are singletons. It is possible to cache their string representation (in base 10) to make str(int) faster, but also to reduce memory consumption (and memory fragmentation; Unicode strings don't use a free list).

The attached patch implements this micro-optimization. It reuses the latin1 character singletons for the digits 0..9: str(0) is chr(48).
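As an illustration only, here is a rough pure-Python sketch of the caching idea (the actual patch is C code inside CPython; the _SMALL_INT_STR table and fast_str() name are hypothetical):

    # Hypothetical pure-Python sketch; the real patch lives in CPython's C code.
    # Precompute the decimal string of every cached small int and hand out the
    # cached object instead of converting on every call.
    _SMALL_INT_STR = {i: str(i) for i in range(-5, 257)}

    def fast_str(n):
        """Return str(n), reusing a cached string for small ints."""
        if type(n) is int and -5 <= n <= 256:
            return _SMALL_INT_STR[n]
        return str(n)

The point is the same as described above: repeated str() calls on the same small int return one shared string object instead of allocating a new one each time.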

This change is required because Py_TYPE(val)->tp_str(val) may return a singleton, whereas formatlong() requires a "mutable" string (a string that is not shared yet, so it can still be modified in place).

See also issue #10044.

msg170943

Author: Antoine Pitrou (pitrou) * (Python committer)

Date: 2012-09-22 00:46

Honestly, I'm not sure I understand the point of this optimization. Have you identified a workload where this would help?

msg170947

Author: Martin v. Löwis (loewis) * (Python committer)

Date: 2012-09-22 01:13

Also, can you report benchmark results?

msg171202

Author: STINNER Victor (vstinner) * (Python committer)

Date: 2012-09-24 23:25

Here is a micro benchmark:

run it using:

benchmark.py script bench_int_str.py [--file=output]

https://bitbucket.org/haypo/misc/src/tip/python/benchmark.py

    def run_benchmark(bench):
        bench.timeit('S(123)', setup='S=str')
        bench.timeit('S(1) == S(2)', setup='S=str')
        bench.timeit('S(12345)', setup='S=str')
        bench.timeit('{S(x): x for x in data}', setup='data=tuple(range(100)); S=str')
        bench.timeit('"x=%s" % x', setup='x=123')
        bench.timeit('"x=%s" % x', setup='x=12345')

Output:

    --------------------------------------------------------+-------------+----------------
    Tests                                                    |   unpatched |         patched
    --------------------------------------------------------+-------------+----------------
    S=str; S(123)                                            |  158 ns (*) |  112 ns (-29%)
    S=str; S(1) == S(2)                                      |  329 ns (*) |  248 ns (-25%)
    S=str; S(12345)                                          |  161 ns (*) |  161 ns
    data=tuple(range(100)); S=str; {S(x): x for x in data}   |   23 us (*) | 16.5 us (-28%)
    x=123; "x=%s" % x                                        |  145 ns (*) |  133 ns (-8%)
    x=12345; "x=%s" % x                                      |  149 ns (*) |  145 ns
    --------------------------------------------------------+-------------+----------------
    Total                                                    | 23.9 us (*) | 17.3 us (-27%)
    --------------------------------------------------------+-------------+----------------

I expected a more significant speedup.
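For anyone who does not have haypo's benchmark.py at hand, a rough equivalent using only the standard timeit module might look like this (illustrative only; absolute numbers will differ from the table above):

    import timeit

    CASES = [
        ("S(123)", "S=str"),
        ("S(1) == S(2)", "S=str"),
        ("S(12345)", "S=str"),
        ("{S(x): x for x in data}", "data=tuple(range(100)); S=str"),
        ('"x=%s" % x', "x=123"),
        ('"x=%s" % x', "x=12345"),
    ]

    N = 100000
    for stmt, setup in CASES:
        # Best of three runs, reported in nanoseconds per statement.
        best = min(timeit.repeat(stmt, setup=setup, number=N, repeat=3))
        print("%-55s %6.0f ns" % ("%s; %s" % (setup, stmt), best / N * 1e9))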

msg171203

Author: STINNER Victor (vstinner) * (Python committer)

Date: 2012-09-24 23:25

My initial idea was to cache str(int) for any integer, but it may waste memory for huge numbers.

msg171204

Author: STINNER Victor (vstinner) * (Python committer)

Date: 2012-09-24 23:28

Oops, I forgot to attach the new patch: small_ints_cache_str-2.patch also optimizes str % args (the string is copied when needed, i.e. when its reference count is not 1).

msg171211

Author: Martin v. Löwis (loewis) * (Python committer)

Date: 2012-09-25 03:12

-1 for this entire effort.

msg171321

Author: Terry J. Reedy (terry.reedy) * (Python committer)

Date: 2012-09-25 21:58

Small int caching saves both time and space. On a nearly fresh IDLE session:

    >>> sys.getrefcount(0)
    772
    >>> sys.getrefcount(1)
    854
    >>> sum(sys.getrefcount(i) for i in range(-5, 257))
    4878

While this is an interesting idea, I do not see the same gain here, and I agree with Martin.

Array lookup is faster than string conversion:

    >>> ti.repeat(setup = "ar = [str(i) for i in range(101)]", stmt = "ar[100]")
    [0.058166605132757995, 0.03438449234832762, 0.034402937150259674]
    >>> ti.repeat(setup = "S = str", stmt = 'S(100)')
    [0.21833603908330446, 0.19469564386039195, 0.1947128590088596]

But:

  1. Converting ints to decimal digits is nearly always done for output, and conversion is blazingly fast compared to output, so output time will dominate:

    >>> ti.repeat(setup = "S = str", stmt = 'S(100)', number = 20)
    [1.0144641009901534e-05, 8.914987631669646e-06, 8.914987574826228e-06]
    >>> ti.repeat(setup = "p = print", stmt = 'p(100)', number = 20)
    ...
    [0.11873041968999587, 0.039060557051357137, 0.03859697769621562]

  2. I presume the conversion of 0 - 9 to '0' - '9' within the conversion routines is already optimized. I don't see that 10 - 256 should be more common than 257 - 999, let alone more common than all higher ints. So the limited optimization can have only limited effect.

  3. Much production numerical output is float or decimal rather than int. The 3.3 optimization of ascii-only strings to bytes helped here.

msg171575

Author: STINNER Victor (vstinner) * (Python committer)

Date: 2012-09-29 16:14

str(int) is less common than "%s" % int or "%i" % int, and str % args now has a "fast path" for integers in Python 3.3. So I agree that my patch adds useless complexity, and I'm closing this issue. Thanks.
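(Illustrative side note, not part of the original discussion: a quick way to compare the two formatting paths on a given build with the standard timeit module:)

    import timeit

    # Compare str(int) against %-formatting of the same small int.
    for stmt in ["str(x)", '"%s" % x', '"%d" % x']:
        t = timeit.timeit(stmt, setup="x=123", number=1000000)
        print("%-12s %.3f s per 1e6 calls" % (stmt, t))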

History

Date                 User         Action  Args
-------------------  -----------  ------  ------------------------------------------------
2022-04-11 14:57:36  admin        set     github: 60205
2012-09-29 16:14:28  vstinner     set     status: open -> closed; resolution: rejected; messages: +
2012-09-28 08:51:37  jcea         set     nosy: + jcea
2012-09-25 21:58:41  terry.reedy  set     nosy: + terry.reedy; messages: +
2012-09-25 03:12:12  loewis       set     messages: +
2012-09-24 23:28:31  vstinner     set     files: + small_ints_cache_str-2.patch; messages: +
2012-09-24 23:25:56  vstinner     set     messages: +
2012-09-24 23:25:20  vstinner     set     files: + bench_int_str.py; messages: +
2012-09-22 01:13:44  loewis       set     nosy: + loewis; messages: +
2012-09-22 00:46:41  pitrou       set     nosy: + mark.dickinson; messages: +
2012-09-22 00:12:01  vstinner     set     nosy: - ezio.melotti
2012-09-22 00:11:49  vstinner     set     nosy: + serhiy.storchaka
2012-09-22 00:11:19  vstinner     create