bpo-32494: Use gdbm_count for dbm_length if possible by corona10 · Pull Request #19814 · python/cpython
With this PR, dbm_length uses gdbm_count when it is available, without exporting a new public API.
I ran benchmarks and they show a noticeable performance improvement.
The difference can be measured by invalidating the cached length, which forces len() to recompute it.
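To illustrate why the cache has to be invalidated before the difference shows up, here is a toy pure-Python model. It is not the real `_gdbm` code: the `CachedLengthStore` class and its `recounts` counter are hypothetical, and the slow path is assumed to be a full walk over the keys. Only the "negative size means the cache is invalid" check mirrors the `dp->di_size < 0` test from the C source.

```python
class CachedLengthStore:
    """Toy model (hypothetical, not the real _gdbm code) of a cached length."""

    def __init__(self):
        self._data = {}
        self._size = -1      # negative means "cached length is invalid"
        self.recounts = 0    # how many times the slow path ran

    def __setitem__(self, key, value):
        self._data[key] = value
        self._size = -1      # any write invalidates the cached length

    def __len__(self):
        if self._size < 0:
            # Slow path: walk every key (assumed stand-in for a full traversal).
            self._size = sum(1 for _ in self._data)
            self.recounts += 1
        return self._size


kv = CachedLengthStore()
for i in range(1000):
    kv[f"key-{i}"] = f"value-{i}"

# Two reads in a row: only the first one has to recount.
len(kv)
len(kv)
recounts_without_write = kv.recounts

# Read, then write: each write forces the next read onto the slow path.
for _ in range(3):
    n = len(kv)
    kv[f"key-{n}"] = f"value-{n}"
```

In this model, repeated `len(kv)` calls with no writes in between only pay for the first count, so a benchmark that never invalidates the cache would mostly time the cached read rather than the counting itself.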
Benchmark 1
Call len(kv) after storing a new value, which invalidates the cache.
+-----------+------------------+------------------------------+
| Benchmark | bpo-32494-master | bpo-32494-proposed |
+===========+==================+==============================+
| bpo-32494 | 262 us | 42.2 us: 6.20x faster (-84%) |
+-----------+------------------+------------------------------+
import pyperf
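
runner = pyperf.Runner()
# Each timed iteration reads the length, then stores a new key so that
# the next len(kv) call cannot use the cached value.
runner.timeit(
    name="bpo-32494",
    stmt="""
ret = len(kv)
kv[f'key-{ret}'] = f'value-{ret}'
""",
    # Setup: create a GDBM database pre-populated with 1000 entries.
    setup="""
import dbm.gnu as gdbm
from test.support import TESTFN
kv = gdbm.open(TESTFN, 'c')
for i in range(1000):
    kv[f'key-{i}'] = f'value-{i}'
""",
)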
Benchmark 2
Disable the caching code path so that every len(kv) call recomputes the length, without having to store a new key/value pair.
- if (dp->di_size < 0) {
+ if (1) {
+-----------+--------------------+-------------------------------+
| Benchmark | bpo-32494-master-1 | bpo-32494-proposed-1 |
+===========+====================+===============================+
| bpo-32494 | 109 us | 590 ns: 185.32x faster (-99%) |
+-----------+--------------------+-------------------------------+
import pyperf
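
runner = pyperf.Runner()
# With the cache check patched out, timing len(kv) alone is enough.
runner.timeit(
    name="bpo-32494",
    stmt="""
ret = len(kv)
""",
    # Setup: create a GDBM database pre-populated with 1000 entries.
    setup="""
import dbm.gnu as gdbm
from test.support import TESTFN
kv = gdbm.open(TESTFN, 'c')
for i in range(1000):
    kv[f'key-{i}'] = f'value-{i}'
""",
)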