[WIP] gh-129813, PEP 782: Add PyBytesWriter C API by vstinner · Pull Request #131681 · python/cpython
Microbenchmark on the `PyBytes_FromFormat()` and `PyBytes_DecodeEscape()` functions:

    import pyperf
    runner = pyperf.Runner()

    import ctypes
    from ctypes import pythonapi, py_object
    from ctypes import (
        c_int, c_uint, c_long, c_ulong,
        c_size_t, c_ssize_t, c_char_p)

    PyBytes_FromFormat = pythonapi.PyBytes_FromFormat
    PyBytes_FromFormat.argtypes = (c_char_p,)
    PyBytes_FromFormat.restype = py_object

    PyBytes_DecodeEscape = pythonapi.PyBytes_DecodeEscape
    PyBytes_DecodeEscape.argtypes = (c_char_p, c_size_t, c_char_p, c_size_t, c_char_p)
    PyBytes_DecodeEscape.restype = py_object

    runner.bench_func('Format hello world',
                      PyBytes_FromFormat, b'Hello %s !', b'world')
    fmt = (b'Hell%c' + b' ' * 1024 + b' %s')
    runner.bench_func('Format long format',
                      PyBytes_FromFormat, fmt, c_int(ord('o')), b'world')

    # Backslashes are doubled so that the input contains literal escape
    # sequences for PyBytes_DecodeEscape() to decode (the benchmark name
    # in the results table also shows a literal \x40).
    s = b'abc\\ndef\\x40.'
    runner.bench_func('Decode simple',
                      PyBytes_DecodeEscape, s, len(s), None, 0, b'unused')
    s = b'x' * 1024
    runner.bench_func('Decode long copy',
                      PyBytes_DecodeEscape, s, len(s), None, 0, b'unused')
    s = b'\\x40' * 1024
    runner.bench_func('Decode long \\x40',
                      PyBytes_DecodeEscape, s, len(s), None, 0, b'unused')

Results:
| Benchmark | ref | pep782 |
|---|---|---|
| Format long format | 1.06 us | 1.04 us: 1.02x faster |
| Decode simple | 776 ns | 743 ns: 1.04x faster |
| Decode long copy | 1.38 us | 1.34 us: 1.03x faster |
| Decode long \x40 | 2.70 us | 2.67 us: 1.01x faster |
| Geometric mean | (ref) | 1.02x faster |
Benchmark hidden because not significant (1): Format hello world
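For anyone who wants to sanity-check the ratios, here is a rough sketch that recomputes the per-benchmark speedups and a geometric mean from the rounded timings in the table. The `timings_ns` dict and the helper names are made up for illustration; this is not how pyperf does it internally. It only covers the four visible rows and uses rounded values, so its geometric mean may differ slightly from the figure pyperf reports.

```python
import math

# (ref, pep782) timings in nanoseconds, copied from the rounded values
# in the results table. The hidden "Format hello world" row is not
# included because its timings are not shown.
timings_ns = {
    "Format long format": (1060, 1040),
    "Decode simple": (776, 743),
    "Decode long copy": (1380, 1340),
    "Decode long \\x40": (2700, 2670),
}


def speedup(ref, new):
    """Return how many times faster `new` is compared to `ref`."""
    return ref / new


def geometric_mean(values):
    """Geometric mean, computed via logarithms."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


speedups = {name: speedup(ref, new) for name, (ref, new) in timings_ns.items()}
mean = geometric_mean(speedups.values())

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.2f}x faster")
print(f"Geometric mean (visible rows only): {mean:.2f}x faster")
```

The per-row ratios printed this way round to the same values as the table (1.02x, 1.04x, 1.03x, 1.01x).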
I'm not sure why PEP 782 is faster, but at least it's not slower :-)
I built Python with `gcc -O3` (without PGO, LTO, or CPU isolation).