[WIP] gh-129813, PEP 782: Add PyBytesWriter C API by vstinner · Pull Request #131681 · python/cpython

Microbenchmark on PyBytes_FromFormat() and PyBytes_DecodeEscape() functions.

import pyperf
runner = pyperf.Runner()

import ctypes
from ctypes import pythonapi, py_object
from ctypes import (
    c_int, c_uint, c_long, c_ulong,
    c_size_t, c_ssize_t, c_char_p)

PyBytes_FromFormat = pythonapi.PyBytes_FromFormat
PyBytes_FromFormat.argtypes = (c_char_p,)
PyBytes_FromFormat.restype = py_object

PyBytes_DecodeEscape = pythonapi.PyBytes_DecodeEscape
PyBytes_DecodeEscape.argtypes = (c_char_p, c_size_t, c_char_p, c_size_t, c_char_p)
PyBytes_DecodeEscape.restype = py_object

runner.bench_func('Format hello world', PyBytes_FromFormat, b'Hello %s !', b'world')
fmt = (b'Hell%c' + b' ' * 1024 + b' %s')
runner.bench_func('Format long format', PyBytes_FromFormat, fmt, c_int(ord('o')), b'world')

# Backslashes are doubled so that the C function receives literal
# escape sequences to decode (and the benchmark name keeps its "\x40").
s = b'abc\\ndef\\x40.'
runner.bench_func('Decode simple', PyBytes_DecodeEscape, s, len(s), None, 0, b'unused')
s = b'x' * 1024
runner.bench_func('Decode long copy', PyBytes_DecodeEscape, s, len(s), None, 0, b'unused')
s = b'\\x40' * 1024
runner.bench_func('Decode long \\x40', PyBytes_DecodeEscape, s, len(s), None, 0, b'unused')
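For anyone who wants to check what the two benchmarked calls actually return, here is a minimal sanity check. It reuses the same ctypes prototypes as the script above and is only a sketch: it assumes CPython on a platform where ctypes can call the variadic `PyBytes_FromFormat()` with `argtypes` set for the fixed argument only. The variable names (`formatted`, `escaped`, `decoded`) are mine, and the input is written with doubled backslashes so that the C function sees literal escape sequences.

```python
import ctypes
from ctypes import pythonapi, py_object, c_char_p, c_size_t

# Same prototypes as in the benchmark script.
PyBytes_FromFormat = pythonapi.PyBytes_FromFormat
PyBytes_FromFormat.argtypes = (c_char_p,)  # only the fixed (non-variadic) argument
PyBytes_FromFormat.restype = py_object

PyBytes_DecodeEscape = pythonapi.PyBytes_DecodeEscape
PyBytes_DecodeEscape.argtypes = (c_char_p, c_size_t, c_char_p, c_size_t, c_char_p)
PyBytes_DecodeEscape.restype = py_object

# %s is filled from the extra char* argument.
formatted = PyBytes_FromFormat(b'Hello %s !', b'world')

# The input holds a literal backslash followed by "n" and by "x40";
# the function turns them into a newline and "@".
escaped = b'abc\\ndef\\x40.'
decoded = PyBytes_DecodeEscape(escaped, len(escaped), None, 0, b'unused')

print(formatted)
print(decoded)
```

Passing `None` for `errors` becomes a NULL pointer on the C side; the last two arguments are passed exactly as in the benchmark.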

Results:

| Benchmark | ref | pep782 |
|---|:---:|:---:|
| Format long format | 1.06 us | 1.04 us: 1.02x faster |
| Decode simple | 776 ns | 743 ns: 1.04x faster |
| Decode long copy | 1.38 us | 1.34 us: 1.03x faster |
| Decode long \x40 | 2.70 us | 2.67 us: 1.01x faster |
| Geometric mean | (ref) | 1.02x faster |

Benchmark hidden because not significant (1): Format hello world
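To show how the per-benchmark ratios and a geometric mean relate to the timings, here is a small sketch that recomputes them from the rounded values in the table. It is not how pyperf computes its row (pyperf works on the raw samples, and I assume it may also count the hidden benchmark), so the result is only expected to land close to the reported figure. The names `timings`, `speedups` and `visible_mean` are mine.

```python
from statistics import geometric_mean

# (ref, pep782) timings copied from the results table.
# Each pair uses the same unit, so the unit cancels in the ratio.
timings = {
    'Format long format': (1.06, 1.04),  # us
    'Decode simple': (776, 743),         # ns
    'Decode long copy': (1.38, 1.34),    # us
    'Decode long \\x40': (2.70, 2.67),   # us
}

# A speedup above 1.0 means pep782 is faster than ref.
speedups = {name: ref / new for name, (ref, new) in timings.items()}

# Geometric mean over the four visible rows only.
visible_mean = geometric_mean(list(speedups.values()))

for name, ratio in speedups.items():
    print(f'{name}: {ratio:.2f}x faster')
print(f'Geometric mean (visible rows): {visible_mean:.3f}x')
```

The geometric mean is used rather than the arithmetic mean because the values are ratios: it treats a 2x speedup and a 2x slowdown as cancelling out.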

I'm not sure why PEP 782 is faster, but at least it's not slower :-)

I built Python with gcc -O3 (without PGO, LTO, or CPU isolation).