[Python-Dev] PEP 467: Minor API improvements to bytes, bytearray, and memoryview

Paul Sokolovsky pmiscml at gmail.com
Wed Jun 8 07:26:45 EDT 2016


Hello,

On Wed, 8 Jun 2016 14:05:19 +0300 Serhiy Storchaka <storchaka at gmail.com> wrote:

> On 08.06.16 13:37, Paul Sokolovsky wrote:
> >> The obvious way to create the bytes object of length n is b'\0' *
> >> n.
> >
> > That's very inefficient: it requires allocating useless b'\0', then
> > a generic function to repeat arbitrary memory block N times. If
> > there's a talk of Python to not be laughed at for being SLOW, there
> > would rather be efficient ways to deal with blocks of binary data.

> Do you have any evidence for this claim?

Yes, it's written above; let me repeat it: bytes(n) is (or can be) a plain calloc(1, n) underneath, while b"\0" * n is a more complex algorithm.

> $ ./python -m timeit -s 'n = 10000' -- 'bytes(n)'
> 1000000 loops, best of 3: 1.32 usec per loop
> $ ./python -m timeit -s 'n = 10000' -- 'b"\0" * n'
> 1000000 loops, best of 3: 0.858 usec per loop

I don't know how inefficient CPython's bytes(n) is, or how efficient its repetition is (maybe 1-byte repetitions are optimized into memset()?), but MicroPython (where bytes(n) is truly calloc(1, n)) gives the expected results:

$ ./run-bench-tests bench/bytealloc*
bench/bytealloc:
    3.333s (+00.00%) bench/bytealloc-1-bytes_n.py
    11.244s (+237.35%) bench/bytealloc-2-repeat.py
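
For anyone who wants to repeat the comparison on their own interpreter, here is a minimal sketch using the standard timeit module. The function name compare_alloc and the default loop count are assumptions made for illustration; this is not the content of the bench files above, and the numbers will differ between implementations and builds.

```python
import timeit


def compare_alloc(n=10000, number=100000):
    """Return (usec per bytes(n), usec per b"\\0" * n) for the given n.

    Hypothetical helper for illustration only; not taken from any
    existing benchmark suite.
    """
    # Both spellings must produce the same object value, otherwise
    # comparing their speed would be meaningless.
    assert bytes(n) == b"\0" * n

    env = {"n": n}
    t_ctor = timeit.timeit("bytes(n)", globals=env, number=number)
    t_repeat = timeit.timeit('b"\\0" * n', globals=env, number=number)

    # Convert total seconds into microseconds per loop.
    scale = 1e6 / number
    return t_ctor * scale, t_repeat * scale


if __name__ == "__main__":
    ctor_usec, repeat_usec = compare_alloc()
    print("bytes(n):   %.3f usec per loop" % ctor_usec)
    print('b"\\0" * n: %.3f usec per loop' % repeat_usec)
```

Which of the two comes out faster depends on how the given implementation does allocation and repetition, so treat a single run as a data point only.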

-- 
Best regards,
 Paul                          mailto:pmiscml at gmail.com


