[Python-Dev] PEP 467: Minor API improvements to bytes, bytearray, and memoryview
Paul Sokolovsky pmiscml at gmail.com
Wed Jun 8 07:26:45 EDT 2016
Hello,
On Wed, 8 Jun 2016 14:05:19 +0300 Serhiy Storchaka <storchaka at gmail.com> wrote:
> On 08.06.16 13:37, Paul Sokolovsky wrote:
> >> The obvious way to create the bytes object of length n is b'\0' * n.
> >
> > That's very inefficient: it requires allocating useless b'\0', then
> > a generic function to repeat arbitrary memory block N times. If
> > there's a talk of Python to not be laughed at for being SLOW, there
> > would rather be efficient ways to deal with blocks of binary data.
>
> Do you have any evidence for this claim?
Yes, it's written above; let me repeat it: bytes(n) is (or can be) a plain calloc(1, n) under the hood, while b"\0" * n is a more complex algorithm.
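To make the comparison concrete, here is a minimal sketch showing that the two spellings produce equal objects, so they differ only in how the interpreter builds the result. The variable names are illustrative, and nothing here says which one a given implementation makes faster.

```python
n = 10000

zeros_ctor = bytes(n)        # zero-filled bytes object of length n
zeros_repeat = b"\0" * n     # one-byte literal repeated n times

# Both spellings yield the same value; only the construction path differs.
same = zeros_ctor == zeros_repeat
print(len(zeros_ctor), len(zeros_repeat), same)
```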
> $ ./python -m timeit -s 'n = 10000' -- 'bytes(n)'
> 1000000 loops, best of 3: 1.32 usec per loop
> $ ./python -m timeit -s 'n = 10000' -- 'b"\0" * n'
> 1000000 loops, best of 3: 0.858 usec per loop
I don't know how inefficient CPython's bytes(n) is or how efficient its repetition is (maybe 1-byte repetitions are optimized into memset()?), but MicroPython (where bytes(n) is truly calloc(1, n)) gives the expected results:
$ ./run-bench-tests bench/bytealloc*
bench/bytealloc:
    3.333s (+00.00%) bench/bytealloc-1-bytes_n.py
    11.244s (+237.35%) bench/bytealloc-2-repeat.py
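For anyone who wants to repeat the measurement on their own interpreter, the following is a rough sketch built on the standard timeit module. The compare() helper and its defaults are hypothetical (they are not the contents of the bench/bytealloc scripts), and the absolute numbers will depend on the implementation and build.

```python
import timeit


def compare(n=10000, number=100000, repeat=3):
    """Return the best-of-`repeat` time per call, in seconds, for both spellings."""
    results = {}
    statements = (
        ("bytes(n)", "bytes(n)"),
        ("repeat", 'b"\\0" * n'),   # statement text is: b"\0" * n
    )
    for label, stmt in statements:
        timer = timeit.Timer(stmt, globals={"n": n})
        # Take the minimum over several runs, as `python -m timeit` does.
        best = min(timer.repeat(repeat=repeat, number=number))
        results[label] = best / number
    return results


if __name__ == "__main__":
    for label, secs in compare().items():
        print(f"{label:10s} {secs * 1e6:.3f} usec per loop")
```

Passing n through globals keeps it a runtime variable, so the repetition is actually executed on every loop rather than being folded into a constant at compile time.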
--
Best regards,
 Paul                          mailto:pmiscml at gmail.com