[Python-Dev] O(1) deletes from the front of bytearray (was: Re: Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor) (original) (raw)
Nathaniel Smith njs at pobox.com
Wed Oct 12 02:31:04 EDT 2016
- Previous message (by thread): [Python-Dev] Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor
- Next message (by thread): [Python-Dev] O(1) deletes from the front of bytearray (was: Re: Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor)
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Tue, Oct 11, 2016 at 10:53 PM, Serhiy Storchaka <storchaka at gmail.com> wrote:
On 12.10.16 08:03, Nathaniel Smith wrote:
On Tue, Oct 11, 2016 at 9:08 PM, INADA Naoki <songofacandy at gmail.com> wrote:
From Python 3.4, bytearray is good solution for I/O buffer, thanks to #19087 [1]. Actually, asyncio uses bytearray as I/O buffer often. Whoa what?! This is awesome, I had no idea that bytearray had O(1) deletes at the front. I literally reimplemented this myself on type of bytearray for some 3.5-only code recently because I assumed bytearray had the same asymptotics as list, and AFAICT this is totally undocumented. Shouldn't we write this down somewhere? Maybe here? -> https://docs.python.org/3/library/functions.html#bytearray I afraid this is CPython implementation detail (like string concatenation optimization). Other implementations can have O(N) deletes at the front of bytearray.
Well, it shouldn't be :-).
The problem with the string concatenation optimization is that to work, it requires both the use of refcounting GC and that you get lucky with how the underlying malloc implementation happened to lay things out in memory. Obviously it shouldn't be part of the language spec.
But amortized O(1) deletes from the front of bytearray are totally different, and more like amortized O(1) appends to list: there are important use cases[1] that simply cannot be implemented without some feature like this, and putting the implementation inside bytearray is straightforward, deterministic, and more efficiently than hacking together something on top. Python should just guarantee it, IMO.
-n
[1] My use case is parsing HTTP out of a receive buffer. If deleting the first k bytes of an N byte buffer is O(N), then not only does parsing becomes O(N^2) in the worst case, but it's the sort of O(N^2) that random untrusted network clients can trigger at will to DoS your server.
-- Nathaniel J. Smith -- https://vorpus.org
- Previous message (by thread): [Python-Dev] Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor
- Next message (by thread): [Python-Dev] O(1) deletes from the front of bytearray (was: Re: Adding bytes.frombuffer() constructor to PEP 467 (was: [Python-ideas] Adding bytes.frombuffer() constructor)
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]