[Python-Dev] io.BytesIO slower than monkey-patching io.RawIOBase

Nick Coghlan ncoghlan at gmail.com
Tue Jul 17 07:48:44 CEST 2012

On Tue, Jul 17, 2012 at 2:57 PM, John O'Connor <jxo6948 at rit.edu> wrote:

> The second approach is consistently 10-20% faster than the first one (depending on input) for trunk Python 3.3. I think the difference is that StringIO spends extra time reallocating memory during the write loop as it grows, whereas bytes.join computes the allocation size first since it already knows the final length.

BytesIO is actually missing an optimisation that is already used in StringIO: the StringIO C implementation uses a fragment accumulator internally, and collapses that into a single string object when getvalue() is called. BytesIO is still using the old "resize-the-buffer-as-you-go" strategy, and thus ends up repeatedly reallocating the buffer as the data sequence grows incrementally.

It should be optimised to work the same way StringIO does (which is effectively the same way that the monkeypatched version works).
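
As a rough illustration of the fragment-accumulator idea, here is a minimal pure-Python sketch. The class name FragmentAccumulator is made up for this example, and it only approximates the strategy described above; the real StringIO accumulator is written in C and differs in detail. Writes just append to a list, and the single large allocation is deferred until getvalue() is called.

```python
import io


class FragmentAccumulator:
    """Hypothetical write-only buffer that defers concatenation (illustration only)."""

    def __init__(self):
        self._fragments = []  # pending chunks, not yet joined

    def write(self, data):
        # Store an immutable copy so later changes to the caller's
        # buffer cannot alter what has already been written.
        fragment = bytes(data)
        self._fragments.append(fragment)
        return len(fragment)

    def getvalue(self):
        # Collapse all fragments in one join, which can size the result
        # up front, and keep the joined result so repeated calls are cheap.
        if len(self._fragments) > 1:
            self._fragments = [b"".join(self._fragments)]
        return self._fragments[0] if self._fragments else b""


chunks = [b"spam", b"eggs", b"ham"] * 1000

acc = FragmentAccumulator()
for chunk in chunks:
    acc.write(chunk)

# Reference result from the standard library implementation.
ref = io.BytesIO()
for chunk in chunks:
    ref.write(chunk)

assert acc.getvalue() == ref.getvalue()
```

The sketch ignores seeking, reading and overwriting in the middle of the buffer, all of which a real BytesIO has to support, so it shows only the append-then-join part of the approach.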

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
