[Python-Dev] Impact of Namedtuple on startup time

Steven D'Aprano steve at pearwood.info
Mon Jul 17 12:45:20 EDT 2017


On Mon, Jul 17, 2017 at 02:43:19PM +0200, Antoine Pitrou wrote:

> Hello,
>
> Cost of creating a namedtuple has been identified as a contributor to
> Python startup time.  Not only Python core and the stdlib, but any
> third-party library creating namedtuple classes (there are many of
> them).  An issue was created for this:
> https://bugs.python.org/issue28638

Some time ago, I needed to backport a version of namedtuple to Python 2.4, so I started with Raymond's recipe on ActiveState and modified it to only exec the code needed for __new__. The rest of the class is an ordinary inner class:

    # a short sketch
    def namedtuple(...):
        class Inner(tuple):
            ...
        exec(source, ns)
        Inner.__new__ = ns['__new__']
        return Inner
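To make the shape of that sketch concrete, here is a minimal runnable version of the same idea. It is only an illustration under stated assumptions, not the recipe linked below and not the stdlib code: the name simple_namedtuple is hypothetical, and it omits most of what a real namedtuple provides (_replace, _make, pickling support, docstrings).

```python
import keyword
from operator import itemgetter


def simple_namedtuple(typename, field_names):
    """Hypothetical, stripped-down illustration of the exec-only-__new__ idea."""
    if isinstance(field_names, str):
        field_names = field_names.replace(',', ' ').split()
    fields = tuple(field_names)

    # Validate names before they are spliced into source text for exec.
    for name in (typename,) + fields:
        if not name.isidentifier() or keyword.iskeyword(name):
            raise ValueError('invalid name: %r' % (name,))

    # Everything that does not depend on the field names as syntax
    # is written as an ordinary class body, with no exec involved.
    class Inner(tuple):
        __slots__ = ()
        _fields = fields

        def __repr__(self):
            args = ', '.join('%s=%r' % pair for pair in zip(self._fields, self))
            return '%s(%s)' % (type(self).__name__, args)

        def _asdict(self):
            return dict(zip(self._fields, self))

    # Only __new__ is generated from source, because its signature must
    # spell out the field names as real parameters.
    arglist = ', '.join(fields)
    source = ('def __new__(cls, %s):\n'
              '    return tuple.__new__(cls, [%s])\n' % (arglist, arglist))
    ns = {}
    exec(source, ns)
    # A class body wraps __new__ in staticmethod implicitly; do it by hand here.
    Inner.__new__ = staticmethod(ns['__new__'])

    # Field accessors are plain properties built in a loop.
    for index, name in enumerate(fields):
        setattr(Inner, name, property(itemgetter(index)))

    Inner.__name__ = Inner.__qualname__ = typename
    return Inner


K = simple_namedtuple('K', 'a b c')
print(K(1, 2, 3))  # K(a=1, b=2, c=3)
```

The generated source is two lines long regardless of how many methods the class has, which is the point of the approach: the compile step only has to handle the one piece whose signature varies.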

Here's my fork of Raymond's recipe:

https://code.activestate.com/recipes/578918-yet-another-namedtuple/

Out of curiosity, I took that recipe, updated it to work in Python 3, and compared it to the std lib version. Here are some representative timings:

[steve@ando ~]$ python3.5 -m timeit -s "from collections import namedtuple" "K = namedtuple('K', 'a b c')"
1000 loops, best of 3: 1.02 msec per loop

[steve@ando ~]$ python3.5 -m timeit -s "from nt3 import namedtuple" "K = namedtuple('K', 'a b c')"
1000 loops, best of 3: 255 usec per loop

I think that proves that this approach is viable and can lead to a big speed-up: roughly four times faster in this test (255 usec versus 1.02 msec).
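For anyone who wants to repeat the stdlib half of that measurement from inside a script rather than the command line, something like the following should do. It is a sketch: the helper name time_namedtuple_creation and its default loop counts are my own choices, and the numbers it prints will depend on the Python version and machine.

```python
import timeit


def time_namedtuple_creation(number=200, repeat=3):
    """Return the best observed time, in seconds, to create one namedtuple class."""
    timer = timeit.Timer(
        stmt="K = namedtuple('K', 'a b c')",
        setup="from collections import namedtuple",
    )
    # Each entry from repeat() is the total for `number` runs; take the best.
    best_total = min(timer.repeat(repeat=repeat, number=number))
    return best_total / number


per_call = time_namedtuple_creation()
print('%.1f usec per class creation' % (per_call * 1e6))
```

Swapping the setup string for an import from another module is enough to time an alternative implementation the same way.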

I don't think that merely dropping the _source attribute will save much time. It might save a bit of memory, but in my experiments dropping it only saves about 10µs more. I think the real bottleneck is the cost of exec'ing the entire class.
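One rough way to check that suspicion is to time exec on a whole class definition against exec on the __new__ function alone. The two source strings below are hypothetical, heavily trimmed stand-ins that I wrote for illustration; the real stdlib template is considerably longer, so treat this as a sketch of the method rather than a measurement of the actual code.

```python
import timeit

# Hypothetical stand-in for a template that generates the entire class.
CLASS_SOURCE = '''
from operator import itemgetter

class K(tuple):
    __slots__ = ()
    _fields = ('a', 'b', 'c')

    def __new__(cls, a, b, c):
        return tuple.__new__(cls, (a, b, c))

    def __repr__(self):
        return 'K(a=%r, b=%r, c=%r)' % self

    def _asdict(self):
        return dict(zip(self._fields, self))

    a = property(itemgetter(0))
    b = property(itemgetter(1))
    c = property(itemgetter(2))
'''

# Hypothetical stand-in for a template that generates only __new__.
NEW_SOURCE = '''
def __new__(cls, a, b, c):
    return tuple.__new__(cls, (a, b, c))
'''


def best_exec_time(source, number=200, repeat=3):
    """Return the best observed time, in seconds, for one exec of `source`."""
    # A fresh namespace per call, so each exec compiles and runs from scratch.
    timer = timeit.Timer(lambda: exec(source, {}))
    return min(timer.repeat(repeat=repeat, number=number)) / number


whole_class = best_exec_time(CLASS_SOURCE)
only_new = best_exec_time(NEW_SOURCE)
print('exec whole class: %.1f usec' % (whole_class * 1e6))
print('exec __new__ only: %.1f usec' % (only_new * 1e6))
```

This isolates the exec cost from everything else namedtuple does (name validation, string formatting), so it says nothing about those parts.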

-- Steve


