DataFrame.itertuples() incorrectly determines when plain tuples should be used · Issue #28282 · pandas-dev/pandas

Code Sample, a copy-pastable example if possible

    >>> import pandas, sys
    >>> sys.version
    '3.6.7 (default, Oct 25 2018, 09:16:13) \n[GCC 5.4.0 20160609]'
    >>> pandas.__version__
    '0.25.1'
    >>> df = pandas.DataFrame([{f"foo_{i}": f"bar_{i}" for i in range(255)}])
    >>> df.itertuples(index=False)
    ...
    SyntaxError: more than 255 arguments

The issue seems to have been caused (or revealed) by this commit, which removed the try/except block around the namedtuple class creation.

FWIW, this issue is not reproducible in version 0.24.2, and it is also not a problem in Python 3.7+, where the limit on the number of arguments that can be passed to a function has been removed (AFAIK).
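For anyone who wants to check the Python-version dependence without pandas installed, here is a minimal sketch that builds the same kind of namedtuple directly with the standard library. It assumes the failure comes purely from collections.namedtuple() generating a __new__() with too many parameters; the `created` flag is only there for illustration.

```python
import collections
import sys

# Same 255 column names as in the report, but without pandas.
fields = [f"foo_{i}" for i in range(255)]

try:
    # namedtuple() generates __new__(_cls, foo_0, ..., foo_254): 256 parameters.
    Row = collections.namedtuple("Pandas", fields, rename=True)
    created = True
except SyntaxError as exc:
    # Expected on Python 3.6 and earlier ("more than 255 arguments").
    created = False
    print(f"namedtuple creation failed: {exc}")

print(sys.version_info[:2], "namedtuple created:", created)
```

If the assumption holds, this should fail on Python 3.6 and succeed on Python 3.7+.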

Problem description

The condition in the itertuples() method does not correctly determine when plain tuples should be used instead of named tuples.

This is how the named tuple class template defines the __new__() method (in Python 3.6 at least):

    """
    ...
        def __new__(_cls, {arg_list}):
    ...
    """

If 255 column names are given, the total number of arguments to __new__() will be 256 because of the extra _cls parameter, which exceeds the 255-argument limit and causes the syntax error.
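To make the off-by-one concrete, here is a rough sketch of a check that counts the extra _cls parameter. The helper name `namedtuple_is_safe` and the constant `MAX_PARAMS_BEFORE_PY37` are hypothetical and not part of pandas; the sketch also assumes the 255-parameter limit applies to every interpreter older than 3.7.

```python
import sys

# Assumption: interpreters older than 3.7 reject a function definition
# with more than 255 parameters.
MAX_PARAMS_BEFORE_PY37 = 255


def namedtuple_is_safe(n_fields, version_info=sys.version_info):
    """Return True if a namedtuple with n_fields fields can be created."""
    if tuple(version_info[:2]) >= (3, 7):
        return True
    # __new__(_cls, field_0, ..., field_{n-1}) takes n_fields + 1 parameters.
    return n_fields + 1 <= MAX_PARAMS_BEFORE_PY37


print(namedtuple_is_safe(254, (3, 6)))  # 255 parameters in total
print(namedtuple_is_safe(255, (3, 6)))  # 256 parameters in total
```

Under these assumptions, 254 fields is the largest count that still works on 3.6, so a check in itertuples() would have to count the index column (if included) plus all data columns plus one.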