df.info() incorrectly reports memory use for string data?
Doubling the string size does not change the reported memory use:
In [64]: d=pd.DataFrame(["%06d" %x for x in range(1000000)])
    ...: d.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 1000000 entries, 0 to 999999
Data columns (total 1 columns):
0    1000000 non-null object
dtypes: object(1)
memory usage: 15.3 MB

In [65]: d=pd.DataFrame(["%012d" %x for x in range(1000000)])
    ...: d.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 1000000 entries, 0 to 999999
Data columns (total 1 columns):
0    1000000 non-null object
dtypes: object(1)
memory usage: 15.3 MB
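A possible explanation, offered as a sketch rather than a confirmed diagnosis: an object column holds 8-byte pointers to Python strings, so a shallow count sees only the pointer array and not the strings themselves. The 15.3 MB figure is consistent with that, since 1,000,000 rows × 8 bytes for the pointers plus the same again for the Int64Index is 16,000,000 bytes, about 15.3 MB. The snippet below compares shallow and deep counts using `DataFrame.memory_usage(deep=True)`, which exists in later pandas releases than the one shown in the sessions above. The helper name `string_column_memory`, the column name `s`, and the smaller row count are illustrative assumptions, not part of the original report.

```python
import pandas as pd


def string_column_memory(width, n=10_000):
    """Return (shallow, deep) byte counts for n zero-padded strings.

    Hypothetical helper for illustration; dtype=object is forced so the
    column is stored as pointers to Python str objects.
    """
    values = [str(x).zfill(width) for x in range(n)]
    df = pd.DataFrame({"s": values}, dtype=object)
    # Shallow: size of the pointer array only (index excluded).
    shallow = int(df.memory_usage(index=False, deep=False).sum())
    # Deep: pointer array plus the size of every string object it refers to.
    deep = int(df.memory_usage(index=False, deep=True).sum())
    return shallow, deep


shallow6, deep6 = string_column_memory(6)
shallow12, deep12 = string_column_memory(12)

print("width  6: shallow=%d bytes, deep=%d bytes" % (shallow6, deep6))
print("width 12: shallow=%d bytes, deep=%d bytes" % (shallow12, deep12))
```

The shallow figure should be identical for both widths, while the deep figure grows with the string length. In pandas versions that support it, `d.info(memory_usage="deep")` reports the deep figure directly.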