read_pickle much slower in v0.13 (not using cPickle when compat=False) · Issue #6899 · pandas-dev/pandas

Description

See discussion here http://stackoverflow.com/questions/23122180/is-pandas-read-pickle-performance-crippled-in-version-0-13

My test dataset is a 1.6 GB pickled file containing about 13 million records.

With 0.12 the file takes 146s to load; with 0.13 it takes 982s (about 6.7x longer).

Using cProfile, you can see that v0.13 always uses the native Python pickle module to load, even when compat=False, whereas 0.12 uses cPickle. Something seems to be wrong with the logic in pandas/compat/pickle_compat.py.
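
For anyone who wants to reproduce the observation, here is a minimal sketch of profiling a pickle load with cProfile. The helper name `profile_load` and the small sample file are hypothetical stand-ins (the real file is 1.6 GB), and the loader is the standard library's `pickle.load`; pass `pandas.read_pickle` wrapped in a function that takes a file object if you want to profile pandas itself.

```python
import cProfile
import io
import os
import pickle
import pstats
import tempfile


def profile_load(path, loader=pickle.load):
    """Run loader(file) under cProfile; return (loaded object, report text)."""
    profiler = cProfile.Profile()
    with open(path, "rb") as fh:
        profiler.enable()
        try:
            obj = loader(fh)
        finally:
            profiler.disable()
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    stats.sort_stats("cumulative").print_stats(10)  # top 10 entries
    return obj, buf.getvalue()


# Hypothetical small stand-in for the real dataset.
sample = {"values": list(range(1000))}
tmp_path = os.path.join(tempfile.mkdtemp(), "foo.pickle")
with open(tmp_path, "wb") as fh:
    pickle.dump(sample, fh, protocol=2)

loaded, report = profile_load(tmp_path)
print(report)
```

When the pure-Python unpickler is in use, the report should be dominated by many calls into functions defined in pickle.py; with the C implementation the load shows up as a single built-in call.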

A workaround (if you know you don't need compatibility mode) is to use cPickle.load(open('foo.pickle', 'rb')) instead of pandas.read_pickle('foo.pickle').
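
The workaround can be sketched as a small helper. This is only an illustration under some assumptions: `load_pickle_fast` is a hypothetical name, the file is opened in binary mode, and the import falls back to the standard `pickle` module where `cPickle` does not exist (on Python 3, `pickle` already uses its C accelerator when available). It bypasses the pandas compatibility logic entirely, so it only helps when the file does not need compat mode.

```python
import os
import tempfile

try:
    import cPickle as fast_pickle  # Python 2: C implementation
except ImportError:
    import pickle as fast_pickle   # Python 3: no separate cPickle module


def load_pickle_fast(path):
    """Load a pickle file directly, skipping pandas.read_pickle."""
    with open(path, "rb") as fh:
        return fast_pickle.load(fh)


# Hypothetical small file standing in for 'foo.pickle'.
data = {"a": [1, 2, 3], "b": "text"}
tmp_path = os.path.join(tempfile.mkdtemp(), "foo.pickle")
with open(tmp_path, "wb") as fh:
    fast_pickle.dump(data, fh, protocol=2)

result = load_pickle_fast(tmp_path)
```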
