PERF: improve performance of NDFrame.describe by DataOmbudsman · Pull Request #21274 · pandas-dev/pandas
Sure, thanks for the suggestion. Here are my ASV benchmarks, which also show the improvement.
Setup
# Imports the benchmark class relies on.
import numpy as np
from pandas import DataFrame


class Describe(object):
    goal_time = 0.2

    def setup(self):
        # Fixed seed so every run benchmarks the same data.
        np.random.seed(123)
        self.df = DataFrame({
            'a': np.random.randint(0, 100, int(1e6)),
            'b': np.random.randint(0, 100, int(1e6)),
            'c': np.random.randint(0, 100, int(1e6)),
        })

    def time_series_describe(self):
        self.df['a'].describe()

    def time_dataframe_describe(self):
        self.df.describe()
Results
before | after | ratio | benchmark |
---|---|---|---|
689±10ms | 495±6ms | 0.72 | frame_methods.Describe.time_dataframe_describe |
234±9ms | 166±6ms | 0.71 | frame_methods.Describe.time_series_describe |
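
As a quick sanity check on the table, here is a minimal sketch that recomputes the ratio column from the mean timings, assuming the ratio is the after time divided by the before time. The `summarize` helper and the `results` dict are names made up for this illustration and are not part of ASV or pandas; the ± spread is ignored.

```python
# Mean timings in milliseconds, copied from the results table (± spread dropped).
results = {
    "frame_methods.Describe.time_dataframe_describe": (689, 495),
    "frame_methods.Describe.time_series_describe": (234, 166),
}


def summarize(before_ms, after_ms):
    """Return (ratio rounded to 2 places, percent reduction in run time).

    Assumes ratio = after / before, so values below 1 mean a speedup.
    """
    ratio = after_ms / before_ms
    return round(ratio, 2), round((1 - ratio) * 100)


for name, (before_ms, after_ms) in results.items():
    ratio, reduction = summarize(before_ms, after_ms)
    print(f"{name}: ratio={ratio}, about {reduction}% less time")
```

Both benchmarks come out at a ratio of roughly 0.7, so `describe` takes a little under 30% less time on this data for the Series and the DataFrame case alike.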