Pandas quantile function very slow · Issue #11623 · pandas-dev/pandas

@pckapps

Description

DataFrame.quantile is almost 10,000 times slower than the equivalent numpy.percentile call when computing row-wise quantiles (axis=1). See the code below:

import time
import pandas as pd
import numpy as np

q = np.array([0.1, 0.4, 0.6, 0.9])
data = np.random.randn(10000, 4)
df = pd.DataFrame(data, columns=['a', 'b', 'c', 'd'])

# Time the row-wise quantiles in pandas
time1 = time.time()
pandas_quantiles = df.quantile(q, axis=1)
time2 = time.time()
print('Pandas took %0.3f ms' % ((time2 - time1) * 1000.0))

# Time the same computation in numpy (percentiles are quantiles * 100)
time1 = time.time()
numpy_quantiles = np.percentile(data, q * 100, axis=1)
time2 = time.time()
print('Numpy took %0.3f ms' % ((time2 - time1) * 1000.0))

# Check that both approaches return the same values
print((pandas_quantiles.values == numpy_quantiles).all())

Output:

Pandas took 15337.531 ms

Numpy took 1.653 ms

True
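
Since both calls return the same values here, one possible workaround is to compute the quantiles with np.percentile and wrap the result in a DataFrame. The following is only a minimal sketch: the helper name rowwise_quantiles is hypothetical (not part of pandas), and it assumes an all-numeric frame without NaNs and a pandas version that provides DataFrame.to_numpy.

```python
import numpy as np
import pandas as pd


def rowwise_quantiles(df, q):
    """Hypothetical helper: row-wise quantiles via np.percentile.

    Returns a frame laid out like df.quantile(q, axis=1): one row per
    quantile, one column per original row label. Assumes numeric data
    with no NaNs.
    """
    q = np.asarray(q, dtype=float)
    # np.percentile expects percentages, so scale the quantiles by 100
    values = np.percentile(df.to_numpy(dtype=float), q * 100, axis=1)
    return pd.DataFrame(values, index=q, columns=df.index)


q = np.array([0.1, 0.4, 0.6, 0.9])
data = np.random.randn(10000, 4)
df = pd.DataFrame(data, columns=['a', 'b', 'c', 'd'])

fast = rowwise_quantiles(df, q)
print(fast.shape)  # (4, 10000)
```

This sidesteps whatever per-row work pandas does internally, at the cost of skipping any NaN or mixed-dtype handling that DataFrame.quantile provides.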