Bottleneck

Bottleneck is a collection of fast NumPy array functions written in C.

Let's give it a try. Create a NumPy array:

    >>> import numpy as np
    >>> a = np.array([1, 2, np.nan, 4, 5])

Find the nanmean:

    >>> import bottleneck as bn
    >>> bn.nanmean(a)
    3.0
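For comparison, the same quantity can be sketched in plain NumPy. This is only an illustration of what a NaN-aware mean computes, not Bottleneck's implementation; the variable names are made up for the example.

```python
import numpy as np

a = np.array([1, 2, np.nan, 4, 5])

# An ordinary mean propagates the NaN to the result.
plain_mean = a.mean()

# A NaN-aware mean drops the NaN entries first, then averages the rest.
valid = a[~np.isnan(a)]
nan_aware_mean = valid.sum() / valid.size

print(plain_mean)      # nan
print(nan_aware_mean)  # 3.0
```

The second value is (1 + 2 + 4 + 5) / 4, which agrees with the nanmean result above.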

Moving window mean:

    >>> bn.move_mean(a, window=2, min_count=1)
    array([ 1. , 1.5, 2. , 4. , 4.5])
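To make the roles of window and min_count concrete, here is a slow pure-NumPy sketch of a trailing moving mean for 1-D input. The function name move_mean_reference is hypothetical and not part of Bottleneck. The sketch assumes that each window ends at the current element, that NaNs inside the window are skipped, and that omitting min_count requires a full window of valid values.

```python
import numpy as np

def move_mean_reference(a, window, min_count=None):
    """Slow 1-D sketch of a trailing moving-window mean that skips NaNs."""
    a = np.asarray(a, dtype=np.float64)
    if min_count is None:
        # Assumption: by default every slot in the window must hold a valid value.
        min_count = window
    out = np.full(a.shape, np.nan)
    for i in range(a.size):
        # The window covers the current element and the window - 1 before it.
        chunk = a[max(0, i - window + 1): i + 1]
        valid = chunk[~np.isnan(chunk)]
        if valid.size >= min_count:
            out[i] = valid.mean()
    return out

a = np.array([1, 2, np.nan, 4, 5])
result = move_mean_reference(a, window=2, min_count=1)
print(result)
```

With window=2 and min_count=1 the sketch yields 1, 1.5, 2, 4, 4.5, the same values as the call above. With min_count left at its default, positions whose window holds fewer than two valid values come out as NaN.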

Benchmark

Bottleneck comes with a benchmark suite:

    >>> bn.bench()
    Bottleneck performance benchmark
        Bottleneck 1.3.0.dev0+122.gb1615d7; Numpy 1.16.4
        Speed is NumPy time divided by Bottleneck time
        NaN means approx one-fifth NaNs; float64 used

                        no NaN      no NaN         NaN      no NaN         NaN
                        (100,) (1000,1000) (1000,1000) (1000,1000) (1000,1000)
                        axis=0      axis=0      axis=0      axis=1      axis=1
    nansum                29.7         1.4         1.6         2.0         2.1
    nanmean               99.0         2.0         1.8         3.2         2.5
    nanstd               145.6         1.8         1.8         2.7         2.5
    nanvar               138.4         1.8         1.8         2.8         2.5
    nanmin                27.6         0.5         1.7         0.7         2.4
    nanmax                26.6         0.6         1.6         0.7         2.5
    median               120.6         1.3         4.9         1.1         5.7
    nanmedian            117.8         5.0         5.7         4.8         5.5
    ss                    13.2         1.2         1.3         1.5         1.5
    nanargmin             66.8         5.5         4.8         3.5         7.1
    nanargmax             57.6         2.9         5.1         2.5         5.3
    anynan                10.2         0.3        52.3         0.8        41.6
    allnan                15.1       196.0       156.3       135.8       111.2
    rankdata              45.9         1.2         1.2         2.1         2.1
    nanrankdata           50.5         1.4         1.3         2.4         2.3
    partition              3.3         1.1         1.6         1.0         1.5
    argpartition           3.4         1.2         1.5         1.1         1.6
    replace                9.0         1.5         1.5         1.5         1.5
    push                1565.6         5.9         7.0        13.0        10.9
    move_sum            2159.3        31.1        83.6       186.9       182.5
    move_mean           6264.3        66.2       111.9       361.1       246.5
    move_std            8653.6        86.5       163.7       232.0       317.7
    move_var            8856.0        96.3       171.6       267.9       332.9
    move_min            1186.6        13.4        30.9        23.5        45.0
    move_max            1188.0        14.6        29.9        23.5        46.0
    move_argmin         2568.3        33.3        61.0        49.2        86.8
    move_argmax         2475.8        30.9        58.6        45.0        82.8
    move_median         2236.9       153.9       151.4       171.3       166.9
    move_rank            847.1         1.2         1.4         2.3         2.6
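Each number in the table is a ratio of NumPy time to Bottleneck time, so values above 1 mean Bottleneck was faster. The sketch below shows one way to measure such a ratio yourself with the standard timeit module. It is not how bn.bench is implemented; speed_ratio is a hypothetical helper, and the array size and NaN fraction are chosen only to resemble the table's setup.

```python
import timeit
import numpy as np

def speed_ratio(slow_func, fast_func, arr, repeat=5, number=3):
    """Return the best slow_func time divided by the best fast_func time."""
    t_slow = min(timeit.repeat(lambda: slow_func(arr), repeat=repeat, number=number))
    t_fast = min(timeit.repeat(lambda: fast_func(arr), repeat=repeat, number=number))
    return t_slow / t_fast

rng = np.random.default_rng(0)
arr = rng.standard_normal((1000, 1000))
arr[rng.random(arr.shape) < 0.2] = np.nan  # roughly one-fifth NaNs

try:
    import bottleneck as bn
    candidate = bn.nanmean
except ImportError:
    # Fallback so the sketch still runs without Bottleneck; the ratio is then near 1.
    candidate = np.nanmean

ratio = speed_ratio(np.nanmean, candidate, arr)
print(f"nanmean speed ratio: {ratio:.1f}")
```

Results vary with hardware, array shape, axis, and library versions, so expect numbers that differ from the table.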

You can also run a detailed benchmark for a single function using, for example, the command:

    >>> bn.bench_detailed("move_median", fraction_nan=0.3)

Only arrays with data type (dtype) int32, int64, float32, and float64 are accelerated. All other dtypes result in calls to slower, unaccelerated functions. In the rare case of a byte-swapped input array (e.g. a big-endian array on a little-endian operating system) the function will not be accelerated regardless of dtype.
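The rule above can be checked before calling into Bottleneck. The following NumPy-only sketch tests whether an array has one of the four listed dtypes and native byte order. The helper name is_fast_path_candidate is hypothetical, and the check only mirrors the description in this section; it does not inspect Bottleneck itself.

```python
import numpy as np

# (kind, itemsize) pairs for int32, int64, float32 and float64.
ACCELERATED_KINDS = {("i", 4), ("i", 8), ("f", 4), ("f", 8)}

def is_fast_path_candidate(arr):
    """Return True if arr has a listed dtype and native byte order."""
    dtype = np.asarray(arr).dtype
    return (dtype.kind, dtype.itemsize) in ACCELERATED_KINDS and dtype.isnative

native = np.zeros(3, dtype=np.float64)
half = np.zeros(3, dtype=np.float16)
# Build a byte-swapped float64 array regardless of the host's endianness.
swapped = np.zeros(3, dtype=np.dtype(np.float64).newbyteorder())

print(is_fast_path_candidate(native))   # True
print(is_fast_path_candidate(half))     # False
print(is_fast_path_candidate(swapped))  # False
```

If an array fails the check, converting it first, for example with arr.astype(np.float64), gives a native float64 copy at the cost of the conversion.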

License

Bottleneck is distributed under a Simplified BSD license. See the LICENSE file and LICENSES directory for details.

Install

Bottleneck provides binary wheels on PyPI for the most common platforms. Binary packages are also available on conda-forge. We recommend installing binaries with pip, uv, conda, or a similar tool; this is faster and easier than building from source.

Installing from source

Requirements:

    Bottleneck       Python >=3.9; NumPy 1.16.0+
    Compile          gcc, clang, MinGW or MSVC
    Unit tests       pytest
    Documentation    sphinx, numpydoc

To install Bottleneck on Linux, Mac OS X, et al.:

To install Bottleneck on Windows, first install MinGW and add it to your system path. Then install Bottleneck with the command:

$ python setup.py install --compiler=mingw32

Unit tests

After you have installed Bottleneck, run the suite of unit tests:

    In [1]: import bottleneck as bn

    In [2]: bn.test()
    ============================= test session starts =============================
    platform linux -- Python 3.7.4, pytest-4.3.1, py-1.8.0, pluggy-0.12.0
    hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/home/chris/code/bottleneck/.hypothesis/examples')
    rootdir: /home/chris/code/bottleneck, inifile: setup.cfg
    plugins: openfiles-0.3.2, remotedata-0.3.2, doctestplus-0.3.0, mock-1.10.4, forked-1.0.2, cov-2.7.1, hypothesis-4.32.2, xdist-1.26.1, arraydiff-0.3
    collected 190 items

    bottleneck/tests/input_modification_test.py ........................... [ 14%]
    .. [ 15%]
    bottleneck/tests/list_input_test.py ............................. [ 30%]
    bottleneck/tests/move_test.py ................................. [ 47%]
    bottleneck/tests/nonreduce_axis_test.py .................... [ 58%]
    bottleneck/tests/nonreduce_test.py .......... [ 63%]
    bottleneck/tests/reduce_test.py ....................................... [ 84%]
    ............ [ 90%]
    bottleneck/tests/scalar_input_test.py .................. [100%]

    ========================= 190 passed in 46.42 seconds =========================
    Out[2]: True

If developing in the git repo, simply run pytest (py.test is the older name of the same command).