Notebook on nbviewer (original) (raw)

Notebook

A numerical integration recipe as discussed here.

In [103]:

from scipy import integrate import pandas as pd import numpy as np

def integrate_method(self, how='trapz', unit='s'): '''Numerically integrate the time series.

@param how: the method to use (trapz by default)
@return 

Available methods:
 * trapz - trapezoidal
 * cumtrapz - cumulative trapezoidal
 * simps - Simpson's rule
 * romb - Romberger's rule

See http://docs.scipy.org/doc/scipy/reference/integrate.html for the method details.
or the source code
https://github.com/scipy/scipy/blob/master/scipy/integrate/quadrature.py
'''
available_rules = set(['trapz', 'cumtrapz', 'simps', 'romb'])
if how in available_rules:
    rule = integrate.__getattribute__(how)
else:
    print('Unsupported integration rule: %s' % (how))
    print('Expecting one of these sample-based integration rules: %s' % (str(list(available_rules))))
    raise AttributeError

result = rule(self.values, self.index.astype(np.int64) / 10**9)
#result = rule(self.values)
return result

pd.TimeSeries.integrate = integrate_method

Example of usage

Sample data

In [96]:

x = np.abs(np.random.randn(10000)) ts = pd.Series(x, pd.date_range(start='2013-05-03', periods=len(x), freq='s'))

Different methods:

Out[65]:

array([ 1.50979200e+00, 2.36680665e+00, 2.91839588e+00, ..., 7.94300517e+03, 7.94440820e+03, 7.94510764e+03])

In [72]:

ts[:1+2**3].integrate('romb')

Out[72]:

array([ 8.43726871e+09, 8.43726872e+09, 8.43726873e+09, 8.43726873e+09, 8.43726874e+09, 8.43726875e+09, 8.43726875e+09, 8.43726876e+09, 8.43726876e+09])

Testing for correctness on predictable series

A straight line covering an area of 12 * 2 = 24

In [104]:

twos = pd.Series(2, pd.date_range(start='2012-01-23', periods=13, freq='1s')) twos.plot() print(twos.integrate('simps'))

We should get the same result even if we sampled the data e.g. every 3 seconds

In [105]:

twos_sparse = pd.Series(2, pd.date_range(start='2012-01-23 00:00', end='2012-01-23 00:12', freq='3s')) twos_sparse.plot() print(twos_sparse.integrate('simps'))

A diagonal, closing a triangle 12 * 12 / 2 = 72

In [89]:

lin = pd.Series(np.arange(13), pd.date_range(start='2012-01-23', periods=13, freq='1s')) lin.plot() print(lin.integrate())

Benchmarking

Overcomes the bottleneck illustrated here

1000 loops, best of 3: 237 µs per loop

Tests


AttributeError Traceback (most recent call last) in () ----> 1 ts.integrate('homers')

in integrate_method(self, how, unit) 24 print('Unsupported integration rule: %s' % (how)) 25 print('Expecting one of these sample-based integration rules: %s' % (str(list(available_rules)))) ---> 26 raise AttributeError 27 28 #result = integrate.trapz(ts.values, ts.index.astype(np.int64) / 10**9)

AttributeError:

Unsupported integration rule: homers Expecting one of these sample-based integration rules: ['romb', 'simps', 'trapz', 'cumtrapz']