Version 0.9.1 (November 14, 2012) - pandas 2.2.3 documentation

This is a bug fix release from 0.9.0 and includes several new features and enhancements along with a large number of bug fixes. The new features include by-column sort order for DataFrame and Series, improved NA handling for the rank method, masking functions for DataFrame, and intraday time-series filtering for DataFrame.

New features

In [7]: df
Out[7]:
A B C
0 0.276232 -1.087401 -0.673690
1 0.113648 -1.478427 0.524988
2 0.404705 0.577046 -1.715002
3 -1.039268 -0.370647 -1.157892
4 -1.344312 0.844885 1.075770

[5 rows x 3 columns]

In [8]: df[df['A'] > 0]
Out[8]:
A B C
0 0.276232 -1.087401 -0.673690
1 0.113648 -1.478427 0.524988
2 0.404705 0.577046 -1.715002

[3 rows x 3 columns]

If a DataFrame is indexed with a boolean DataFrame of the same shape as the original, the result is a DataFrame with the same index and columns as the original, in which elements that do not satisfy the condition are set to NaN. This is implemented by the new method DataFrame.where. In addition, where accepts an optional other argument that supplies replacement values for those elements.

In [9]: df[df > 0]
Out[9]:
A B C
0 0.276232 NaN NaN
1 0.113648 NaN 0.524988
2 0.404705 0.577046 NaN
3 NaN NaN NaN
4 NaN 0.844885 1.075770

[5 rows x 3 columns]

In [10]: df.where(df > 0)
Out[10]:
A B C
0 0.276232 NaN NaN
1 0.113648 NaN 0.524988
2 0.404705 0.577046 NaN
3 NaN NaN NaN
4 NaN 0.844885 1.075770

[5 rows x 3 columns]

In [11]: df.where(df > 0, -df)
Out[11]:
A B C
0 0.276232 1.087401 0.673690
1 0.113648 1.478427 0.524988
2 0.404705 0.577046 1.715002
3 1.039268 0.370647 1.157892
4 1.344312 0.844885 1.075770

[5 rows x 3 columns]
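
The session above can be reproduced as a small standalone script. This is only a sketch: the construction of df is not shown in this section, so the frame is rebuilt here from the rounded values printed in the outputs, and the names positive, same and magnitudes are chosen for illustration.

```python
import pandas as pd

# Assumption: df is rebuilt from the rounded values printed above,
# because the original construction is not shown in this section.
df = pd.DataFrame(
    {
        "A": [0.276232, 0.113648, 0.404705, -1.039268, -1.344312],
        "B": [-1.087401, -1.478427, 0.577046, -0.370647, 0.844885],
        "C": [-0.673690, 0.524988, -1.715002, -1.157892, 1.075770],
    }
)

# Keep elements where the condition holds; the rest become NaN.
positive = df.where(df > 0)

# Indexing with a boolean DataFrame goes through the same masking logic.
same = df[df > 0]

# The optional second argument (other) supplies the replacement values,
# here the negated frame, so every element ends up non-negative.
magnitudes = df.where(df > 0, -df)

print(positive)
print(magnitudes)
```

Passing a frame as other, as in the last call, takes the replacement for each position from the matching position in that frame.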

Furthermore, where now aligns the input boolean condition (ndarray or DataFrame) with the calling DataFrame, so that partial selection with setting is possible. This is analogous to partial setting via .ix, but it operates on the contents rather than the axis labels.

In [12]: df2 = df.copy()

In [13]: df2[df2[1:4] > 0] = 3

In [14]: df2
Out[14]:
A B C
0 0.276232 -1.087401 -0.673690
1 3.000000 -1.478427 3.000000
2 3.000000 3.000000 -1.715002
3 -1.039268 -0.370647 -1.157892
4 -1.344312 0.844885 1.075770

[5 rows x 3 columns]
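
A minimal standalone sketch of the partial setting shown above follows. It assumes the frame can be rebuilt from the rounded values printed in the outputs; the name cond is introduced only for illustration. Note that .ix is no longer available in current pandas releases, while the boolean-frame assignment itself is unchanged.

```python
import pandas as pd

# Assumption: df is rebuilt from the rounded values printed above.
df = pd.DataFrame(
    {
        "A": [0.276232, 0.113648, 0.404705, -1.039268, -1.344312],
        "B": [-1.087401, -1.478427, 0.577046, -0.370647, 0.844885],
        "C": [-0.673690, 0.524988, -1.715002, -1.157892, 1.075770],
    }
)

df2 = df.copy()

# The condition covers only rows 1 to 3 of the frame.
cond = df2[1:4] > 0

# The condition is aligned to df2 before assignment, so rows that are
# absent from cond (rows 0 and 4) are left untouched.
df2[cond] = 3

print(df2)
```

Only the positive entries in rows 1 to 3 are overwritten; positive values outside the condition's rows, such as B and C in row 4, keep their original values.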

DataFrame.mask is the boolean inverse of where: it replaces the elements for which the condition is True (with NaN by default) and keeps the rest.

In [15]: df.mask(df <= 0)
Out[15]:
A B C
0 0.276232 NaN NaN
1 0.113648 NaN 0.524988
2 0.404705 0.577046 NaN
3 NaN NaN NaN
4 NaN 0.844885 1.075770

[5 rows x 3 columns]
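
The relationship between mask and where can be checked directly. The sketch below again assumes the frame is rebuilt from the rounded values printed above; the names masked and kept are illustrative.

```python
import pandas as pd

# Assumption: df is rebuilt from the rounded values printed above.
df = pd.DataFrame(
    {
        "A": [0.276232, 0.113648, 0.404705, -1.039268, -1.344312],
        "B": [-1.087401, -1.478427, 0.577046, -0.370647, 0.844885],
        "C": [-0.673690, 0.524988, -1.715002, -1.157892, 1.075770],
    }
)

# mask hides the elements where the condition is True ...
masked = df.mask(df <= 0)

# ... while where keeps the elements where the condition is True,
# so negating the condition gives the same result.
kept = df.where(~(df <= 0))

print(masked.equals(kept))
```

For this frame, which contains no missing values, the negated condition is the same as df > 0, so the result also matches df.where(df > 0) from the earlier example.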

API changes

0 00001 1 5
1 00002 2 6

[2 rows x 3 columns]

See the full release notes or issue tracker on GitHub for a complete list.

Contributors

A total of 11 people contributed patches to this release. People with a “+” by their names contributed a patch for the first time.