Version 0.9.0 (October 7, 2012)

This is a major release from 0.8.1 and includes several new features and enhancements along with a large number of bug fixes. New features include vectorized unicode encoding/decoding for Series.str, a to_latex method on DataFrame, more flexible parsing of boolean values, and downloading options data from Yahoo! Finance.

New features

Add encode and decode for unicode handling to vectorized string processing methods in Series.str (GH 1706)

Add DataFrame.to_latex method (GH 1735)

Add convenient expanding window equivalents of all rolling_* ops (GH 1785)

Add Options class to pandas.io.data for fetching options data from Yahoo! Finance (GH 1748, GH 1739)

More flexible parsing of boolean values (Yes, No, TRUE, FALSE, etc.) (GH 1691, GH 1295)

Add level parameter to Series.reset_index

TimeSeries.between_time can now select times across midnight (GH 1871)

Series constructor can now handle generator as input (GH 1679)

DataFrame.dropna can now take multiple axes (tuple/list) as input (GH 924)

Enable skip_footer parameter in ExcelFile.parse (GH 1843)
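A few of the features listed above can be sketched as follows. This is an illustration rather than code from the release: it is written against a recent pandas, so it assumes the Series.expanding() method in place of the 0.9.0-era expanding_* functions, passes Yes/No explicitly through true_values and false_values, and uses made-up sample data.

```python
import io

import pandas as pd

# Vectorized encode/decode on Series.str (GH 1706).
words = pd.Series(["café", "naïve"])
encoded = words.str.encode("utf-8")    # each element becomes bytes
decoded = encoded.str.decode("utf-8")  # and back to text

# Expanding-window mean: every window starts at the first observation.
# Assumption: the modern .expanding() accessor, not the old expanding_* functions.
running_mean = pd.Series([1.0, 2.0, 3.0, 4.0]).expanding().mean()

# between_time with a window that wraps past midnight (GH 1871).
hours = pd.date_range("2012-10-07 20:00", periods=8, freq=pd.offsets.Hour())
ts = pd.Series(range(8), index=hours)
overnight = ts.between_time("23:00", "01:00")

# A generator as Series input (GH 1679).
squares = pd.Series(x * x for x in range(4))

# Boolean-like text parsed to bool; Yes/No are listed explicitly here.
flags = pd.read_csv(
    io.StringIO("flag\nYes\nNo\n"),
    true_values=["Yes"],
    false_values=["No"],
)

print(decoded.tolist())
print(running_mean.tolist())
print(overnight.index.hour.tolist())
print(squares.tolist())
print(flags["flag"].tolist())
```

The between_time call keeps the 23:00, 00:00 and 01:00 observations, which is the across-midnight selection the note describes.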

API changes

The default column names used when header=None and no column names are passed to functions like read_csv have changed to be more Pythonic and amenable to attribute access:

In [1]: import io

In [2]: data = """
   ...: 0,0,1
   ...: 1,1,0
   ...: 0,1,0
   ...: """
   ...: 

In [3]: df = pd.read_csv(io.StringIO(data), header=None)

In [4]: df
Out[4]: 
   0  1  2
0  0  0  1
1  1  1  0
2  0  1  0
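As a small sketch of working with these default labels in a recent pandas: the columns are selected by integer label, and explicit names can be passed when attribute access is wanted. The names "a", "b" and "c" are hypothetical labels chosen for the example.

```python
import io

import pandas as pd

data = "0,0,1\n1,1,0\n0,1,0\n"

# With header=None and no names, a recent pandas labels the columns 0, 1, 2.
df = pd.read_csv(io.StringIO(data), header=None)
first_column = df[0]

# Explicit names (hypothetical labels) make attribute access possible.
named = pd.read_csv(io.StringIO(data), header=None, names=["a", "b", "c"])
total_c = named.c.sum()

print(list(df.columns))
print(first_column.tolist())
print(total_c)
```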

Creating a Series from another Series while passing an index now reindexes the input rather than treating it like an ndarray. Technically improper usages like Series(df[col1], index=df[col2]) that worked before “by accident” (this was never intended) will produce an all-NA Series in some cases. To be perfectly clear:

In [5]: s1 = pd.Series([1, 2, 3])

In [6]: s1
Out[6]: 
0    1
1    2
2    3
dtype: int64

In [7]: s2 = pd.Series(s1, index=["foo", "bar", "baz"])

In [8]: s2
Out[8]: 
foo   NaN
bar   NaN
baz   NaN
dtype: float64
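If the intent is to keep the values and only attach new labels, one option is to pass the underlying array instead of the Series. The sketch below assumes a recent pandas, where Series.to_numpy() is available; it is not part of the 0.9.0 API, in which the .values attribute served the same purpose.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3])
labels = ["foo", "bar", "baz"]

# Passing a Series reindexes by label: none of the new labels exist in s1,
# so every entry comes back missing.
reindexed = pd.Series(s1, index=labels)

# Passing the underlying array keeps the values and attaches the new labels.
relabeled = pd.Series(s1.to_numpy(), index=labels)

print(reindexed.isna().all())
print(relabeled.tolist())
```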

Deprecated day_of_year API removed from PeriodIndex, use dayofyear (GH 1723)
Don’t modify NumPy suppress printoption to True at import time
The internal HDF5 data arrangement for DataFrames has been transposed. Legacy files will still be readable by HDFStore (GH 1834, GH 1824)
Legacy cruft removed: pandas.stats.misc.quantileTS
Use ISO8601 format for Period repr: monthly, daily, and on down (GH 1776)
Empty DataFrame columns are now created as object dtype. This will prevent a class of TypeErrors that was occurring in code where the dtype of a column would depend on the presence of data or not (e.g. a SQL query having results) (GH 1783)
Setting parts of DataFrame/Panel using ix now aligns input Series/DataFrame (GH 1630)
first and last methods in GroupBy no longer drop non-numeric columns (GH 1809)
Resolved inconsistencies in specifying custom NA values in the text parser. na_values of type dict no longer overrides the default NAs unless keep_default_na is set to False explicitly (GH 1657)
DataFrame.dot will now do data alignment, and also works with Series (GH 1915)
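Several of the changes above can be checked with a short script. This is a minimal sketch against a recent pandas with made-up data, not code from the release; the frame, column names and the "missing" marker are all assumptions for illustration.

```python
import io

import pandas as pd

# PeriodIndex exposes dayofyear (GH 1723); 2012-10-07 is day 281 of a leap year.
days = pd.period_range("2012-10-07", periods=2, freq="D").dayofyear

# GroupBy.first keeps the non-numeric "label" column (GH 1809).
df = pd.DataFrame(
    {"key": ["a", "a", "b"], "num": [1, 2, 3], "label": ["x", "y", "z"]}
)
firsts = df.groupby("key").first()

# A dict of na_values adds to the default NA markers (GH 1657) ...
text = "a,b\nNA,missing\n1,2\n"
merged = pd.read_csv(io.StringIO(text), na_values={"b": ["missing"]})

# ... unless keep_default_na=False, in which case "NA" in column a stays text.
strict = pd.read_csv(
    io.StringIO(text), na_values={"b": ["missing"]}, keep_default_na=False
)

# DataFrame.dot accepts a Series whose index matches the frame's columns.
m = pd.DataFrame([[1, 2], [3, 4]], columns=["x", "y"])
w = pd.Series([10, 1], index=["x", "y"])
product = m.dot(w)  # rows: 1*10 + 2*1 and 3*10 + 4*1

print(days.tolist())
print(firsts)
print(merged["a"].isna().tolist(), strict["a"].tolist())
print(product.tolist())
```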

See the full release notes or issue tracker on GitHub for a complete list.

Contributors

A total of 24 people contributed patches to this release. People with a “+” by their names contributed a patch for the first time.

Chang She
Christopher Whelan +
Dan Miller +
Daniel Shapiro +
Dieter Vandenbussche
Doug Coleman +
John-Colvin +
Johnny +
Joshua Leahy +
Lars Buitinck +
Mark O’Leary +
Martin Blais
MinRK +
Paul Ivanov +
Skipper Seabold
Spencer Lyon +
Taavi Burns +
Wes McKinney
Wouter Overmeire
Yaroslav Halchenko
lenolib +
tshauck +
y-p +
Øystein S. Haaland +