Version 0.9.0 (October 7, 2012)
This is a major release from 0.8.1 and includes several new features and enhancements along with a large number of bug fixes. New features include vectorized unicode encoding/decoding for Series.str, a to_latex method on DataFrame, more flexible parsing of boolean values, and support for downloading options data from Yahoo! Finance.
New features
- Add `encode` and `decode` for unicode handling to vectorized string processing methods in Series.str (GH 1706)
- Add `DataFrame.to_latex` method (GH 1735)
- Add convenient expanding window equivalents of all rolling_* ops (GH 1785)
- Add Options class to pandas.io.data for fetching options data from Yahoo! Finance (GH 1748, GH 1739)
- More flexible parsing of boolean values (Yes, No, TRUE, FALSE, etc) (GH 1691, GH 1295)
- Add `level` parameter to `Series.reset_index`
- `TimeSeries.between_time` can now select times across midnight (GH 1871)
- Series constructor can now handle generator as input (GH 1679)
- `DataFrame.dropna` can now take multiple axes (tuple/list) as input (GH 924)
- Enable `skip_footer` parameter in `ExcelFile.parse` (GH 1843)
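
A few of the features listed above can be sketched in a short script. This is an illustration written against the current pandas API rather than the 0.9.0-era one: the sample data and variable names are made up, the `.expanding()` method shown here is the later method form of the `expanding_*` functions, and the hourly index is built with a `Timedelta` frequency only to avoid version-specific frequency aliases.

```python
import io

import pandas as pd

# Vectorized encode/decode through the .str accessor
names = pd.Series(["café", "naïve"])
encoded = names.str.encode("utf-8")      # Series of bytes objects
decoded = encoded.str.decode("utf-8")    # back to text

# Series built directly from a generator
squares = pd.Series(x * x for x in range(4))

# Expanding-window sum (method form of the expanding_* functions)
running_total = squares.expanding().sum()

# between_time with a start later than the end wraps around midnight
index = pd.date_range("2012-10-07", periods=24, freq=pd.Timedelta(hours=1))
hourly = pd.Series(range(24), index=index)
overnight = hourly.between_time("23:00", "01:00")

# Boolean-like text is parsed into a bool column
flags = pd.read_csv(io.StringIO("flag\nTRUE\nfalse\nTrue\n"))
```

Here `overnight` holds the rows stamped 00:00, 01:00 and 23:00, since both endpoints are included by default.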
API changes
- The default column names when `header=None` and no column names are passed to functions like `read_csv` have changed to be more Pythonic and amenable to attribute access:
    In [1]: import io

    In [2]: data = """
       ...: 0,0,1
       ...: 1,1,0
       ...: 0,1,0
       ...: """
       ...:

    In [3]: df = pd.read_csv(io.StringIO(data), header=None)

    In [4]: df
    Out[4]:
       0  1  2
    0  0  0  1
    1  1  1  0
    2  0  1  0
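
When attribute access matters, column names can also be supplied explicitly. A small sketch, assuming the same three-column input as the session above; the names `a`, `b` and `c` are illustrative only:

```python
import io

import pandas as pd

data = "0,0,1\n1,1,0\n0,1,0\n"

# header=None: the first row is data, and pandas generates the column labels
df_default = pd.read_csv(io.StringIO(data), header=None)

# names=...: supply labels that are valid Python identifiers
df_named = pd.read_csv(io.StringIO(data), header=None, names=["a", "b", "c"])

# Identifier-like labels can be used as attributes
column_sum = df_named.c.sum()
```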
- Creating a Series from another Series, passing an index, will cause reindexing to happen inside rather than treating the Series like an ndarray. Technically improper usages like `Series(df[col1], index=df[col2])` that worked before “by accident” (this was never intended) will lead to all NA Series in some cases. To be perfectly clear:
    In [5]: s1 = pd.Series([1, 2, 3])

    In [6]: s1
    Out[6]:
    0    1
    1    2
    2    3
    dtype: int64

    In [7]: s2 = pd.Series(s1, index=["foo", "bar", "baz"])

    In [8]: s2
    Out[8]:
    foo   NaN
    bar   NaN
    baz   NaN
    dtype: float64
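
If the goal is to relabel the values rather than reindex them, one option is to pass the underlying array instead of the Series. A minimal sketch, reusing `s1` from the session above; the variable names `relabeled` and `partial` are illustrative:

```python
import pandas as pd

s1 = pd.Series([1, 2, 3])

# Passing the raw values skips index alignment, so the new labels
# are attached positionally
relabeled = pd.Series(s1.values, index=["foo", "bar", "baz"])

# Passing the Series itself reindexes: labels 0 and 1 match, 5 does not
partial = pd.Series(s1, index=[0, 1, 5])
```

In `partial`, the entry for label 5 is NaN because `s1` has no such label, which is the same mechanism that produces the all-NaN `s2` above.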
- Deprecated `day_of_year` API removed from PeriodIndex, use `dayofyear` (GH 1723)
- Don’t modify NumPy suppress printoption to True at import time
- The internal HDF5 data arrangement for DataFrames has been transposed. Legacy files will still be readable by HDFStore (GH 1834, GH 1824)
- Legacy cruft removed: pandas.stats.misc.quantileTS
- Use ISO8601 format for Period repr: monthly, daily, and on down (GH 1776)
- Empty DataFrame columns are now created as object dtype. This will prevent a class of TypeErrors that was occurring in code where the dtype of a column would depend on the presence of data or not (e.g. a SQL query having results) (GH 1783)
- Setting parts of DataFrame/Panel using ix now aligns input Series/DataFrame (GH 1630)
- `first` and `last` methods in `GroupBy` no longer drop non-numeric columns (GH 1809)
- Resolved inconsistencies in specifying custom NA values in text parser. `na_values` of type dict no longer overrides default NAs unless `keep_default_na` is set to false explicitly (GH 1657)
- `DataFrame.dot` will not do data alignment, and also works with Series (GH 1915)
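
Two of the changes above, the `GroupBy` behavior and the `na_values` handling, can be illustrated briefly. This is a sketch against the current pandas API with made-up sample data; the column names and the `missing` marker are assumptions for illustration.

```python
import io

import pandas as pd

# GroupBy.first keeps the non-numeric "label" column
df = pd.DataFrame(
    {"key": ["a", "a", "b"], "label": ["x", "y", "z"], "value": [1, 2, 3]}
)
first_rows = df.groupby("key").first()

csv_text = "a,b\n1,x\nmissing,NA\n3,z\n"

# A dict of na_values adds per-column markers; the default markers
# (such as "NA") still apply
with_defaults = pd.read_csv(io.StringIO(csv_text), na_values={"a": ["missing"]})

# keep_default_na=False: only the markers given in na_values count as missing
custom_only = pd.read_csv(
    io.StringIO(csv_text), na_values={"a": ["missing"]}, keep_default_na=False
)
```

With the defaults kept, the `NA` in column `b` is read as missing; with `keep_default_na=False` it stays the literal string `NA`.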
See the full release notes or issue tracker on GitHub for a complete list.
Contributors
A total of 24 people contributed patches to this release. People with a “+” by their names contributed a patch for the first time.
- Chang She
- Christopher Whelan +
- Dan Miller +
- Daniel Shapiro +
- Dieter Vandenbussche
- Doug Coleman +
- John-Colvin +
- Johnny +
- Joshua Leahy +
- Lars Buitinck +
- Mark O’Leary +
- Martin Blais
- MinRK +
- Paul Ivanov +
- Skipper Seabold
- Spencer Lyon +
- Taavi Burns +
- Wes McKinney
- Wouter Overmeire
- Yaroslav Halchenko
- lenolib +
- tshauck +
- y-p +
- Øystein S. Haaland +