What’s new in 1.0.2 (March 12, 2020)
These are the changes in pandas 1.0.2. See Release notes for a full changelog including other versions of pandas.
Fixed regressions
Groupby
- Fixed regression in DataFrameGroupBy.agg() and SeriesGroupBy.agg() which were failing on frames with MultiIndex columns and a custom function (GH 31777)
- Fixed regression in groupby(..).rolling(..).apply() (RollingGroupby) where the raw parameter was ignored (GH 31754)
- Fixed regression in rolling(..).corr() when using a time offset (GH 31789)
- Fixed regression in groupby(..).nunique() which was modifying the original values if NaN values were present (GH 31950)
- Fixed regression in DataFrame.groupby raising a ValueError from an internal operation (GH 31802)
- Fixed regression in DataFrameGroupBy.agg() and SeriesGroupBy.agg() calling a user-provided function an extra time on an empty input (GH 31760)
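As a rough illustration of the raw fix for groupby(..).rolling(..).apply(), the sketch below records what type the user function receives. The frame, the column names, and the helper make_recorder are made up for this example; the expectation is that raw=True passes NumPy arrays and raw=False passes Series.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two groups with a numeric column.
df = pd.DataFrame({
    "key": ["a", "a", "a", "b", "b"],
    "val": [1.0, 2.0, 3.0, 4.0, 5.0],
})


def make_recorder(kinds):
    """Return a window function that records the type it is called with."""
    def window_sum(window):
        kinds.append(type(window))
        return window.sum()
    return window_sum


raw_kinds, series_kinds = [], []
grouped = df.groupby("key")["val"]

# raw=True: each window should arrive as a numpy.ndarray.
raw_result = grouped.rolling(2).apply(make_recorder(raw_kinds), raw=True)

# raw=False: each window should arrive as a pandas Series.
series_result = grouped.rolling(2).apply(make_recorder(series_kinds), raw=False)
```

Both calls compute the same rolling sums; only the object handed to the function differs.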
I/O
- Fixed regression in read_csv() in which the encoding option was not recognized with certain file-like objects (GH 31819)
- Fixed regression in DataFrame.to_excel() when the columns keyword argument is passed (GH 31677)
- Fixed regression in ExcelFile where the stream passed into the function was closed by the destructor (GH 31467)
- Fixed regression where read_pickle() raised a UnicodeDecodeError when reading a py27 pickle with MultiIndex column (GH 31988)
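A minimal sketch of passing encoding to read_csv() together with a file-like object. The release note does not say which file-like objects were affected, so io.BytesIO is only a stand-in here, and the CSV payload is invented for the example.

```python
import io

import pandas as pd

# Hypothetical CSV payload, encoded as Latin-1 and wrapped in a
# binary file-like object (a stand-in for the affected objects).
payload = "name,city\nJosé,Málaga\n".encode("latin-1")
buffer = io.BytesIO(payload)

# The encoding option tells read_csv how to decode the bytes.
df = pd.read_csv(buffer, encoding="latin-1")
```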
Reindexing/alignment
- Fixed regression in Series.align() when other is a DataFrame and method is not None (GH 31785)
- Fixed regression in DataFrame.reindex() and Series.reindex() when reindexing with (tz-aware) index and method="nearest" (GH 26683)
- Fixed regression where DataFrame.reindex_like() on a DataFrame subclass raised an AssertionError (GH 31925)
- Fixed regression in DataFrame arithmetic operations with mis-matched columns (GH 31623)
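The tz-aware reindexing case can be sketched as follows. The index, values, and target timestamps are invented for illustration; each target label is expected to pick up the value of the closest existing label.

```python
import pandas as pd

# Hypothetical daily, tz-aware index.
idx = pd.date_range("2020-01-01", periods=3, freq="D", tz="UTC")
s = pd.Series([1, 2, 3], index=idx)

# Target labels that fall between the existing ones.
target = pd.DatetimeIndex(
    ["2020-01-01 10:00", "2020-01-02 20:00"], tz="UTC"
)

# 10:00 on Jan 1 is closest to Jan 1 00:00; 20:00 on Jan 2 is closest to Jan 3 00:00.
nearest = s.reindex(target, method="nearest")
```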
Other
- Fixed regression in joining on DatetimeIndex or TimedeltaIndex to preserve freq in simple cases (GH 32166)
- Fixed regression in Series.shift() with datetime64 dtype when passing an integer fill_value (GH 32591)
- Fixed regression in the repr of an object-dtype Index with bools and missing values (GH 32146)
Indexing with nullable boolean arrays
Previously, indexing with a nullable Boolean array containing NA would raise a ValueError. This is now permitted, with NA being treated as False (GH 31503).
In [1]: s = pd.Series([1, 2, 3, 4])

In [2]: mask = pd.array([True, True, False, None], dtype="boolean")

In [3]: s
Out[3]:
0    1
1    2
2    3
3    4
Length: 4, dtype: int64

In [4]: mask
Out[4]:
<BooleanArray>
[True, True, False, <NA>]
Length: 4, dtype: boolean

pandas 1.0.0-1.0.1

>>> s[mask]
Traceback (most recent call last):
...
ValueError: cannot mask with array containing NA / NaN values

pandas 1.0.2

In [5]: s[mask]
Out[5]:
0    1
1    2
Length: 2, dtype: int64
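For code that should not depend on this behavior, one option is to spell out the NA handling before indexing. The sketch below is an illustration rather than a recommended pattern: it converts the mask with to_numpy(), mapping NA to False, and compares the result with direct indexing.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4])
mask = pd.array([True, True, False, None], dtype="boolean")

# pandas >= 1.0.2: NA in the mask is treated as False.
selected = s[mask]

# Explicit equivalent: replace NA with False and index with a plain
# NumPy boolean array.
explicit = s[mask.to_numpy(dtype=bool, na_value=False)]
```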
Bug fixes
Datetimelike
- Bug in Series.astype() not copying for tz-naive and tz-aware datetime64 dtype (GH 32490)
- Bug where to_datetime() would raise when passed pd.NA (GH 32213)
- Improved error message when subtracting two Timestamp objects that result in an out-of-bounds Timedelta (GH 31774)
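A short sketch of the first two items, using invented values. It assumes that to_datetime() maps pd.NA to NaT, and that astype() to the same datetime64 dtype returns a copy, so writing to the result leaves the original untouched.

```python
import pandas as pd

# pd.NA among the inputs is expected to become NaT instead of raising.
parsed = pd.to_datetime(["2020-03-12", pd.NA])

# astype() to the same datetime64 dtype should return a copy.
original = pd.Series(pd.date_range("2020-01-01", periods=2, freq="D"))
converted = original.astype(original.dtype)

# Modify the converted Series only; the original keeps its first value.
converted.iloc[0] = pd.Timestamp("1999-01-01")
```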
Categorical
- Fixed bug where Categorical.from_codes() improperly raised a ValueError when passed nullable integer codes (GH 31779)
- Fixed bug where Categorical() constructor would raise a TypeError when given a numpy array containing pd.NA (GH 31927)
- Bug in Categorical that would ignore or crash when calling Series.replace() with a list-like to_replace (GH 31720)
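A minimal sketch of Categorical.from_codes() with nullable integer codes. The codes and category labels are made up for the example, and the codes contain no missing values.

```python
import pandas as pd

# Nullable integer ("Int64") codes without any missing values.
codes = pd.array([0, 1, 0], dtype="Int64")

# Each code is a position in the categories list.
cat = pd.Categorical.from_codes(codes, categories=["low", "high"])
```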
I/O
- Using pd.NA with DataFrame.to_json() now correctly outputs a null value instead of an empty object (GH 31615)
- Bug in pandas.json_normalize() when value in meta path is not iterable (GH 31507)
- Fixed pickling of pandas.NA. Previously a new object was returned, which broke computations relying on NA being a singleton (GH 31847)
- Fixed bug in parquet roundtrip with nullable unsigned integer dtypes (GH 31896)
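The pickling and JSON items can be checked with a small sketch. The frame below is invented for illustration; the expectation is that unpickling returns the same pd.NA object and that pd.NA is written as JSON null.

```python
import json
import pickle

import pandas as pd

# Round-tripping pd.NA through pickle should give back the singleton.
restored = pickle.loads(pickle.dumps(pd.NA))

# Hypothetical frame with pd.NA in an object column.
frame = pd.DataFrame({"a": [1, pd.NA]})

# Parse the JSON text so the null value can be inspected directly.
decoded = json.loads(frame.to_json())
```

Identity checks such as `value is pd.NA` depend on the singleton surviving the round trip.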
Experimental dtypes
- Fixed bug in DataFrame.convert_dtypes() for columns that were already using the "string" dtype (GH 31731)
- Fixed bug in DataFrame.convert_dtypes() for series with a mix of integers and strings (GH 32117)
- Fixed bug in DataFrame.convert_dtypes() where BooleanDtype columns were converted to Int64 (GH 32287)
- Fixed bug in setting values using a slice indexer with string dtype (GH 31772)
- Fixed bug where DataFrameGroupBy.first(), SeriesGroupBy.first(), DataFrameGroupBy.last(), and SeriesGroupBy.last() would raise a TypeError when groups contained pd.NA in a column of object dtype (GH 32123)
- Fixed bug where DataFrameGroupBy.mean(), DataFrameGroupBy.median(), DataFrameGroupBy.var(), and DataFrameGroupBy.std() would raise a TypeError on Int64 dtype columns (GH 32219)
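A sketch touching two of these items, with an invented frame: convert_dtypes() on columns that already use "boolean" and "string" dtypes, and a grouped mean over an Int64 column that contains a missing value.

```python
import pandas as pd

# Hypothetical frame using the experimental nullable dtypes.
df = pd.DataFrame({
    "flag": pd.array([True, False, None], dtype="boolean"),
    "label": pd.array(["x", "y", None], dtype="string"),
    "group": ["a", "a", "b"],
    "value": pd.array([1, 2, None], dtype="Int64"),
})

# Columns already on nullable dtypes should keep those dtypes.
converted = df[["flag", "label"]].convert_dtypes()

# The grouped mean on an Int64 column should not raise; group "b"
# holds only a missing value, so its mean is missing as well.
means = df.groupby("group")["value"].mean()
```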
Strings
- Using pd.NA with Series.str.repeat() now correctly outputs a null value instead of raising error for vector inputs (GH 31632)
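A minimal sketch of the vector-input case, assuming the missing value sits in the string data rather than in the repeat counts (the release note does not specify which). The values are made up for the example.

```python
import pandas as pd

# Hypothetical string Series with a missing value.
s = pd.Series(["a", pd.NA, "c"], dtype="string")

# A list of repeat counts, one per element (the "vector input").
repeated = s.str.repeat([3, 2, 1])
```

The missing element is expected to stay missing, while the other elements are repeated by their own count.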
Rolling
- Fixed rolling operations with variable window (defined by time duration) on decreasing time index (GH 32385).
Contributors
A total of 25 people contributed patches to this release. People with a “+” by their names contributed a patch for the first time.
- Anna Daglis +
- Daniel Saxton
- Irv Lustig
- Jan Škoda
- Joris Van den Bossche
- Justin Zheng
- Kaiqi Dong
- Kendall Masse
- Marco Gorelli
- Matthew Roeschke
- MeeseeksMachine
- MomIsBestFriend
- Pandas Development Team
- Pedro Reys +
- Prakhar Pandey
- Robert de Vries +
- Rushabh Vasani
- Simon Hawkins
- Stijn Van Hoey
- Terji Petersen
- Tom Augspurger
- William Ayd
- alimcmaster1
- gfyoung
- jbrockmendel