What’s new in 2.0.1 (April 24, 2023)
These are the changes in pandas 2.0.1. See Release notes for a full changelog including other versions of pandas.
Fixed regressions
- Fixed regression for subclassed Series when constructing from a dictionary (GH 52445)
- Fixed regression in SeriesGroupBy.agg() failing when grouping with categorical data, multiple groupings, as_index=False, and a list of aggregations (GH 52760)
- Fixed regression in DataFrame.pivot() changing Index name of input object (GH 52629)
- Fixed regression in DataFrame.resample() raising on a DataFrame with no columns (GH 52484)
- Fixed regression in DataFrame.sort_values() not resetting index when DataFrame is already sorted and ignore_index=True (GH 52553)
- Fixed regression in MultiIndex.isin() raising TypeError for Generator (GH 52568)
- Fixed regression in Series.describe() showing RuntimeWarning for extension dtype Series with one element (GH 52515)
- Fixed regression when adding a new column to a DataFrame when the DataFrame.columns was a RangeIndex and the new key was hashable but not a scalar (GH 52652)
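Two of the regressions above are easy to check by hand. The following is a minimal sketch, assuming pandas 2.0.1 or later is installed; the frame, the index values, and the variable names are illustrative and do not come from the linked issues.

```python
import pandas as pd

# A frame whose rows are already sorted by "a" but whose index is not 0..n-1.
df = pd.DataFrame({"a": [1, 2, 3]}, index=[10, 20, 30])

# With ignore_index=True the result should get a fresh 0..n-1 index,
# even though no rows actually need to move (GH 52553).
result = df.sort_values("a", ignore_index=True)
print(list(result.index))

# MultiIndex.isin() fed by a generator of tuples rather than a list (GH 52568).
mi = pd.MultiIndex.from_tuples([(1, "a"), (2, "b")])
mask = mi.isin(key for key in [(1, "a")])
print(mask.tolist())
```

On a fixed version the first call should yield the index 0, 1, 2, and the second should return a boolean mask instead of raising TypeError.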
Bug fixes
- Bug in Series.dt.days that would overflow int32 number of days (GH 52391)
- Bug in arrays.DatetimeArray constructor returning an incorrect unit when passed a non-nanosecond numpy datetime array (GH 52555)
- Bug in ArrowExtensionArray with duration dtype overflowing when constructed from data containing numpy NaT (GH 52843)
- Bug in Series.dt.round() when passing a freq of equal or higher resolution compared to the Series would raise a ZeroDivisionError (GH 52761)
- Bug in Series.median() with ArrowDtype returning an approximate median (GH 52679)
- Bug in api.interchange.from_dataframe() was unnecessarily raising on categorical dtypes (GH 49889)
- Bug in api.interchange.from_dataframe() was unnecessarily raising on large string dtypes (GH 52795)
- Bug in pandas.testing.assert_series_equal() where check_dtype=False would still raise for datetime or timedelta types with different resolutions (GH 52449)
- Bug in read_csv() casting PyArrow datetimes to NumPy when dtype_backend="pyarrow" and parse_dates is set causing a performance bottleneck in the process (GH 52546)
- Bug in to_datetime() and to_timedelta() when trying to convert numeric data with an ArrowDtype (GH 52425)
- Bug in to_numeric() with errors='coerce' and dtype_backend='pyarrow' with ArrowDtype data (GH 52588)
- Bug in ArrowDtype.__from_arrow__() not respecting if dtype is explicitly given (GH 52533)
- Bug in DataFrame.describe() not respecting ArrowDtype in include and exclude (GH 52570)
- Bug in DataFrame.max() and related casting different Timestamp resolutions always to nanoseconds (GH 52524)
- Bug in Series.describe() not returning ArrowDtype with pyarrow.float64 type with numeric data (GH 52427)
- Bug in Series.dt.tz_localize() incorrectly localizing timestamps with ArrowDtype (GH 52677)
- Bug in arithmetic between np.datetime64 and np.timedelta64 NaT scalars with units always returning nanosecond resolution (GH 52295)
- Bug in logical and comparison operations between ArrowDtype and numpy masked types (e.g. "boolean") (GH 52625)
- Fixed bug in merge() when merging with ArrowDtype on one side and a NumPy dtype on the other side (GH 52406)
- Fixed segfault in Series.to_numpy() with null[pyarrow] dtype (GH 52443)
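As an illustration of the assert_series_equal() fix (GH 52449), here is a minimal sketch, assuming pandas 2.0.1 or later. The two Series hold the same instant at different resolutions; the date and variable names are made up for the example.

```python
import pandas as pd

# The same timestamp stored at nanosecond and at second resolution.
ns_series = pd.Series(pd.to_datetime(["2023-04-24"])).astype("datetime64[ns]")
s_series = ns_series.astype("datetime64[s]")

# The dtypes differ only in their unit.
print(ns_series.dtype, s_series.dtype)

# With check_dtype=False the comparison is on values only, so this
# should pass silently when the values represent the same instants.
pd.testing.assert_series_equal(ns_series, s_series, check_dtype=False)
comparison_passed = True
```

Leaving check_dtype at its default of True is still expected to raise here, because the two dtypes are genuinely different.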
Other
- DataFrame created from empty dicts had columns of dtype object. It is now a RangeIndex (GH 52404)
- Series created from empty dicts had index of dtype object. It is now a RangeIndex (GH 52404)
- Implemented Series.str.split() and Series.str.rsplit() for ArrowDtype with pyarrow.string (GH 52401)
- Implemented most str accessor methods for ArrowDtype with pyarrow.string (GH 52401)
- Supplying a non-integer hashable key that tests False in api.types.is_scalar() now raises a KeyError for RangeIndex.get_loc(), like it does for Index.get_loc(). Previously it raised an InvalidIndexError (GH 52652).
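The empty-dict and RangeIndex.get_loc() changes above can be observed with a short sketch, assuming pandas 2.0.1 or later. The tuple key (1,) is only an illustrative choice of a hashable value for which api.types.is_scalar() returns False.

```python
import pandas as pd

# Objects built from empty dicts now carry a RangeIndex (GH 52404).
empty_frame = pd.DataFrame({})
empty_series = pd.Series({}, dtype=object)
print(type(empty_frame.columns).__name__, type(empty_series.index).__name__)

# A hashable, non-scalar key: tuples are hashable but is_scalar() is False.
key = (1,)
print(pd.api.types.is_scalar(key))

# RangeIndex.get_loc() should now raise KeyError for such a key (GH 52652).
idx = pd.RangeIndex(3)
try:
    idx.get_loc(key)
    raised = None
except KeyError:
    raised = "KeyError"
print(raised)
```

Code that previously caught InvalidIndexError around RangeIndex.get_loc() may need to catch KeyError instead, since InvalidIndexError is not a KeyError subclass.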
Contributors
A total of 20 people contributed patches to this release. People with a “+” by their names contributed a patch for the first time.
- Alex Malins +
- Chris Carini +
- Dea María Léon
- Joris Van den Bossche
- Luke Manley
- Marc Garcia
- Marco Edward Gorelli
- MarcoGorelli
- Matthew Roeschke
- MeeseeksMachine
- Natalia Mokeeva
- Nirav +
- Pandas Development Team
- Patrick Hoefler
- Richard Shadrach
- Stefanie Molin
- Terji Petersen
- Thomas +
- Thomas Li
- yonashub