BUG: Assorted DatetimeIndex bugfixes by jbrockmendel · Pull Request #24157 · pandas-dev/pandas (original) (raw)

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

@jbrockmendel

@pep8speaks

Hello @jbrockmendel! Thanks for submitting the PR.

@jbrockmendel

@jbrockmendel

@codecov

@codecov

@jbrockmendel

@jbrockmendel

Azure fail is in developer.rst, unrelated.

@jbrockmendel

@jbrockmendel

@datapythonista is the azure fail caused by something in this PR? It looks unrelated but re-pushing didn’t fix it.

@datapythonista

master was broken (I merged an old PR that didn't have some linting), but it's been fixed in #24160

If you merge master and push again, the CI should pass now.

@jbrockmendel

jreback

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls add to #6581 for the deprecation

"numpy.newaxis (`None`) and integer or boolean "
"arrays are valid indices")
if key is Ellipsis:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you make a more explicit comment. This is likely a more general fix (IOW this is likely broken for all __getitem__ on EA)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why don't this just work as is? why is this not a view (and not a copy)?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why don't this just work as is?

as-is DatetimeIndex[...] loses its freq attribute

why is this not a view (and not a copy)?

Since the copy doesn't have deep=True, self.copy() effectively a view?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you be more explicit in your comment, this is not obvious

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to a hopefully clearer spot a few lines down

ci = pd.CategoricalIndex(dti)
carr = pd.Categorical(dti)
cser = pd.Series(ci)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you parametrize

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will take a look

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not without adding casting code that makes the verbosity a wash.

Once Categorical(DatetimeArray) works (currently raises bc DatetimeArray doesn't have _constructor) then this test can be parametrized over DatetimeIndex/DatetimeArray

@jbrockmendel

pls add to #6581 for the deprecation

Will do. Actually, is this private enough that a deprecation cycle is unnecessary? If so we should just remove the arg now and move the whole func to arrays.datetimes, since that is the only place it is used

@jbrockmendel

@jbrockmendel

TomAugspurger

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

I think it'll apply cleanly aside from imports.

jreback

"numpy.newaxis (`None`) and integer or boolean "
"arrays are valid indices")
if key is Ellipsis:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why don't this just work as is? why is this not a view (and not a copy)?

jreback

"numpy.newaxis (`None`) and integer or boolean "
"arrays are valid indices")
if key is Ellipsis:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you be more explicit in your comment, this is not obvious

dates : generator object
"""
if time_rule is not None:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you need to change any existing tests to account for this?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, tests.indexes.datetimes.test_date_range test_generate and test_generate_cday

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this sufficiently internal that we don't need to do a deprecation cycle?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, let's just drop it entirely (fix the tests). I don't think generate_range is exposed, though maybe add in a note in api breaking changes just to cover.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do. OK with moving this to arrays.datetimes while we're at it?

jreback

- In :meth:`Series.where` with Categorical data, providing an ``other`` that is not present in the categories is deprecated. Convert the categorical to a different dtype or add the ``other`` to the categories first (:issue:`24077`).
- :meth:`Series.clip_lower`, :meth:`Series.clip_upper`, :meth:`DataFrame.clip_lower` and :meth:`DataFrame.clip_upper` are deprecated and will be removed in a future version. Use ``Series.clip(lower=threshold)``, ``Series.clip(upper=threshold)`` and the equivalent ``DataFrame`` methods (:issue:`24203`)
- Passing a `time_rule` to `pandas.tseries.offsets.generate_range` is deprecated and will raise a ``TypeError`` in a future version. Pass an ``offset`` instead (:issue:`24157`)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these don't jive with what's at the top of the PR. missing 2 notes?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it merit a note? All that is changed is an error message. But yes, this does close 11587, I'll update this above.

# non-fixed frequencies are not meaningful for timedelta64;
# we retain that error message
raise e
# GH#11587 if index[0] is NaT _generate_range will raise

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is too cryptic, pls make more readable if you don't actually know what's happening here

elif is_categorical_dtype(data):
# GH#18664 preserve tz in going DTI->Categorical->DTI
# TODO: cases where we need to do another pass through this func,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is the TODO case here, what is an example? do you have an xfailed test?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

example is Categorical(TimedeltaIndex) where we should do another pass through maybe_convert_dtype in order to issue the FutureWarning. I'm assuming there are other corner cases here that will need to be caught/tested.

jreback

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small comments, ping on green.

- ``max_rows`` and ``max_cols`` parameters removed from :class:`HTMLFormatter` since truncation is handled by :class:`DataFrameFormatter` (:issue:`23818`)
- :meth:`read_csv` will now raise a ``ValueError`` if a column with missing values is declared as having dtype ``bool`` (:issue:`20591`)
- The column order of the resultant :class:`DataFrame` from :meth:`MultiIndex.to_frame` is now guaranteed to match the :attr:`MultiIndex.names` order. (:issue:`22420`)
- :func:`pd.offsets.generate_range` argument ``time_rule`` has been removed; use ``offset`` instead (:issue:`24157`)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this probably won't render as its not in the api.rst but ok

- Bug in :class:`Index` where calling ``np.array(dtindex, dtype=object)`` on a timezone-naive :class:`DatetimeIndex` would return an array of ``datetime`` objects instead of :class:`Timestamp` objects, potentially losing nanosecond portions of the timestamps (:issue:`23524`)
- Bug in :class:`Categorical.__setitem__` not allowing setting with another ``Categorical`` when both are undordered and have the same categories, but in a different order (:issue:`24142`)
- Bug in :func:`date_range` where using dates with millisecond resolution or higher could return incorrect values or the wrong number of values in the index (:issue:`24110`)
- Bug in :class:`DatetimeIndex` where constructing a :class:`DatetimeIndex` from a :class:`Categorical` or :class:`CategoricalIndex` would incorrectly drop timezone information (:issue:`18664`)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is the note for #11587?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just pushed, should be just below this line.

@jbrockmendel

jreback

from datetime import date, datetime, timedelta
import functools
import operator
import warnings

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prob going to fail the build as this is not needed

jreback

@jreback

linting issue. ping on green.

@jbrockmendel

@jbrockmendel

@jreback

Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request

Feb 28, 2019

@jbrockmendel @Pingviinituutti

Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request

Feb 28, 2019

@jbrockmendel @Pingviinituutti

Labels