DEPR: Deprecate pandas.core.datetools by gfyoung · Pull Request #14105 · pandas-dev/pandas (original) (raw)

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation49 Commits5 Checks0 Files changed

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

gfyoung

Title is self-explanatory. Closes #14094.

@codecov-io

Current coverage is 85.26% (diff: 78.26%)

Merging #14105 into master will decrease coverage by <.01%

@@ master #14105 diff @@

Files 139 140 +1
Lines 50562 50604 +42
Methods 0 0
Messages 0 0
Branches 0 0

Powered by Codecov. Last update 59524af...2bb887d

@jorisvandenbossche

Can you leave out the changes in the benchmarks? (Avoiding conflicts with my pr, i will incorporate changes there)

@gfyoung

@jorisvandenbossche : Ah, I didn't see #14099. Sure thing. Once I get Travis to pass, I'll remove the commit. In the meantime, I'll add a reminder to yours.

jreback

deprecated_classes_in_future = ['Term', 'Panel']
# these should be removed from top-level namespace
remove_classes_from_top_level_namespace = ['Expr']
# external modules exposed in pandas namespace
modules = ['np', 'datetime', 'datetools']
modules = ['np', 'datetime', 'sys', 'warnings']

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no don't add

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Saying "don't add" doesn't help since I needed to add them to avoid test failures.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I already told u this on the issue

you are polluting the namespace

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, but you provided no solution to not do it given my situation. Saying "don't do something" with no suggestion of a solution is not helpful whatsoever.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are writing the pr
I am telling you that this is wrong
that is helpful
it's up to you fix it
if you see later I do tell you how though
by using a separate module

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not helpful because I don't know how to proceed with it. Sure, you did provide comments below that provide further guidance, but in isolation, "fixing it" is far from clear.

@gfyoung

jreback

self.removals = removals
def __getattr__(self, name):
if name in dir(self.__class__):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

set this on creation as this is a list so checking will be expensive (make it a set)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's upgrade to a frozenset. Done.

@gfyoung

@jreback

need to add __dir__ to _DeprecatedModule to get something reasonable, IOW the list of removals should be there and the namespace imported from the alts.

In [11]: pd.datetools.alts
Out[11]: 
frozenset({'pandas.tseries.frequencies',
           'pandas.tseries.offsets',
           'pandas.tseries.tools'})

In [12]: pd.datetools.removals
Out[12]: 
frozenset({'bday',
           'bmonthBegin',
           'bmonthEnd',
           'bquarterEnd',
           'businessDay',
           'byearEnd',
           'cbmonthBegin',
           'cbmonthEnd',
           'cday',
           'customBusinessDay',
           'customBusinessMonthBegin',
           'customBusinessMonthEnd',
           'day',
           'monthEnd',
           'quarterEnd',
           'week',
           'yearBegin',
           'yearEnd'})

@jreback

further I am not sure this actually works

In [1]: dir(pandas.datetools)
Out[1]: 
['__class__',
 '__delattr__',
 '__dict__',
 '__doc__',
 '__format__',
 '__getattr__',
 '__getattribute__',
 '__hash__',
 '__init__',
 '__module__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 'alts',
 'deprmod',
 'removals',
 'self_dir']

In [17]: from pandas.datetools import to_datetime
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-17-014abe5066ec> in <module>()
----> 1 from pandas.datetools import to_datetime

ImportError: No module named datetools

something is wrong

@jreback

pls add some tests to verify the actual deprecations themselves (e.g. that you can import the previous things, or a sample of them)

@gfyoung

@jreback : Have you tried running from pandas.datetools import to_datetime on master? It doesn't work either. I don't think that's a problem.

@gfyoung

@jreback : I don't want to overload __dir__ as you described because then I can't differentiate methods that apart of the class itself and methods that are meant to be in removals OR alts. That's the purpose of my first check in __getattr__.

@gfyoung

@jreback : Don't fully understand your comments about testing. Aren't I doing that already?

@jreback

In [2]: pd.__version__
Out[2]: '0.18.1+425.gd26363b'

In [3]: dir(pd.datetools)[0:10]
Out[3]: 
['ABCDataFrame',
 'ABCIndexClass',
 'ABCSeries',
 'AmbiguousTimeError',
 'BDay',
 'BMonthBegin',
 'BMonthEnd',
 'BQuarterBegin',
 'BQuarterEnd',
 'BYearBegin']

@jreback

@gfyoung you can easily override __dir__, you know whats in alt/ removals (well you know as soon as you introspect them), which you can do lazily, e.g. when its needed

@gfyoung

@jreback : Ah, that's a fair point. I can just first check if it's in dir and then introspect.

@jreback

yep. The ideal thing is to replicate as much as possible the existing behavior (and just show a depreaction warning).

@gfyoung

@jorisvandenbossche

@gfyoung The consequence of using a frozenset for the alts is that they have no longer a preferred order. For example, you now get the message to Please use pandas.tseries.frequencies.xxx instead. for many of the offsets, while in our docs / internals we import those from pandas.tseries.offsets

@gfyoung

@jorisvandenbossche : That isn't a property frozenset, but rather one of set in particular. set is unordered by definition. However, your point about the "wrong" import being used is indicative of namespace pollution within each module.

What would you suggest I do then?

@jorisvandenbossche

I didn't want to imply it was a consequence of the 'frozen' aspect of the sets :-). But initially you used a list I think? And performance was the reason to change it to set? Simply using a list would keep the preferred order.
Personally, my preference in this case would go for correctness rather than performance (although what is correct is also debatable ... as you noted correctly the polluted namespaces between frequencies and offsets), alts is in this case a list/set of 3 elements, the in check for that will not be that critical here IMO

@gfyoung

@jorisvandenbossche : It was a general performance consideration that @jreback brought up. I originally used a list, but then he pointed out that set is faster.

While I do see your point about the imports and how they should come from specific places to be consistent with documentation, IMO the code is correct as is and shouldn't have to tailor to the namespace pollution that I pointed out.

@jreback

@gfyoung its a bit more work, but you can figure out where the imports are actually from, e.g. you can actually do the from pandas.tseries.tools import * (for each of the removals), then creating a mapping from the attr to the import. I know its a pain, but I think its necessary.

I actually did something like this (for some code I am working on which is old). I know where things are, and it still took some time / trial and error to figure out the correct imports :)

not to mention the monthEnd = MonthEnd() is really odd (though it IS kind of like a singelton)

@gfyoung

@jreback : from <tseries module> import * won't help since it still will import all other namespaces that are polluting it IINM.

@jorisvandenbossche

I know it is not generic, but just using a list for alts instead of set works for this case. I don't see the need to make it more complicated than that.

@jorisvandenbossche

Actually, can't we get the information from the object itself? (of course in this case where we want the full path it will give the right thing (frequencies or offsets), it will also not work generically for all imports where more top-level paths are used).

In [23]: getattr(pd.dateools, 'BDay')
Out[23]: pandas.tseries.offsets.BusinessDay

gives you that it should be 'offsets' and not frequencies

@gfyoung

@jorisvandenbossche : How do you extract the full class path from this?

getattr(pd.dateools, 'BDay') <class 'pandas.tseries.offsets.BusinessDay'>

Perhaps there's an obvious way, but I don't see one.

If there isn't, then we might need to switch back to a list or the OrderedSet cookbook here 😄

@jreback : Thoughts?

@jorisvandenbossche

__module__ gives me the path:

In [40]: obj = getattr(pd.datetools, 'BDay')

In [41]: obj.__module__
Out[41]: 'pandas.tseries.offsets'

@gfyoung

@jorisvandenbossche

I am going to merge this, we can fix the actual depr warnings with correct modules in a follow-up PR, but then at least the deprecations are included in the rc

gfyoung added a commit to forking-repos/pandas that referenced this pull request

Sep 9, 2016

@gfyoung

Follow-up to pandas-devgh-14105. Uses the 'module' method to correctly determine the location of the alternative module to use.

trbs added a commit to trbs/pandas that referenced this pull request

Sep 12, 2016

@trbs

…eter

trbs added a commit to trbs/pandas that referenced this pull request

Sep 12, 2016

@trbs

jorisvandenbossche pushed a commit that referenced this pull request

Sep 14, 2016

@gfyoung @jorisvandenbossche

Follow-up to gh-14105. Uses the 'module' method to correctly determine the location of the alternative module to use.