ENH/BUG: Offset.apply dont preserve time by sinhrks · Pull Request #7375 · pandas-dev/pandas (original) (raw)
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Conversation27 Commits1 Checks0 Files changed
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
[ Show hidden characters]({{ revealButtonHref }})
@@ -1208,7 +1395,7 @@ def test_offset(self): |
---|
def test_normalize(self): |
dt = datetime(2007, 1, 1, 3) |
result = dt + BMonthEnd() |
result = dt + BMonthEnd(normalize=True) |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this an API change?
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depends on definition. Whether to regard BMonthEnd
and MonthEnd
inconsistencies are spec or bug. From users point of view, I think it looks a bug as most of other offsets behave differently.
sinhrks changed the title
ENH/BUG: Offset.aplly dont preserve time ENH/BUG: Offset.apply dont preserve time
@@ -115,8 +115,19 @@ Enhancements |
---|
- Implemented ``sem`` (standard error of the mean) operation for ``Series``, |
``DataFrame``, ``Panel``, and ``Groupby`` (:issue:`6897`) |
- All ``offsets`` suppports ``normalize`` keyword to specify whether ``offsets.apply``, ``rollforward`` and ``rollback`` resets time (hour, minute, etc) or not (default ``False``, preserves time) (:issue:`7156`) |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why don't you move these to API section (they technically are a missing 'feature', but just to make it obvious)
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks fine, pls move to API section, then good 2 go
@jreback Moved the description to API, and rebased.
jreback added a commit that referenced this pull request
ENH/BUG: Offset.apply dont preserve time
@jreback While trying to clean up the whatsnew page for 0.14.1, I wanted to rewrite the api change on the offsets a bit (as it is now not clear what did change), but I stumbled on this:
In [26]: d
Out[26]: Timestamp('2014-01-01 09:00:00')
In [27]: d + offsets.MonthEnd()
Out[27]: Timestamp('2014-01-31 09:00:00')
In [28]: d + offsets.MonthEnd(normalize=True)
Out[28]: Timestamp('2014-01-31 00:00:00')
In [29]: d + offsets.Day()
Out[29]: Timestamp('2014-01-02 09:00:00')
In [30]: d + offsets.Day(normalize=True)
Out[30]: Timestamp('2014-01-02 09:00:00')
In [31]: d + offsets.MonthOffset()
Out[31]: Timestamp('2014-01-02 09:00:00')
So for MonthEnd
it seems to work as expected, but with Day
normalize
has no effect, and MonthOffset
gives the same as Day
. Are these bugs, or just due to my misunderstanding (I have to say that I almost never use offsets directly)
yes it looks like a bug with Day which is handled in Tick
Yes, it is a bug. Currently normalize
is effective only if addition or subtraction internally calls apply
. Fixed in #7697.
And the MonthOffset
issue?
And yet something else:
In [55]: offsets.YearOffset()
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-55-b53e59b753d1> in <module>()
----> 1 offsets.YearOffset()
c:\users\vdbosscj\scipy\pandas-joris\pandas\tseries\offsets.pyc in __init__(self
, n, normalize, **kwds)
1331
1332 def __init__(self, n=1, normalize=False, **kwds):
-> 1333 self.month = kwds.get('month', self._default_month)
1334
1335 if self.month < 1 or self.month > 12:
AttributeError: 'YearOffset' object has no attribute '_default_month'
can u separate out the fix for this into a new pr?
@jreback Sure, will do shortly.
@jorisvandenbossche YearOffset
looks an wrapper. It doesn't define effective apply
and other methods. Maybe instanciation should be prohibited?
@sinhrks In theory YearOffset
should work like MonthOffset
. open a new bug report for that pls.
OK. Is it correct DateOffset
, MonthOffset
and YearOffset
can be used in itselves (even though not included in pd.offsets.__all__
). The result should be a timestamp increased one unit of the day, month and year?
@sinhrks that's correct (I personally find that pretty confusing), as ts + 1
does this.
so its a semi-private method (and one should really use MonthEnd
et all)
This was referenced
Jul 9, 2014
I created seperate issues for the YearOffset and MonthOffset issues I reported here. But do I understand correctly that they are not really 'meant to be used' like this?
As I expect MonthOffset
to do the same as DateOffset(months=1)
(and this is something different as MonthEnd
):
In [76]: d + offsets.DateOffset(months=1)
Out[76]: Timestamp('2014-02-01 09:00:00')
In [77]: d + offsets.MonthOffset()
Out[77]: Timestamp('2014-01-02 09:00:00')
In [78]: d + offsets.MonthEnd()
Out[78]: Timestamp('2014-01-31 00:00:00')
@jorisvandenbossche its iffy. I don't see any explicty usage of them in docs/otherwise
I think the idea was that once could have it programatically select the class, e.g.
DateOffset(months=2)
-> MonthEnd(2)
which does seem useful. Maybe need to figure out use/doc it (or clearly mark it as semi-private).
I was just updating my comment above: DateOffset(months=2)
is not the same as MonthEnd(2)
! And I would expect that MonthOffset
would be.
sorry, I mean Month(2)
, e.g. just advance me 2 months.
but Month
does not exist? You only have MonthBegin
, MonthEnd
, MonthOffset
in pd.tseries.offsets
Month
is de-facto MonthOffset(months=n)
The problem it is that
Timestamp('20130105') + pd.offsets.MonthOffset(months=1)
works! but pd.offsets.MonthOffset(1)
is de-factor a 1-day increase as the default arg is day
aha, so the __init__
shouldbe overwritten in MonthOffset
?
(and where is Month
? I get AttributeError: 'module' object has no attribute 'Month'
)
I think the contructor is ok, it needs an apply (kind of like MonthEnd
has , but w/o the month end stuff).
The question remains are DateOffset/MonthOffset/YearOffset
public, or abc methods?
I think this is related as well: #4804
If we decide they are not public, they should get a _
name.
If it is public, I certainly think the constructor is not OK, the default arg should not be day
for MonthOffset
.
For the rest I can't really say if they should be public or not, I find all this offset stuff quite confusing, and the docs are rather scarce.
ok, you have highlited the issues, I don't think its hard to 'fix' them to be correct. Let's do that, then maybe can expand the docs a bit / improve doc-strings.