ENH: PeriodIndex can accept freq with mult by sinhrks · Pull Request #7832 · pandas-dev/pandas (original) (raw)
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
[ Show hidden characters]({{ revealButtonHref }})
Closes #7811.
- Change
Period.freqandPeriodIndex.freqto store offsets.- Add
freqstrtoPeriod,PeriodIndexcan useDatetimeIndexOpsMixin's logic - Logic and tests for pickles taken in prev versions
- Add
- Perform shift/arithmetic considering freq's mult.
- Test all offsets has accessible
nproeprties (BUG: frequencies.get_freq_code raises an error against offset with n != 1 #10350)
- Test all offsets has accessible
- Explicit tests for
asfreqandto_timestampusing freq with mult. - Ops with different base freq must be prohibited. Change freq comparison to base comparison.
- Fix partial string slicing bug (some tests are commented out because of this)
- Fix order bug (BUG/API: Datetime-like Index.order reset freq #10295, removing temp logic for
PeriodIndex) Decide a policy for legacy freq aliases, like "WK"(handled in DEPR: deprecate legacy offsets #10878)- Update doc.
this is a bit odd though when you operate on these:
you are not propogating the given freq to the actual period?
In [10]: PeriodIndex(start='2014-01', freq='3M', periods=4).tolist()
Out[10]:
[Period('2014-01', 'M'),
Period('2014-04', 'M'),
Period('2014-07', 'M'),
Period('2014-10', 'M')]
In [11]: (PeriodIndex(start='2014-01', freq='3M', periods=4) + 1).tolist()
Out[11]:
[Period('2014-02', 'M'),
Period('2014-05', 'M'),
Period('2014-08', 'M'),
Period('2014-11', 'M')]
@sinhrks can you respond to my comments above?
@jreback Indeed, the frequency given to PeriodIndex is not propagated to each Period within. Do you think it should be propagated? Your earlier response indicates that you do, but I'm checking in to verify.
This unwanted behavior is also shown using period_range:
In [11]: pd.period_range('2-1-2014 00:00', '2-1-2014 3:00', freq='30Min').tolist()
Out[11]:
[Period('2014-02-01 00:00', 'T'), Period('2014-02-01 00:30', 'T'), Period('2014-02-01 01:00', 'T'), Period('2014-02-01 01:30', 'T'), Period('2014-02-01 02:00', 'T'), Period('2014-02-01 02:30', 'T'), Period('2014-02-01 03:00', 'T')]
@blbradley yes that sounds right. When the periods are created during iteration (e.g. the tolist()) then the freq should (and currently is passed).
It should also be true when this PR is implementd.
@jreback Thanks. I started trying to implement that. We'll see how it goes.
@jreback @blbradley OK. In #7811, I initially thought that created PeriodIndex doesn't need custom freq, but it is more natural. I'll consider the implementation.
Period would greatly benefit from being rewritten in Cython (probably as part of tslib) as to take advantage of convert_to_tsobject.
@blbradley this is true. But that's a whole separate project (though not THAT difficult). feel free!
@sinhrks I've been revisiting this. One problem is that Period.asfreq does not handle multiples. I think this could be handled by creating a PeriodIndex the same way as DatetimeIndex in pandas.tseries.resample.asfreq.
@sinhrks I would love to collaborate on this. Please respond soon cause I'm actively working on it.
@jreback This year sometime! I did some code spikes towards it before, but they were not fruitful.
Maybe first change Period to take multiple freqs, then revisit this. I'll look into Period code.
OK, fixed to raise when freq is not positive, and added some tests. Ready for review once again.
Maybe the whatsnew notice can give it a bit more attention? (eg an example).
Further, are there any API changes to note as well? (that freq now returns an Offset object instead of string?)
I can't spot anything else that would stop this going in.
Not sure if my comment on the docstring is wrong or it just got missed - it still states that you can't provide 5T etc
e.g., 'B' for businessday. Must be a singular rule-code (e.g. 5T is not
allowed).
@jorisvandenbossche Yes, updated whatsnew and timeseries doc.
@MaximilianR I think this has been removed from period.pyx after your suggestion. If I miss other remaining, please point the line.
@sinhrks you're right - something funky was going on with my extensions - thanks
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's put this in a separate sub-section
@sinhrks lgtm. couple of comments. merge after fix and green.
@jreback OK, fixed the release note and updated pickle/msgpack.
ok, pls rebase then merge
Rebased and now green. Merging today (Sep 3) if no further comments.
jreback added a commit that referenced this pull request
ENH: PeriodIndex can accept freq with mult
thanks @sinhrks awesome effort!
merging as I need to rebase tz on top of this
Thanks for review and merge:)