BUG: fix Period.asfreq conversion near datetime(1, 1, 1) by jbrockmendel · Pull Request #19650 · pandas-dev/pandas (original) (raw)

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

jbrockmendel

@jorisvandenbossche

Is there a reason you moved the tests?

@jbrockmendel

Is there a reason you moved the tests?

Ongoing process of centralizing and de-duplicating tests. The moved tests cover code near the bug being fixed, so it seemed close enough to in-scope.

@jbrockmendel

@codecov

@jorisvandenbossche

It feels you are rather spreading the Period tests to me? This are tests for the public Period class, and tests related to our scalar classes (Timestamp, Period, Timedelta, Interval) are centralized in the /tests/scalar/ directory. My preference is to keep it that way.
(I think the tslibs tests are currently tests on functionality contained in tslibs we want to test but is not public?)

@jreback

lets create pandas/tests/scalar/period and move these tests to freq.py

jreback

@@ -154,12 +154,30 @@ cdef inline int get_freq_group(int freq) nogil:
return (freq // 1000) * 1000
@cython.cdivision
@cython.cdivision(False) # specifically _dont_ use cdvision GH#19643

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a comment on why

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rather does this actually matter now that you are always using floor division?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rather does this actually matter now that you are always using floor division?

Yes. With cdivision -1 // 7 returns 0. I find it really counter-intuitive, not at all surprising that this bug slipped through.

Parameters
----------
dinfo : date_info*
absdate : int64_t (as returned from get_python_ordinal or absdate_from_ymd)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not a great comment (nor does it follow doc-string conventions)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will change.

Parameters
----------
dinfo : date_info*

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pointer to

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what the asterisk is for (this notation is used in docstrings elsewhere). Have a preferred syntax?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, you are removing this anyhow IIRC? (in follow on)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

Parameters
----------
dinfo : date_info*
abstime : double

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you be more specific here , e.g. what is abstime

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The next line seconds elapsed since beginning of day described by absdate isn't clear?

Returns
-------
code : int (always 0)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

success / error

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's actually a dummy; there is no error-state. I guess just make it return None?

@jbrockmendel

re test location: will change.

@jbrockmendel

@jbrockmendel

@jorisvandenbossche jorisvandenbossche changed the titleFix asfreq conversion BUG: fix Period.asfreq conversion near datetime(1, 1, 1)

Feb 12, 2018

jreback

class TestPeriodFreqConversion(object):
@pytest.mark.parametrize('freq', ['A', 'Q', 'M', 'W', 'B', 'D'])
def test_asfreq_near_zero(self, freq):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a whatsnew note for this
nvm saw that

tup2 = (prev.year, prev.month, prev.day)
assert tup2 < tup1
@pytest.mark.xfail

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a reason

@jreback

small comment. ping on green. ok with moving tests in another PR (to a period scalar sub-module) as discussed.

@jbrockmendel

@jbrockmendel

@jreback

The following asvs benchmarks (if any) failed.
[ 43.09%] ··· Running io.excel.Excel.time_read_excel                  1/3 failed
                   xlwt      failed 

its possible this actually broke this asv

@jbrockmendel

its possible this actually broke this asv

Is there a test this corresponds to?

@jbrockmendel

It executes fine locally, will run asv on it to see if the timing is massively changed.

@jbrockmendel

@jbrockmendel

asv runs fine on both 3.5 and 2.7

@jbrockmendel

@jbrockmendel

Thoughts on this? I assumed the asv concern superseded the "ping on green"

jreback

Parameters
----------
dinfo : date_info*

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, you are removing this anyhow IIRC? (in follow on)

abstime : double
seconds elapsed since beginning of day described by absdate
Returns

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should no longer be needed. IOW this is the point of an errors declaration (and raising)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, will be removing the leftover-from-C bits (and camelCase) in the next pass. For now this PR is focused on just the one bug.

Parameters
----------
dinfo : date_info*
abstime : double

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here

@@ -336,7 +407,7 @@ cdef int dInfoCalc_SetFromAbsTime(date_info *dinfo, double abstime) nogil:
dinfo.hour = hour
dinfo.minute = minute
dinfo.second = second
return 0

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you need a return?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't. Will remove when killing this function off.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just remove the return

jreback

@@ -336,7 +407,7 @@ cdef int dInfoCalc_SetFromAbsTime(date_info *dinfo, double abstime) nogil:
dinfo.hour = hour
dinfo.minute = minute
dinfo.second = second
return 0

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just remove the return

abstime : double
seconds elapsed since beginning of day described by absdate
Returns

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't have Returns for None, remove

Returns
-------
absdate : int days elapsed since datetime(1, 1, 1)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doc-string format, put the expl on the next line

Returns
-------
qtr_freq : int describing the implied quarterly frequency

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

Parameters
----------
ordinal : int64_t

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very odd that these take pointers

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point these functions are translated as directly as possible from the C versions. In future passes I'd like to make them more pythonic too, will need to be careful about perf.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that’s fine

@@ -1,10 +1,32 @@
# -*- coding: utf-8 -*-
import pytest
from pandas.errors import OutOfBoundsDatetime
from pandas._libs.tslibs.frequencies import get_freq

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would not be averse to makeing this a submodule in future
tests/tslibls/period/test_asfreq

@jbrockmendel

@jbrockmendel

@jbrockmendel

I think all the requests have been hit.

@jorisvandenbossche

Is there a reason you are adding the test in /tests/tslibs/test_period_asfreq.py and not in tests/scalar/test_period_asfreq.py ?

@jbrockmendel

Is there a reason you are adding the test in /tests/tslibs/test_period_asfreq.py and not in tests/scalar/test_period_asfreq.py ?

test.tslibs is intended for testing tslibs in isolation. These asfreq tests are largely for period_helper (or were until recently) and so are self-contained. This is the right long-term home for these tests.

@jorisvandenbossche

test.tslibs is intended for testing tslibs in isolation.

Yes, and the tests you added are testing behaviour of the Period class (not a private utility of tslib), which for the rest is tested in tests/scalar/test_period(_asfreq).py ?

@jreback

@jorisvandenbossche I asked @jbrockmendel mendel to move these. not quite happy where the scalar tests are ATM, now whether actualy in tslib or the scalar/period is a matter of opinion here. I agree if these are actually user facing then they should be in the scalar/period/....

however as that doesn't exist yet (well in a minute it will), it was a bit confusing.

@jorisvandenbossche

however as that doesn't exist yet (well in a minute it will)

tests/scalar/test_period_asfreq.py does actually already exist

agree if these are actually user facing then they should be in the scalar/period/....

This is the case here

We can reorganize tests, but let's do that in a separate PR, and add the tests here in the location where similar tests already are? (which is AFAIK tests/scalar/test_period_asfreq.py)

@jreback

@jorisvandenbossche you will see shortly. the test reorg has been going on for quite a while and is moving in the right direction.

jreback

tup1 = (per.year, per.hour, per.day)
prev = per - 1
assert (per - 1).ordinal == per.ordinal - 1

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think as a general rule if you are importing Period, then it should go in pandas/tests/scalar/period/* else it can go here (e.g. you are testing a helper function)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seeing as how the recent addition of tests.scalar.period will necessitate a new look at this, I propose we merge this bug fix and the next pass (later today) can attempt to standardize this convention.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can also simply move the tests there now (as travis has to run again anyhow since you just added a commit)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you update for this comment

@jbrockmendel

@jreback

lgtm. I changed the whatsnew slightly, so needs a rebase. ping on green. can fixup period tests after.

jreback

@jbrockmendel

@jbrockmendel

jreback

tup1 = (per.year, per.hour, per.day)
prev = per - 1
assert (per - 1).ordinal == per.ordinal - 1

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you update for this comment

@jbrockmendel

@jbrockmendel

@jorisvandenbossche

harisbal pushed a commit to harisbal/pandas that referenced this pull request

Feb 28, 2018

@jbrockmendel