ENH: Allow absolute precision in assert_almost_equal (#13357) by joaoleveiga · Pull Request #30562 · pandas-dev/pandas (original) (raw)
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Conversation75 Commits1 Checks0 Files changed
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
[ Show hidden characters]({{ revealButtonHref }})
closes #9457
Greetings! This is my first pull-request on an open source project, so I hope I did not miss anything at least too obvious... Thank you, in advance 😄
I do have a few questions:
- Should I add the
versionadded
directive in all docstrings that were changed? - Which "whatsnew" release notes should I add this change to? v1.0.0?
- Should assert_almost_equal / equals should allow access to np.allclose #9457 also be closed, as from my understanding this was also the underlying issue?
- Should I be the one ticking the checkboxes below?
This fixes the issue where:
# Fails because it's doing (1 - .1 / .1001)
assert_almost_equal(0.1, 0.1001, check_less_precise=True)
# Works as intuitively expected
assert_almost_equal(
0.1, 0.1001,
atol=0.01,
)
Commit message below:
This commit makes assert_almost_equal
accept both relative and
absolute precision when comparing numbers, through two new keyword
arguments: rtol
, and atol
, respectively.
Under the hood, _libs.testing.assert_almost_equal
is now callingmath.isclose
, instead of an adaptaion of
numpy.testing.assert_almost_equal.
- closes assert_almost_equal #13357
- tests added / passed
- passes
black pandas
- passes
git diff upstream/master -u -- "*.py" | flake8 --diff
- whatsnew entry
Just noticed that I added the issue number on the commit title, which I guess I was not supposed to. 🤔 Can I push-force to my branch and update it?
Just noticed that I added the issue number on the commit title, which I guess I was not supposed to.
Not an issue at all, don't worry about it.
Greetings! This is my first pull-request on an open source project,
Welcome!
Which "whatsnew" release notes should I add this change to? v1.0.0?
v1.0.0, correct
Should I be the one ticking the checkboxes below?
Yes
Should #9457 also be closed, as from my understanding this was also the underlying issue?
I haven't looked at that yet, but assuming you're right, then yes. also add another checkbox with a - [x] closes #9457
Hello @joaoleveiga! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:
There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻
Comment last updated at 2020-06-21 09:44:27 UTC
Should #9457 also be closed, as from my understanding this was also the underlying issue?
I haven't looked at that yet, but assuming you're right, then yes. also add another checkbox with a
- [x] closes #9457
Looking back at this, it seems like the OP wanted a different issue solved, so I won't add the tag for that issue here.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pandas testing functions or related to the test suite
label
Thanks for the suggestion @jreback ! I was afraid of touching in too many parts of the code, but I definitely agree to having a similar API to numpy.allclose
. Let's see how deprecation works 😉
I would rather:
- deprecate check_less_precise
- add rtol and atol
similar to https://docs.scipy.org/doc/numpy/reference/generated/numpy.allclose.html with the same defaults
this will be a bit more involved, but ultimately is better than adding unfamiliar keywords
Any suggestions to make these changes without breaking current functionality?
If the rtol
and atol
parameters are added with the same defaults, then it will be tricky to check if the user is requesting the "old behavior" with check_less_precise=True/False
.
Should I worry about such a breaking change or is it OK?
Example of what I was doing:
def assert_almost_equal(
left,
right,
check_dtype: Union[bool, str] = "equiv",
check_less_precise: Union[bool, int] = False,
rtol: float = 1e-5,
atol: float = 1e-8,
**kwargs,
):
"""(...)
check_less_precise : bool or int, default False
(...)
.. deprecated:: 1.0.0
Use rtol
and atol
instead to define relative, and absolute
tolerance, respectively. Similar to :func:numpy.allclose
.
rtol : float, default 1e-5
Relative tolerance.
atol : float, default 1e-8
Absolute tolerance.
"""
...
🤔
A few options I thought of:
- With
rtol
/atol
defined asOptional[float]
one could check if the new args are defined first, and still keep thecheck_less_precise
flag working as it is now. - Ditch the
check_less_precise
functionality while keeping the argument so that the deprecation warning is more explicit.
Any ideas?
Thanks!
you can just set the default for check_less_precise = lib.no_default (grep for it)
then if check_less_precise is passed at all you warn (and then set rtol if needed)
change all refs in the pandas codebase and should be good
After looking better at how numpy handles these sort of assertions with numpy.isclose, I realized that perhaps it would be cleaner to just use it, or call math.isclose instead.
WDYT? Does it make sense to replicate math.isclose
in _libs.testing.pyx
?
This would imply deprecating the _libs.testing.decimal_almost_equal
function.
I'm going to commit the changes I made anyway, so that you can review it.
EDIT: BTW, looking at numpy/numpy#10161 has convinced me that going with the math.isclose
algorithm seems like the best option. I'm just unsure on how to deal with the deprecation, to avoid being a nuisance to people that might be using the testing.assert_*_equal
functions.
After looking better at how numpy handles these sort of assertions with numpy.isclose, I realized that perhaps it would be cleaner to just use it, or call math.isclose instead.
WDYT? Does it make sense to replicate
math.isclose
in_libs.testing.pyx
?This would imply deprecating the
_libs.testing.decimal_almost_equal
function.I'm going to commit the changes I made anyway, so that you can review it.
EDIT: BTW, looking at numpy/numpy#10161 has convinced me that going with the
math.isclose
algorithm seems like the best option. I'm just unsure on how to deal with the deprecation, to avoid being a nuisance to people that might be using thetesting.assert_*_equal
functions.
yes not averse to using math.isclose
here. I think we handle all of the NaN / different dtype stuff before then (that is of course why this was written original; numpy didn't handle anything except float/int properly), but that's where these tol's matter anyhow
Should I move the changes in the whatsnew file from the Enhancements section to the Deprecation section?
Meanwhile, I found some tests that use assert_almost_equal
are breaking (and were already breaking prior to this change...). Should I try fixing them in this issue, or open a new one?
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joaoleveiga can you fix up merge conflicts? Someone can take a look after that
@joaoleveiga can you fix up merge conflicts? Someone can take a look after that
Conflicts solved.
Looking at the checks that failed, I was left with a few questions:
- Should I add a new test under
tests.util.test_assert_almost_equal
to test that theFutureWarning
is raised? - Should the warning issue a
FutureWarning
or aDeprecationWarning
? - Should I add the
@pytest.mark.filterwarnings("ignore:The 'check_less_precise':FutureWarning")
decoration to all tests that make use ofcheck_less_precise=
? 😬
Thanks
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Should I add a new test under tests.util.test_assert_almost_equal to test that the FutureWarning is raised?
yes
- Should the warning issue a FutureWarning or a DeprecationWarning?
FutureWarning as you have
- Should I add the @pytest.mark.filterwarnings("ignore:The 'check_less_precise':FutureWarning") decoration to all tests that make use of check_less_precise=?
NO, just change all of the usages that need changing (likely not very many)
@@ -1217,13 +1347,15 @@ def assert_frame_equal( |
---|
check_index_type="equiv", |
check_column_type="equiv", |
check_frame_type=True, |
check_less_precise=False, |
check_less_precise=no_default, |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can just use None here, no?
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. I was just following your suggestion:
you can just set the default for check_less_precise = lib.no_default (grep for it)
Which seemed a better fit when deprecating a parameter. Should I change it to None
or keep with the no_default
? 😄
Do we have internal use cases for this, or are we just/mostly providing it for downstream authors/users?
@joaoleveiga looks like a few merge conflicts have built up - can you resolve them?
@joaoleveiga looks like a few merge conflicts have built up - can you resolve them?
Done! Sorry for the delay, but these past weeks were a bit intense here (Portugal).
this was pretty close last time i looked. can you merge master
this was pretty close last time i looked. can you merge master
@jreback , just merged master. Locally I'm getting some errors on the datetime tests which seem to be somewhat related to locale definitions. Other than that, other tests passed 👌
@joaoleveiga possible typing issue: https://github.com/pandas-dev/pandas/pull/30562/checks?check_run_id=772917451#step:11:10, can you check?
Performing static analysis using mypy pandas/_testing.py:1183: error: "ExtensionArray" has no attribute "asi8"
I have looked at that and I still don't understand why it's triggering. I didn't change that piece of code.
I can add an additionalhasattr(right, "asi8")
if it's OK for you, even though it's not really related to the original issue/PR.WDYT? @TomAugspurger , @jreback
if hasattr(left, "asi8") and hasattr(right, "asi8") and type(right) == type(left):
can you repro locally? I think something likeassert isinstance(left, DatetimeLikeArrayMixin)
(and same for right) will work here
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All green @jreback. I think the docs could use some expanding, but that can be done as a followup.
Use `rtol` and `atol` instead to define relative/absolute |
---|
tolerance, respectively. Similar to :func:`math.isclose`. |
rtol : float, default 1e-5 |
Relative tolerance. |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is user-facing documentation, so these descriptions could be improved.
Could you copy the description from numpy.isclose
, or link to it with a See Also?
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm. minor doc comment & I think @TomAugspurger has some, but as a followon is ok. Thanks for doing this PR, very nice!
@@ -781,6 +781,9 @@ Deprecations |
---|
- The ``squeeze`` keyword in the ``groupby`` function is deprecated and will be removed in a future version (:issue:`32380`) |
- The ``tz`` keyword in :meth:`Period.to_timestamp` is deprecated and will be removed in a future version; use `per.to_timestamp(...).tz_localize(tz)`` instead (:issue:`34522`) |
- :meth:`DatetimeIndex.to_perioddelta` is deprecated and will be removed in a future version. Use ``index - index.to_period(freq).to_timestamp()`` instead (:issue:`34853`) |
- :meth:`util.testing.assert_almost_equal` now accepts both relative and absolute |
precision through the ``rtol``, and ``atol`` parameters, thus deprecating the |
``check_less_precise`` parameter. (:issue:`13357`). |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you also indicate #9457 here (for a followon PR is fine)
fangchenli pushed a commit to fangchenli/pandas that referenced this pull request
@joaoleveiga want to make a PR to enforce this deprecation?
@jbrockmendel , I only had time to tackle this today but I see it was already taken care of :) thanks for the heads up. Cheers
Labels
pandas testing functions or related to the test suite