DEPR: dropping nuisance columns in DataFrame reductions by jbrockmendel · Pull Request #41480 · pandas-dev/pandas (original) (raw)
- closes #xxxx
- tests added / passed
- Ensure all linting tests pass, see here for how to run them
- whatsnew entry
Discussed on this week's call
looks good. can you add a sub-section in deprecations as this is fairly user visible. ping on green.
@@ -9800,6 +9800,21 @@ def _get_data() -> DataFrame:
# Even if we are object dtype, follow numpy and return
# float64, see test_apply_funcs_over_empty
out = out.astype(np.float64)
if numeric_only is None and out.shape[0] != df.shape[1]:
    # columns have been dropped
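For readers following the check above: comparing the length of the reduced result with the number of columns is what signals that something was dropped. A minimal sketch of that idea, assuming a hypothetical helper `reduce_with_drop_warning` and a made-up two-column frame (this is not the code in this PR, which lives inside the DataFrame reduction internals):

```python
import warnings

import pandas as pd


def reduce_with_drop_warning(df, func):
    """Hypothetical helper for illustration only; not the pandas implementation."""
    kept = {}
    for name in df.columns:
        try:
            kept[name] = func(df[name])
        except TypeError:
            # The reduction is not defined for this column, so it is left out.
            continue
    out = pd.Series(kept, dtype="float64")
    if out.shape[0] != df.shape[1]:
        # Fewer results than columns: at least one column was dropped.
        warnings.warn(
            "Dropping of nuisance columns in reductions is deprecated; "
            "select only valid columns before calling the reduction.",
            FutureWarning,
            stacklevel=2,
        )
    return out


# Made-up data: "a" is numeric, "b" holds strings with object dtype.
df = pd.DataFrame(
    {"a": [1, 2, 3], "b": pd.Series(["x", "y", "z"], dtype=object)}
)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = reduce_with_drop_warning(df, pd.Series.mean)
```

Here `result` holds only the mean of `a`, and `caught` records the `FutureWarning` raised because `b` was left out.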
can you add the issue number here and below (this PR number is fine).
This was referenced
May 17, 2021
added a whatsnew subsection. this is actually just the first half of the note im about to push for #41475.
the smart money says ive made a mess of the rst conventions regarding code-block:: ipython vs ipython:: python vs [...]
Deprecated Dropping Nuisance Columns in DataFrame Reductions and DataFrameGroupBy Operations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When calling a reduction (.min, .max, .sum, ...) on a :class:`DataFrame` with
When calling a reduction (.min, .max, .sum, ...) on a :class:`DataFrame` with
---
The default of calling a reduction (.min, .max, .sum, ...) on a :class:`DataFrame` with
``numeric_only=None`` will silently ignore and drop from the result nuisance columns, e.g. a string column in a .mean() reduction.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When calling a reduction (.min, .max, .sum, ...) on a :class:`DataFrame` with
``numeric_only=None`` (the default), columns on which the reduction raises ``TypeError``
are silently ignored and dropped from the result. This behavior is deprecated.
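To avoid relying on the deprecated behavior, the columns to reduce can be chosen explicitly. A small sketch of two ways to do that; the frame `df` and its column names are made up for illustration:

```python
import pandas as pd

# Illustrative data: "a" is numeric, "b" is a string (nuisance) column.
df = pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]})

# Option 1: ask the reduction to consider numeric columns only.
numeric_mean = df.mean(numeric_only=True)

# Option 2: select the valid columns before calling the reduction.
selected_mean = df[["a"]].mean()
```

Both calls return a Series with a single entry for `a`, and neither depends on columns being dropped silently.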
Start a new paragraph with 'This behavior is deprecated'
lgtm. can you rebase once more and ping on green.
TLouf pushed a commit to TLouf/pandas that referenced this pull request
JulianWgs pushed a commit to JulianWgs/pandas that referenced this pull request
zhengruifeng pushed a commit to apache/spark that referenced this pull request
…0 and enabling tests
What changes were proposed in this pull request?
This PR proposes to match the behavior with pandas 2.0.0 and above for stat functions, such as sum, quantile, prod, etc. See pandas-dev/pandas#41480 and pandas-dev/pandas#47500 for more detail.
Why are the changes needed?
To match the behavior to latest pandas.
Does this PR introduce any user-facing change?
Yes, the behaviors for stat funcs are now matched with pandas 2.0.0 and above.
How was this patch tested?
Enabling & updating the existing UTs.
Closes #42526 from itholic/pandas_stat.
Authored-by: itholic <haejoon.lee@databricks.com>
Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>
valentinp17 pushed a commit to valentinp17/spark that referenced this pull request
ragnarok56 pushed a commit to ragnarok56/spark that referenced this pull request