BUG: to_datetime with M or Y unit and non-round float by jbrockmendel · Pull Request #50301 · pandas-dev/pandas (original) (raw)

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation22 Commits10 Checks0 Files changed

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

closes #xxxx (Replace xxxx with the GitHub issue number)
Tests added and passed if fixing a bug or adding a new feature
All code checks passed.
Added type annotations to new arguments/methods/functions.
Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

If non-round values aren't allowed, would this supersede #50183?

If non-round values aren't allowed, would this supersede #50183?

Its only for unit="M" or unit="Y"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couple of minor comments, other than that looks good

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines +346 to +356

if is_ym and not fval.is_integer():
# Analogous to GH#47266 for Timestamp
if is_raise:
raise ValueError(
f"Conversion of non-round float with unit={unit} "
"is ambiguous"
)
elif is_ignore:
raise AssertionError
iresult[i] = NPY_NAT
continue

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry to block this, but is there a test that hits this?

I can see tests when val is a non-round float, but what about when it's a string containing a non-round float (e.g. '1.5'), which I believe is what would reach this branch?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rebecca-palmer added a commit to rebecca-palmer/pandas that referenced this pull request

Jan 18, 2023

mroeschke pushed a commit that referenced this pull request

Jan 20, 2023

add test for float to_datetime near overflow bounds
fix float to_datetime near overflow bounds
fix typo and formatting
fix formatting
fix test to not fail on rounding differences
don't use approximate comparison on datetimes, it doesn't work
also can't convert datetime to float
match dtypes
TST: don't try to use non-integer years (see #50301)
TST: don't cross an integer

(tsmax_in_days happens to be close to an integer, and this is a test of rounding)

PERF: remove unnecessary copy
add whatsnew

pooja-subramaniam pushed a commit to pooja-subramaniam/pandas that referenced this pull request

Jan 25, 2023

add test for float to_datetime near overflow bounds
fix float to_datetime near overflow bounds
fix typo and formatting
fix formatting
fix test to not fail on rounding differences
don't use approximate comparison on datetimes, it doesn't work
also can't convert datetime to float
match dtypes
TST: don't try to use non-integer years (see pandas-dev#50301)
TST: don't cross an integer

(tsmax_in_days happens to be close to an integer, and this is a test of rounding)

PERF: remove unnecessary copy
add whatsnew

QuLogic added a commit to QuLogic/pandas that referenced this pull request

Feb 21, 2024

In Pandas 1.5.3, the float(val) cast was inline to the cast_from_unit call in array_with_unit_to_datetime. This caused the intermediate (unnamed) value to be a Python float.

Since pandas-dev#50301, an temporary variable was added to avoid multiple casts, but with explicit type cdef float, which defines a Cython float. This type is 32-bit, and causes a loss of precision, and a regression in parsing from 1.5.3.

Since cast_from_unit takes an object, not a more specific Cython type, remove the explicit type from the temporary fval variable entirely. This will cause it to be a (64-bit) Python float, and thus not lose precision.

Fixes pandas-dev#57051

QuLogic added a commit to QuLogic/pandas that referenced this pull request

Feb 21, 2024

In Pandas 1.5.3, the float(val) cast was inline to the cast_from_unit call in array_with_unit_to_datetime. This caused the intermediate (unnamed) value to be a Python float.

Since pandas-dev#50301, a temporary variable was added to avoid multiple casts, but with explicit type cdef float, which defines a Cython float. This type is 32-bit, and causes a loss of precision, and a regression in parsing from 1.5.3.

Fixes pandas-dev#57051

QuLogic added a commit to QuLogic/pandas that referenced this pull request

Feb 22, 2024

In Pandas 1.5.3, the float(val) cast was inline to the cast_from_unit call in array_with_unit_to_datetime. This caused the intermediate (unnamed) value to be a Python float.

So widen the explicit type of the temporary fval variable to (64-bit) double, which will not lose precision.

Fixes pandas-dev#57051

QuLogic added a commit to QuLogic/pandas that referenced this pull request

Feb 22, 2024

In Pandas 1.5.3, the float(val) cast was inline to the cast_from_unit call in array_with_unit_to_datetime. This caused the intermediate (unnamed) value to be a Python float.

So widen the explicit type of the temporary fval variable to (64-bit) double, which will not lose precision.

Fixes pandas-dev#57051

QuLogic added a commit to QuLogic/pandas that referenced this pull request

Mar 27, 2024

In Pandas 1.5.3, the float(val) cast was inline to the cast_from_unit call in array_with_unit_to_datetime. This caused the intermediate (unnamed) value to be a Python float.

So widen the explicit type of the temporary fval variable to (64-bit) double, which will not lose precision.

Fixes pandas-dev#57051

mroeschke pushed a commit that referenced this pull request

Mar 27, 2024

In Pandas 1.5.3, the float(val) cast was inline to the cast_from_unit call in array_with_unit_to_datetime. This caused the intermediate (unnamed) value to be a Python float.

Since #50301, a temporary variable was added to avoid multiple casts, but with explicit type cdef float, which defines a Cython float. This type is 32-bit, and causes a loss of precision, and a regression in parsing from 1.5.3.

So widen the explicit type of the temporary fval variable to (64-bit) double, which will not lose precision.

Fixes #57051

pmhatre1 pushed a commit to pmhatre1/pandas-pmhatre1 that referenced this pull request

May 7, 2024

…as-dev#57548)

In Pandas 1.5.3, the float(val) cast was inline to the cast_from_unit call in array_with_unit_to_datetime. This caused the intermediate (unnamed) value to be a Python float.

So widen the explicit type of the temporary fval variable to (64-bit) double, which will not lose precision.

Fixes pandas-dev#57051