BUG: fix tzaware dataframe transpose bug by jbrockmendel · Pull Request #26825 · pandas-dev/pandas (original) (raw)

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation38 Commits23 Checks0 Files changed

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

This allows us to get rid of a bunch of xfails and FIXMEs in arithmetic tests. Changes in groupby are the most likely to need attention; this is not an area of the code I know well.

Not sure if TestTranspose belongs somewhere else. Suggestions welcome.

Split a couple of too-widely-scoped groupby tests.

Will follow-up with issues this closes, whatsnew entry, and GH references added to tests.

closes BUG: Pandas cannot create DataFrame from Numpy Array of TimeStamps #13287
tests added / passed
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

@@ -160,7 +160,29 @@ def init_ndarray(values, index, columns, dtype=None, copy=False):
# on the entire block; this is to convert if we have datetimelike's
# embedded in an object type
if dtype is None and is_object_dtype(values):
values = maybe_infer_to_datetimelike(values)

if values.ndim == 2 and values.shape[0] != 1:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is much more messy, can we change something else to make this nicer?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really. I'm looking into the other places where maybe_infer_to_datetimelike is used to see if some of this can go into that. We could separate this whole block into a dedicated function. But one way or another we need to bite the bullet.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so the inside of list loop should be in pandas.core.dtypes.cast, no? (obviously up until you make the blocks themselves)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to leave this for the next pass when I'm taking a more systematic look at maybe_infer_to_datetimelike

Updated with requested edit, GH references in tests, and whatsnew.

@@ -1664,3 +1655,36 @@ def _normalize_keyword_aggregation(kwargs):
order.append((column,
com.get_callable_name(aggfunc) or aggfunc))
return aggspec, columns, order


def _recast_datetimelike_result(result):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so i would put this in pandas.core.dtypes.cast, our dumping ground for casting, right after / before maybe_infer_to-datetimelike (and you can de-privatize)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really kludgy code (and is replacing equally kludgy code; kind of like turtles all the way down). I'd rather keep it close to its only usage and hope it is ripped out by someone who knows this part of the code better

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as i said this look a lot like maybe_convert_objects, try to use that here

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note also: we only have 5 tests that go through this path


Notes
-----
- Assumes Groupby._selected_obj has ndim==2 and at least one

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this note doesn't seem relevant as you are passing in frame right?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That wasn't obvious to me bc were talking about the dimensions of two separate objects. Are they necessarily the same?

@@ -160,7 +160,29 @@ def init_ndarray(values, index, columns, dtype=None, copy=False):
# on the entire block; this is to convert if we have datetimelike's
# embedded in an object type
if dtype is None and is_object_dtype(values):
values = maybe_infer_to_datetimelike(values)

if values.ndim == 2 and values.shape[0] != 1:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so the inside of list loop should be in pandas.core.dtypes.cast, no? (obviously up until you make the blocks themselves)


from pandas.core.internals.blocks import make_block

# TODO: What about re-joining object columns?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls reuse the block creation routines below

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

attempts so far have broken everything. do you have a particular routine in mind?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what I mean is you can remove the create_block_manager_from_blocks and let it fall thru to 184 with I think a very small change, e.g.


if .....
     blocks = bvals
else:
     dvals = .......
      blocks = [dvals]

of course pls use a longer name than dvals

# unnecessary if we ever allow 2D DatetimeArray

dvals_list = [maybe_infer_to_datetimelike(values[n, :])
for n in range(len(values))]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make this a list comprehension & use enumerate if you must., this is very hard to read.

haven’t had a chance to relook


from pandas.core.internals.blocks import make_block

# TODO: What about re-joining object columns?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what I mean is you can remove the create_block_manager_from_blocks and let it fall thru to 184 with I think a very small change, e.g.


if .....
     blocks = bvals
else:
     dvals = .......
      blocks = [dvals]

of course pls use a longer name than dvals

I think comments have been addressed.

This was referenced

Jan 19, 2021