pandas (original) (raw)

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation43 Commits14 Checks0 Files changed

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

EDIT2: reverted import-related clean-up due to #22730 (comment)

EDIT: after review by @WillAyd, the fixturization was split out to #22735, #22736 and #22738. Now contains only the following clean-ups, also for #22733:

Replace instances of pd. with direct imports (from review TST/CLN: series.duplicated; parametrisation; fix warning #21899 (comment)):

don’t import pd, directly import instead
Replace direct use of assert_..._equal with the tm.assert_..._equal
(from review TST/CLN: series.duplicated; parametrisation; fix warning #21899 (comment)):

use tm; don’t import assert_series_equal
Clean up usage of exp/res (from review TST/CLN: Fixturize frame/test_analytics #22733 (comment)):

Here you can use result and expected in spite of the existing code using exp
Unify references to github issues

Original post below. [END EDIT]

3 modules of TST/CLN: remove TestData from frame-tests; replace with fixtures #22471
tests added / passed
passes git diff upstream/master -u -- "*.py" | flake8 --diff

I split the commits per module, and (where possible) into fixturizing and cleaning up the imports. Other notes:

For test_apply, I added an another often-used fixture
In translating the quasi-fixtures from TestData to conftest in TST/CLN: break up & parametrize tests for df.set_index #22236, I sorted the dtypes for the columns of mixed_float_frame and mixed_int_frame, which turns out to have been a mistake. This is reverted here to be a true translation of the attribute of TestData. Otherwise, tests in the newly fixturized test_arithmetic.py would fail.

En passant, I also unified GH issue references.

This was referenced

Sep 17, 2018

Can't figure out ATM why pytest cannot find some of the fixtures. Everything works fine locally...

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's quite a few orthogonal changes mixed in here which I don't necessarily disagree with but as always it's much easier to logically separate the changes being made. Please keep this one focused on just implementing fixtures.

Cleaning up name-spacing and comments are fine but if you want to do them should be done in separate PRs dedicated to that task. If you can remove from this change will take another look

@@ -70,9 +82,10 @@ def mixed_float_frame():
Columns are ['A', 'B', 'C', 'D'].
"""
df = DataFrame(tm.getSeriesData())
df.A = df.A.astype('float16')
df.A = df.A.astype('float32')

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reasoning for changing these?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See OP:

In translating the quasi-fixtures from TestData to conftest in #22236, I sorted the dtypes for the columns of mixed_float_frame and mixed_int_frame, which turns out to have been a mistake. This is reverted here to be a true translation of the attribute of TestData. Otherwise, tests in the newly fixturized test_arithmetic.py would fail.

@@ -356,7 +356,7 @@ def test_axis_aliases(self):
assert_series_equal(result, expected)

def test_class_axis(self):
# https://github.com/pandas-dev/pandas/issues/18147
# GH 18147

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally still ask that you try to minimize your changes as much as possible. Obviously changing comments isn't going to affect any of the code base but it still makes the diffs of your PRs unnecessarily larger, which makes reviews more difficult.

You should always look for logical separation points for smaller PRs, i.e. if you really want to clean up these comments then make a separate PR for that rather than bundling here. For this change please revert this file to master

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO these are trivial changes that don't get done otherwise, and what's more I did get asked to do this in several PRs, so I just do it preemptively.

However, I separated the commits between fixturizing and import-cleaning, so you can review the separate commits (almost) as separate PRs.

result = df.apply(lambda x: x, axis=1)
assert_frame_equal(result, df)
tm.assert_frame_equal(result, df)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as before - I like the explicit namespacing and agree it would be good to make that consistent in this module, but better done in a separate, contained PR

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above as well:

IMO these are trivial changes that don't get done otherwise, [...] However, I separated the commits between fixturizing and import-cleaning, so you can review the separate commits (almost) as separate PRs.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review. I separated the commits into clearly distinct parts for easier reviewing.

@@ -70,9 +82,10 @@ def mixed_float_frame():
Columns are ['A', 'B', 'C', 'D'].
"""
df = DataFrame(tm.getSeriesData())
df.A = df.A.astype('float16')
df.A = df.A.astype('float32')

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See OP:

In translating the quasi-fixtures from TestData to conftest in #22236, I sorted the dtypes for the columns of mixed_float_frame and mixed_int_frame, which turns out to have been a mistake. This is reverted here to be a true translation of the attribute of TestData. Otherwise, tests in the newly fixturized test_arithmetic.py would fail.

@@ -356,7 +356,7 @@ def test_axis_aliases(self):
assert_series_equal(result, expected)

def test_class_axis(self):
# https://github.com/pandas-dev/pandas/issues/18147
# GH 18147

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO these are trivial changes that don't get done otherwise, and what's more I did get asked to do this in several PRs, so I just do it preemptively.

However, I separated the commits between fixturizing and import-cleaning, so you can review the separate commits (almost) as separate PRs.

result = df.apply(lambda x: x, axis=1)
assert_frame_equal(result, df)
tm.assert_frame_equal(result, df)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above as well:

IMO these are trivial changes that don't get done otherwise, [...] However, I separated the commits between fixturizing and import-cleaning, so you can review the separate commits (almost) as separate PRs.

Even if things were done in separate commits it's still makes things harder than it needs to be to have to step through them, especially as changes / edits get made over time. Again if you could do away with anything not directly relevant to "fixturization" would be very helpful and can take another look after that

This was referenced

Sep 17, 2018

@WillAyd It's something I'll take into account for future PRs. I make best-effort attempts considering the information I've gotten so far; your review contradicts that somewhat. Nevertheless, I've (grudgingly) split off #22735 and #22736.

as a general rule when changing tests

change only tests (no code)
if moving things - separate PR, don’t mix moving and changing

smaller is better - test changes have to be carefully scrutinized to make sure things are not breaking

@jreback

Now as small and focused as possible after split into #22735 #22736 #22738. Will keep this PR around for doing the import clean-ups after merge.

h-vetinari changed the title~~TST/CLN: Fixturize frame tests~~ CLN: imports etc. in frame tests

Sep 23, 2018

This is now based on #22733, #22735, #22736 and #22738. The commits are nice and modular so reviewing should be easy (once the remaining two PRs are merged).

Before we do any more of this

Replace instances of pd. with direct imports (from review #21899 (comment)):
don’t import pd, directly import instead

Replace direct use of assert_...equal with the tm.assert..._equal
(from review #21899 (comment)):
use tm; don’t import assert_series_equal

I would like alint rule to see what is what (and basically list the offenders as exclusions for now).

We have been changing things like this for a long time and I think we somtimes change them back (e.g. to use the imports of assert_*, rather than tm., and to use the pd.* rather than the direct imports.

Please remove these for now (or split out to a separate PR).

@jreback

I would like alint rule to see what is what (and basically list the offenders as exclusions for now).

There just needs to be a clear directive - also to write a lint rule. Furthermore, as far as I can tell from the test modules I've seen, almost all would fail both versions of that lint rule.

There just needs to be a clear directive - also to write a lint rule. Furthermore, as far as I can tell from the test modules I've seen, almost all would fail both versions of that lint rule.

of course, but that is exactly the point. let's see which iway is more common, then have a lint rule to enforce it, rather than ad-hoc conversions (which has been the policy up till now).

of course, but that is exactly the point. let's see which iway is more common, then have a lint rule to enforce it, rather than ad-hoc conversions (which has been the policy up till now).

Fair enough, but that's above my pay grade. Reverting the import-related commits.

can you limit the scope of this PR as it seems to overlap greating with your other one on test_analytics. I prefer PR's that change 1 thing generally; these will be reviewed an merged much faster that longer complicated ones.

@jreback

can you limit the scope of this PR as it seems to overlap greating with your other one on test_analytics.

See OP and #22730 (comment):

[...] fixturization was split out to #22735, #22736 and #22738. Now contains only the following clean-ups, also for #22733:

Better to review the parts for test_analytics and test_arithmetic after #22733 and #22736.

@WillAyd @jreback @gfyoung

Ready again to review. Also, did someone maybe change (reduce?) my GH privileges. I can't merge fixed conflicts anymore, and can't comment on a PR without simultaneously closing it.

Ready again to review. Also, did someone maybe change (reduce?) my GH privileges. I can't merge fixed conflicts anymore, and can't comment on a PR without simultaneously closing it.

hah, not possible.

I can't merge fixed conflicts anymore,

what does this mean?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tiny issue, ping on green.

# GH PR #22298
df = pd.DataFrame(np.random.normal(size=(10, 2)))
# GH 22298
df = pd. DataFrame(np.random.normal(size=(10, 2)))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

space added here

I can't merge fixed conflicts anymore,
what does this mean?

I fixed the merge conflict in the github UI, but there was no button to actually merge them. I've already restarted my browser and everything, but my "comment" button has gone (leaving only "close and comment"). Really annoying.

I can't merge fixed conflicts anymore,

what does this mean?

EDIT: and also, I can't edit posts anymore.

@h-vetinari I am surprised you use the github UI to do anything. I have no idea what controls are on that, generally much better to use the command line.

@h-vetinari I am surprised you use the github UI to do anything. I have no idea what controls are on that, generally much better to use the command line.

For tiny conflicts (handful of lines) it's quicker for me. Rest I do manually or in IDE. However, you might wanna click on a SHA hash in one of these threads sometimes, especially if there's many commits - it allows you to easily browse through the commits one by one (which is how I split up #22733).

tm9k1 pushed a commit to tm9k1/pandas that referenced this pull request

Nov 19, 2018

Labels

Clean Testing

pandas testing functions or related to the test suite