[BUG]: Fix regression in read_table with delim_whitespace=True by phofl · Pull Request #36560 · pandas-dev/pandas (original) (raw)

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

@phofl

We could write a private function which is called from read_csv and read_table with the appropriate default_set alternatively.

Edit: Black_pandas formats files not touched by myself. Was there an update or change of guidelines?

@phofl

@phofl phofl added IO Data

IO issues that don't fit into a more specific label

Regression

Functionality that used to work in a prior pandas version

labels

Sep 22, 2020

tacaswell

@phofl

@phofl

Test failure seems unrelated

@dsaxton

@phofl Looks like a whatsnew conflict if you can merge master

@phofl

� Conflicts: � doc/source/whatsnew/v1.1.3.rst

@phofl

dsaxton

simonjayhawkins

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@phofl

@simonjayhawkins

Yes, it is correct, that this does not raise an error. But this is not a regression. This combination does not raise an error on 1.0.5. That would be a regular bugfix.

Should I add this here or make a separate PR, which won't be backported to 1.1.3?

@phofl @dsaxton

Co-authored-by: Daniel Saxton 2658661+dsaxton@users.noreply.github.com

@tacaswell

Following up on the comments it #36381 , I agree that the behavior still does not match the docstring. I would change the docstring to be "pass the default", not "pass nothing" which is both easier (no code changes!) and clearer (I can think of a defend-able position that '' and None are both "nothing").

@phofl

That is probably a better solution. I would add this here if this is ok @simonjayhawkins ?

@simonjayhawkins

@simonjayhawkins

Yes, it is correct, that this does not raise an error. But this is not a regression. This combination does not raise an error on 1.0.5. That would be a regular bugfix.

Should I add this here or make a separate PR, which won't be backported to 1.1.3?

fair point. can either fix here or raise an issue for this. Not raising also applies to read_csv, so if done here should probably include read_csv and would probably need a second release note.

I think checking if sep has been passed can be done fairly easily with lib.no_default. If done in a separate PR could also be backported as a bugfix.

simonjayhawkins

@simonjayhawkins

🤔 I've just re-read the docs If this option is set to True, nothing should be passed in for the delimiter parameter. so not raising is a bug. but otoh if it is not raising, and we start raising we would normally put something in the api breaking changes section of the release notes. since this is targeted for a patch release, we should probably not include anything considered an api breaking change.

so ignore my previous comments except about adding a comment to the code for the next reader.

@phofl

Ok, will do that. I will open a issue about fixing this for 1.2?

@phofl

@phofl

@phofl

@phofl

@phofl phofl mentioned this pull request

Sep 23, 2020

3 tasks

@phofl

simonjayhawkins

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @phofl generally lgtm pending green

@phofl

@phofl

@simonjayhawkins

@phofl

That seems to be an issue with the Pipeline. It fails consistently through the PRs

@simonjayhawkins

That seems to be an issue with the Pipeline. It fails consistently through the PRs

Thanks. It's was failing intermittently on master and constantly on 1.1.x. I see it's now constantly failing on master too.

@simonjayhawkins

@simonjayhawkins

😕 merged master and still arm failures. restarted.

@phofl

@simonjayhawkins

it's timing out. happening on several PRs. no idea why.

@jreback

@jreback

this looks fine, just resolved conflicts.

cc @gfyoung ok here?

gfyoung

@simonjayhawkins

@simonjayhawkins

meeseeksmachine pushed a commit to meeseeksmachine/pandas that referenced this pull request

Sep 26, 2020

@phofl @meeseeksmachine

simonjayhawkins pushed a commit that referenced this pull request

Sep 26, 2020

@meeseeksmachine @phofl

…itespace=True (#36661)

Co-authored-by: patrick 61934744+phofl@users.noreply.github.com

kesmit13 pushed a commit to kesmit13/pandas that referenced this pull request

Nov 2, 2020

@phofl

Labels

IO Data

IO issues that don't fit into a more specific label

Regression

Functionality that used to work in a prior pandas version