[BUG]: Fix regression in read_table with delim_whitespace=True by phofl · Pull Request #36560 · pandas-dev/pandas (original) (raw)

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

closes BUG: read_table raises ValueError when delim_whitespace is set to True #35958
tests added / passed
passes black pandas
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

We could write a private function which is called from read_csv and read_table with the appropriate default_set alternatively.

Edit: Black_pandas formats files not touched by myself. Was there an update or change of guidelines?

phofl added IO Data

IO issues that don't fit into a more specific label

Regression

Functionality that used to work in a prior pandas version

labels

Sep 22, 2020

Test failure seems unrelated

@phofl Looks like a whatsnew conflict if you can merge master

� Conflicts: � doc/source/whatsnew/v1.1.3.rst

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@simonjayhawkins

Yes, it is correct, that this does not raise an error. But this is not a regression. This combination does not raise an error on 1.0.5. That would be a regular bugfix.

Should I add this here or make a separate PR, which won't be backported to 1.1.3?

Co-authored-by: Daniel Saxton 2658661+dsaxton@users.noreply.github.com

Following up on the comments it #36381 , I agree that the behavior still does not match the docstring. I would change the docstring to be "pass the default", not "pass nothing" which is both easier (no code changes!) and clearer (I can think of a defend-able position that '' and None are both "nothing").

That is probably a better solution. I would add this here if this is ok @simonjayhawkins ?

@simonjayhawkins

Yes, it is correct, that this does not raise an error. But this is not a regression. This combination does not raise an error on 1.0.5. That would be a regular bugfix.

Should I add this here or make a separate PR, which won't be backported to 1.1.3?

fair point. can either fix here or raise an issue for this. Not raising also applies to read_csv, so if done here should probably include read_csv and would probably need a second release note.

I think checking if sep has been passed can be done fairly easily with lib.no_default. If done in a separate PR could also be backported as a bugfix.

🤔 I've just re-read the docs If this option is set to True, nothing should be passed in for the delimiter parameter. so not raising is a bug. but otoh if it is not raising, and we start raising we would normally put something in the api breaking changes section of the release notes. since this is targeted for a patch release, we should probably not include anything considered an api breaking change.

so ignore my previous comments except about adding a comment to the code for the next reader.

Ok, will do that. I will open a issue about fixing this for 1.2?

phofl mentioned this pull request

Sep 23, 2020

3 tasks