[BUG]: Fix regression in read_table with delim_whitespace=True by phofl · Pull Request #36560 · pandas-dev/pandas (original) (raw)
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
[ Show hidden characters]({{ revealButtonHref }})
- closes BUG: read_table raises ValueError when delim_whitespace is set to True #35958
- tests added / passed
- passes
black pandas - passes
git diff upstream/master -u -- "*.py" | flake8 --diff - whatsnew entry
We could write a private function which is called from read_csv and read_table with the appropriate default_set alternatively.
Edit: Black_pandas formats files not touched by myself. Was there an update or change of guidelines?
IO issues that don't fit into a more specific label
Functionality that used to work in a prior pandas version
labels
Test failure seems unrelated
@phofl Looks like a whatsnew conflict if you can merge master
� Conflicts: � doc/source/whatsnew/v1.1.3.rst
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is correct, that this does not raise an error. But this is not a regression. This combination does not raise an error on 1.0.5. That would be a regular bugfix.
Should I add this here or make a separate PR, which won't be backported to 1.1.3?
Co-authored-by: Daniel Saxton 2658661+dsaxton@users.noreply.github.com
Following up on the comments it #36381 , I agree that the behavior still does not match the docstring. I would change the docstring to be "pass the default", not "pass nothing" which is both easier (no code changes!) and clearer (I can think of a defend-able position that '' and None are both "nothing").
That is probably a better solution. I would add this here if this is ok @simonjayhawkins ?
Yes, it is correct, that this does not raise an error. But this is not a regression. This combination does not raise an error on 1.0.5. That would be a regular bugfix.
Should I add this here or make a separate PR, which won't be backported to 1.1.3?
fair point. can either fix here or raise an issue for this. Not raising also applies to read_csv, so if done here should probably include read_csv and would probably need a second release note.
I think checking if sep has been passed can be done fairly easily with lib.no_default. If done in a separate PR could also be backported as a bugfix.
🤔 I've just re-read the docs If this option is set to True, nothing should be passed in for the delimiter parameter. so not raising is a bug. but otoh if it is not raising, and we start raising we would normally put something in the api breaking changes section of the release notes. since this is targeted for a patch release, we should probably not include anything considered an api breaking change.
so ignore my previous comments except about adding a comment to the code for the next reader.
Ok, will do that. I will open a issue about fixing this for 1.2?
phofl mentioned this pull request
3 tasks
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @phofl generally lgtm pending green
That seems to be an issue with the Pipeline. It fails consistently through the PRs
That seems to be an issue with the Pipeline. It fails consistently through the PRs
Thanks. It's was failing intermittently on master and constantly on 1.1.x. I see it's now constantly failing on master too.
😕 merged master and still arm failures. restarted.
it's timing out. happening on several PRs. no idea why.
this looks fine, just resolved conflicts.
cc @gfyoung ok here?
meeseeksmachine pushed a commit to meeseeksmachine/pandas that referenced this pull request
simonjayhawkins pushed a commit that referenced this pull request
…itespace=True (#36661)
Co-authored-by: patrick 61934744+phofl@users.noreply.github.com
kesmit13 pushed a commit to kesmit13/pandas that referenced this pull request
Labels
IO issues that don't fit into a more specific label
Functionality that used to work in a prior pandas version