BUG: Respect 'usecols' parameter even when CSV rows are uneven by gfyoung · Pull Request #12551 · pandas-dev/pandas (original) (raw)
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Conversation23 Commits1 Checks0 Files changed
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
[ Show hidden characters]({{ revealButtonHref }})
Closes #12203 by overriding the row alignment checks for both engines when the usecols
parameter is passed into read_csv
.
@gfyoung its not necessary for you to keep rebasing every time master is slightly updated.
@jreback : Oh whoops, sorry! I have a small script that updates my branches (including master
in my fork) whenever I see there are commits, and I forgot to specify to just update the master
branch.
oh ok maybe just do it manually for a while
trying to get out another rc
the issue is that Travis runs and it can get very busy
No, I perfectly understand. As a precaution, I'll add [ci skip]
to all of my open PR's, since I imagine there will be changes to be made once reviewed after which I'll remove it so that Travis can run.
# make sure that an error is still thrown |
---|
# when the 'usecols' parameter is not provided |
name = self.__class__.__name__ |
msg = "Expected \d+ fields in line \d+, saw \d+" |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is too much gymnastics here. Why is this needed? e.g. diffs in the errors from python/c parser? if that IS the case. then I'd like to change the raise for this kind of error to a ValueError
in the cparser.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well that was why I had initially created a PythonParserError
to keep it consistent with what you would see with the C engine. ValueError
though makes sense for consistency and semantic purposes. Will change.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, looking at where the CParserError
is raised, it's buried in the tokenizer.c
code, which is supposed to be for parsing errors in general. Suggestions on how to externalize this particular CParserError
?
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no its in parser.pyx
(but it inherits from ValueError
). I would actually change that, but that might cause some issues (you can check that soln though)
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, right, sorry! Do you mean just getting rid of CParserError
altogether and replacing with just ValueError
?
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you wont' be able to remove it (well shouldn't as its in the public API). But it inherites from Exception
. if you can make it inherit from ValueError
then I think we'd take that. See what happens. lmk.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough. That seems like a good compromise because ValueError
inherits from Exception
anyways. Will give it a shot.
@jreback : Travis has no problems making CParserError
inherit from ValueError
, so I think this is also good to merge if there is nothing else.
need s note in the API changes section wrt error class
Added note in the API changes. Should be good to merge if there is nothing else.
@@ -50,6 +50,12 @@ API changes |
---|
- ``CParserError`` is now a ``ValueError`` instead of just an ``Exception`` |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add the issue number for the PR here
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@jreback : Made all of the requested changes. Should be ready to merge if there is nothing else.
gfyoung deleted the usecol_long_lines branch
This was referenced
Mar 20, 2016
2 participants