ENH: Python parser now accepts delim_whitespace=True by gfyoung · Pull Request #12958 · pandas-dev/pandas (original) (raw)

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation25 Commits1 Checks0 Files changed

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

gfyoung

Title is self-explanatory.

@jreback

can you confirm with #12686 that all options are covered?

@jreback

I think doc-string and io.rst need small edits for this (to remove that this is only for c engine)

@gfyoung

I don't quite understand your question. Could you clarify?

Ah, yes, good point. Need to change what I just added since it's supported for Python now!

jreback

@@ -2867,6 +2841,103 @@ def test_read_only_header_no_rows(self):
df = self.read_csv(StringIO('a,b,c'), index_col=False)
tm.assert_frame_equal(df, expected)
def test_line_comment(self):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the net affect of moving all of these tests around?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better coverage. Now all of the engines are getting tested under all of the conditions for these delim_whitespace-related ones. The tests in general are arranged somewhat strangely, so at some point a refactor of the entire test suite will be needed, but not here of course 😄

@jreback

I meant we are now listing all of the options in that issue so that we can check them off. And you have them listed in the variables to check for the options, just want a cross check.

bonus points if you can autogenerate a section in the doc-string which explicity lists the unsupported options (by using those variables). Or maybe don't need to auto-generate, maybe just list a section in doc-string & io.rst to at least list the un-described / only certain engine supported options.

@gfyoung

Well, they used to be a copy of _c_parser_defaults, which I would consider to be the "correct" version until demonstrated otherwise (Python will raise if any of these are passed with a python engine specified). So in fact the list for C-only features in #12686 should be the one I created in this PR, which it is not.

@jreback

@gfyoung if you can post a correct list I will update

@gfyoung

FYI, Travis has just given the green light for my changes, so if you don't have any other code-related complaints, I'll update the documentation with [ci skip].

@gfyoung

as_recarray na_filter compact_ints use_unsigned low_memory memory_map buffer_lines error_bad_lines warn_bad_lines dtype decimal float_precision

If any of these are non-default with a python engine, the parser will raise.

@jreback

ok I edited the other issue, pls have a look

@gfyoung

Are those uncapitalised headers (e.g. "undocumented in C engine only") sub sections for "Features supported in the C engine only"? If not, all of the ones that I listed just now should also go under "Features supported in the C engine only"

jreback

@@ -101,8 +101,7 @@ delim_whitespace : boolean, default False
Specifies whether or not whitespace (e.g. ``' '`` or ``'\t'``)
will be used as the delimiter. Equivalent to setting ``sep='\+s'``.
If this option is set to True, nothing should be passed in for the
``delimiter`` parameter. This parameter is currently supported for
the C parser only.
``delimiter`` parameter.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add that this was added in 0.18.1

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gfyoung if you can update this, and lmlk when you push.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@gfyoung

@jreback

@gfyoung gfyoung deleted the delim-whitespace-python branch

April 22, 2016 20:12

2 participants

@gfyoung @jreback