Added errors{'raise','ignore'} for keys not found in meta for json_normalize by dickreuter · Pull Request #14583 · pandas-dev/pandas (original) (raw)

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

@dickreuter

@dickreuter dickreuter commented

Nov 4, 2016

edited by jorisvandenbossche

Loading

Continuation of #14505

Added errors{'raise','ignore'} for keys not found in meta for json_normalize

@codecov-io

Current coverage is 85.26% (diff: 100%)

Merging #14583 into master will decrease coverage by <.01%

@@ master #14583 diff @@

Files 144 144
Lines 50980 50985 +5
Methods 0 0
Messages 0 0
Branches 0 0

Powered by Codecov. Last update 14e4815...701c140

@dickreuter

Cleaning it up, should be better now. Is there a new whatsnew file so I can add the comments? What's the next version number?

@dickreuter

Why is nobody merging this?

jreback

If True, prefix records with dotted (?) path, e.g. foo.bar.field if
path to records is ['foo', 'bar']
meta_prefix : string, default None
error: {'raise', 'ignore'}, default 'raise'

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need a version in 0.20.0 tag

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need a version added tag

jreback

Bug Fixes
~~~~~~~~~
- Enhancement in ``pandas.io.json.json_normalize``If meta keys are not always present a new option to set errors='ignore' has been implemented (:issue:`14505`)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can put in enhancements section

jreback

Bug Fixes
~~~~~~~~~
- Enhancement in ``pandas.io.json.json_normalize``If meta keys are not always present a new option to set errors='ignore' has been implemented (:issue:`14505`)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

space after quotes and before If. put errors='ignore' in double-backticks.

jreback

meta_val = np.nan
else:
raise KeyError(
"Try running with errors='ignore' as the following key may not always be present: " + str(e))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this won't pass linting

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not? What do I need to change?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We check for PEP8, so this line will be too long. Normally you can set your editor to check for those things, but can also see the output of the check on travis (3d build): https://travis-ci.org/pandas-dev/pandas/jobs/174180562 (at the bottom)

pandas/io/json.py:746:80: E501 line too long (85 > 79 characters)
pandas/io/json.py:853:80: E501 line too long (129 > 79 characters)
pandas/io/tests/json/test_json_norm.py:229:5: E303 too many blank lines (2)
pandas/io/tests/json/test_json_norm.py:271:80: E501 line too long (81 > 79 characters)
pandas/io/tests/json/test_json_norm.py:272:80: E501 line too long (104 > 79 characters)
pandas/io/tests/json/test_json_norm.py:273:17: E225 missing whitespace around operator
pandas/io/tests/json/test_json_norm.py:274:10: E128 continuation line under-indented for visual indent
pandas/io/tests/json/test_json_norm.py:275:10: E128 continuation line under-indented for visual indent
pandas/io/tests/json/test_json_norm.py:276:10: E128 continuation line under-indented for visual indent
pandas/io/tests/json/test_json_norm.py:277:10: E128 continuation line under-indented for visual indent

jreback

def test_json_normalise_fix(self):
# issue 14505

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

put a comment explaining what this is. rename test -> test_json_normalize_errors

jreback

'price': {0: '0', 1: '0', 2: '0', 3: '0'},
'symbol': {0: 'AAPL', 1: 'GOOG', 2: 'AAPL', 3: 'GOOG'}}
self.assertEqual(j.fillna('').to_dict(), expected)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test the same with errors='raise'

@jreback

@dickreuter we have 107 open PR's. very few people do code review. simply ping if you want someone to look.

@jorisvandenbossche

@dickreuter just so you know: we don't get notifications when you push new code, only from comments. So it's always good to put a comment after you pushed new code to indicated that you updated. And as @jreback says, pinging like you now did always helps

@dickreuter

dickreuter commented

Nov 17, 2016

edited by jorisvandenbossche

Loading

Ok great. Please have a look. I think this can now be merged.

@jorisvandenbossche

@dickreuter Did you push the changes? As there is still a merge conflict and there is no new commit since @jreback 's comments

@dickreuter

Just pushed the changes as instructed.

@dickreuter

Anything else I need to fix in this or can this be merged?

@dickreuter

Just did another rebase and pushed again. Can this be merged now? thanks

jreback

^^^^^^^^^^^^^^^^^^
- ``pd.read_excel`` now preserves sheet order when using ``sheetname=None`` (:issue:`9930`)
- ``pandas.io.json.json_normalize`` If meta keys are not always present a new option to set errors="ignore" (:issue:`14583`)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In pandas.io.json.json_normalize gained the option errors='ignore'|raise (in double-backticks); the default is raise and is backward compatible.

jreback

.. _whatsnew_0200.bug_fixes:
Bug Fixes
~~~~~~~~~

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

revert this

jreback

path to records is ['foo', 'bar']
meta_prefix : string, default None
error: {'raise', 'ignore'}, default 'raise'
* ignore: will ignore keyErrors if keys listed in meta are not always present

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add an entry for raise

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will ignore KeyError, if the keys passed in meta are not all present

jreback

meta_val = np.nan
else:
raise \
KeyError(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better comment please

jreback

def test_json_normalize_errors(self):
# If meta keys are not always present a new option to set errors='ignore' has been implemented (:issue:`14583`)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GH14583

@jreback

just some doc comments, otherwise lgtm.

@dickreuter

Implemented doc amendments as suggested. Let me know if anything else needs to be changed. thanks

jorisvandenbossche

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some PEP8 style issues, otherwise looks good!

meta_prefix : string, default None
error: {'raise', 'ignore'}, default 'raise'
* ignore: will ignore KeyError if keys listed in meta are not always present
* raise: will raise KeyError if keys listed in meta are not always present

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect those two lines above are too long (> 80, we check for PEP8, which will let travis fail)

if errors == 'ignore':
meta_val = np.nan
else:
raise KeyError("Try running with errors='ignore' as key %s is not always present.", e)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, for PEP8 line too long

def test_json_normalize_errors(self):
# GH14583: If meta keys are not always present a new option to set errors='ignore' has been implemented

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here and in other places below line length is also a problem

@jorisvandenbossche

@dickreuter You still have some style issues (that's the reason that travis is failing). You can always look in the log of the third travis build to see the issues (https://travis-ci.org/pandas-dev/pandas/jobs/182143127, scroll to the bottom):

Linting  *.py
pandas/io/json.py:750:1: W293 blank line contains whitespace
pandas/io/json.py:857:80: E501 line too long (81 > 79 characters)
pandas/io/json.py:858:80: E501 line too long (85 > 79 characters)
pandas/io/tests/json/test_json_norm.py:229:5: E303 too many blank lines (2)
pandas/io/tests/json/test_json_norm.py:266:80: E501 line too long (81 > 79 characters)
pandas/io/tests/json/test_json_norm.py:267:80: E501 line too long (87 > 79 characters)
pandas/io/tests/json/test_json_norm.py:269:17: E225 missing whitespace around operator
pandas/io/tests/json/test_json_norm.py:270:10: E128 continuation line under-indented for visual indent
pandas/io/tests/json/test_json_norm.py:271:10: E128 continuation line under-indented for visual indent
pandas/io/tests/json/test_json_norm.py:272:10: E128 continuation line under-indented for visual indent
pandas/io/tests/json/test_json_norm.py:273:10: E128 continuation line under-indented for visual indent
pandas/io/tests/json/test_json_norm.py:278:17: E128 continuation line under-indented for visual indent
pandas/io/tests/json/test_json_norm.py:278:80: E501 line too long (86 > 79 characters)
pandas/io/tests/json/test_json_norm.py:279:28: E127 continuation line over-indented for visual indent
pandas/io/tests/json/test_json_norm.py:279:80: E501 line too long (87 > 79 characters)
Linting *.py DONE

(it also a good idea to set up your IDE to warn for this)

@dickreuter

I believe this is fixed now. Some checks are still not successful but I can't find any reason why. Any suggestions why? All it says is:

The command "ci/lint.sh" exited with 1.
0.00s$ echo "script done"
script done

@jreback

git diff master | flake8 -diff

@jorisvandenbossche

@dickreuter

Linting still seems to fail, even though when I do "git diff master | flake8 --diff" I only get problems in v0.20.0.txt, all the rest is fine. Any suggestions how I can find the remaining issues?

@jorisvandenbossche

@dickreuter It was not linting this time, the mac build failed.

@dickreuter

And it is related to my code? I can't find any details.

@jreback

@dickreuter you need to rebase on master

git rebase -i origin/master then
git push -f

dickreuter added 5 commits

December 13, 2016 21:50

… meta parameter that don't always occur in every item of the list

Added documentation and test for issue #14505 Added keyword errors {'raise'|'ignore} Shortened what's new Removed commas in dictionary for linting compatibility Updated doc

@dickreuter

Now all checked have passed.

@jreback

ischurov pushed a commit to ischurov/pandas that referenced this pull request

Dec 19, 2016

@ischurov

…on_normalize

Author: dickreuter dickreuter@yahoo.com

Closes pandas-dev#14583 from dickreuter/json_normalize_enhancement and squashes the following commits:

701c140 [dickreuter] adjusted formatting 3c94206 [dickreuter] shortened lines to pass linting 2028924 [dickreuter] doc changes d298588 [dickreuter] Fixed as instructed in pull request page bcfbf18 [dickreuter] Avoids exception when pandas.io.json.json_normalize

Labels