Added errors{'raise','ignore'} for keys not found in meta for json_normalize by dickreuter · Pull Request #14583 · pandas-dev/pandas (original) (raw)
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
[ Show hidden characters]({{ revealButtonHref }})
dickreuter commented
•
edited by jorisvandenbossche
Loading
Continuation of #14505
Added errors{'raise','ignore'} for keys not found in meta for json_normalize
Current coverage is 85.26% (diff: 100%)
@@ master #14583 diff @@
Files 144 144
Lines 50980 50985 +5
Methods 0 0
Messages 0 0
Branches 0 0
- Hits 43470 43474 +4
- Misses 7510 7511 +1
Partials 0 0
Powered by Codecov. Last update 14e4815...701c140
Cleaning it up, should be better now. Is there a new whatsnew file so I can add the comments? What's the next version number?
Why is nobody merging this?
| If True, prefix records with dotted (?) path, e.g. foo.bar.field if |
|---|
| path to records is ['foo', 'bar'] |
| meta_prefix : string, default None |
| error: {'raise', 'ignore'}, default 'raise' |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need a version in 0.20.0 tag
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need a version added tag
| Bug Fixes |
| ~~~~~~~~~ |
| - Enhancement in ``pandas.io.json.json_normalize``If meta keys are not always present a new option to set errors='ignore' has been implemented (:issue:`14505`) |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can put in enhancements section
| Bug Fixes |
| ~~~~~~~~~ |
| - Enhancement in ``pandas.io.json.json_normalize``If meta keys are not always present a new option to set errors='ignore' has been implemented (:issue:`14505`) |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
space after quotes and before If. put errors='ignore' in double-backticks.
| meta_val = np.nan |
|---|
| else: |
| raise KeyError( |
| "Try running with errors='ignore' as the following key may not always be present: " + str(e)) |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this won't pass linting
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not? What do I need to change?
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We check for PEP8, so this line will be too long. Normally you can set your editor to check for those things, but can also see the output of the check on travis (3d build): https://travis-ci.org/pandas-dev/pandas/jobs/174180562 (at the bottom)
pandas/io/json.py:746:80: E501 line too long (85 > 79 characters)
pandas/io/json.py:853:80: E501 line too long (129 > 79 characters)
pandas/io/tests/json/test_json_norm.py:229:5: E303 too many blank lines (2)
pandas/io/tests/json/test_json_norm.py:271:80: E501 line too long (81 > 79 characters)
pandas/io/tests/json/test_json_norm.py:272:80: E501 line too long (104 > 79 characters)
pandas/io/tests/json/test_json_norm.py:273:17: E225 missing whitespace around operator
pandas/io/tests/json/test_json_norm.py:274:10: E128 continuation line under-indented for visual indent
pandas/io/tests/json/test_json_norm.py:275:10: E128 continuation line under-indented for visual indent
pandas/io/tests/json/test_json_norm.py:276:10: E128 continuation line under-indented for visual indent
pandas/io/tests/json/test_json_norm.py:277:10: E128 continuation line under-indented for visual indent
| def test_json_normalise_fix(self): |
| # issue 14505 |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
put a comment explaining what this is. rename test -> test_json_normalize_errors
| 'price': {0: '0', 1: '0', 2: '0', 3: '0'}, |
|---|
| 'symbol': {0: 'AAPL', 1: 'GOOG', 2: 'AAPL', 3: 'GOOG'}} |
| self.assertEqual(j.fillna('').to_dict(), expected) |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test the same with errors='raise'
@dickreuter we have 107 open PR's. very few people do code review. simply ping if you want someone to look.
@dickreuter just so you know: we don't get notifications when you push new code, only from comments. So it's always good to put a comment after you pushed new code to indicated that you updated. And as @jreback says, pinging like you now did always helps
dickreuter commented
•
edited by jorisvandenbossche
Loading
Ok great. Please have a look. I think this can now be merged.
@dickreuter Did you push the changes? As there is still a merge conflict and there is no new commit since @jreback 's comments
Just pushed the changes as instructed.
Anything else I need to fix in this or can this be merged?
Just did another rebase and pushed again. Can this be merged now? thanks
| ^^^^^^^^^^^^^^^^^^ |
|---|
| - ``pd.read_excel`` now preserves sheet order when using ``sheetname=None`` (:issue:`9930`) |
| - ``pandas.io.json.json_normalize`` If meta keys are not always present a new option to set errors="ignore" (:issue:`14583`) |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In pandas.io.json.json_normalize gained the option errors='ignore'|raise (in double-backticks); the default is raise and is backward compatible.
| .. _whatsnew_0200.bug_fixes: |
|---|
| Bug Fixes |
| ~~~~~~~~~ |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
revert this
| path to records is ['foo', 'bar'] |
|---|
| meta_prefix : string, default None |
| error: {'raise', 'ignore'}, default 'raise' |
| * ignore: will ignore keyErrors if keys listed in meta are not always present |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add an entry for raise
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will ignore KeyError, if the keys passed in meta are not all present
| meta_val = np.nan |
|---|
| else: |
| raise \ |
| KeyError( |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
better comment please
| def test_json_normalize_errors(self): |
| # If meta keys are not always present a new option to set errors='ignore' has been implemented (:issue:`14583`) |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GH14583
just some doc comments, otherwise lgtm.
Implemented doc amendments as suggested. Let me know if anything else needs to be changed. thanks
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some PEP8 style issues, otherwise looks good!
| meta_prefix : string, default None |
|---|
| error: {'raise', 'ignore'}, default 'raise' |
| * ignore: will ignore KeyError if keys listed in meta are not always present |
| * raise: will raise KeyError if keys listed in meta are not always present |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect those two lines above are too long (> 80, we check for PEP8, which will let travis fail)
| if errors == 'ignore': |
|---|
| meta_val = np.nan |
| else: |
| raise KeyError("Try running with errors='ignore' as key %s is not always present.", e) |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here, for PEP8 line too long
| def test_json_normalize_errors(self): |
| # GH14583: If meta keys are not always present a new option to set errors='ignore' has been implemented |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here and in other places below line length is also a problem
@dickreuter You still have some style issues (that's the reason that travis is failing). You can always look in the log of the third travis build to see the issues (https://travis-ci.org/pandas-dev/pandas/jobs/182143127, scroll to the bottom):
Linting *.py
pandas/io/json.py:750:1: W293 blank line contains whitespace
pandas/io/json.py:857:80: E501 line too long (81 > 79 characters)
pandas/io/json.py:858:80: E501 line too long (85 > 79 characters)
pandas/io/tests/json/test_json_norm.py:229:5: E303 too many blank lines (2)
pandas/io/tests/json/test_json_norm.py:266:80: E501 line too long (81 > 79 characters)
pandas/io/tests/json/test_json_norm.py:267:80: E501 line too long (87 > 79 characters)
pandas/io/tests/json/test_json_norm.py:269:17: E225 missing whitespace around operator
pandas/io/tests/json/test_json_norm.py:270:10: E128 continuation line under-indented for visual indent
pandas/io/tests/json/test_json_norm.py:271:10: E128 continuation line under-indented for visual indent
pandas/io/tests/json/test_json_norm.py:272:10: E128 continuation line under-indented for visual indent
pandas/io/tests/json/test_json_norm.py:273:10: E128 continuation line under-indented for visual indent
pandas/io/tests/json/test_json_norm.py:278:17: E128 continuation line under-indented for visual indent
pandas/io/tests/json/test_json_norm.py:278:80: E501 line too long (86 > 79 characters)
pandas/io/tests/json/test_json_norm.py:279:28: E127 continuation line over-indented for visual indent
pandas/io/tests/json/test_json_norm.py:279:80: E501 line too long (87 > 79 characters)
Linting *.py DONE
(it also a good idea to set up your IDE to warn for this)
I believe this is fixed now. Some checks are still not successful but I can't find any reason why. Any suggestions why? All it says is:
The command "ci/lint.sh" exited with 1.
0.00s$ echo "script done"
script done
git diff master | flake8 -diff
Linting still seems to fail, even though when I do "git diff master | flake8 --diff" I only get problems in v0.20.0.txt, all the rest is fine. Any suggestions how I can find the remaining issues?
@dickreuter It was not linting this time, the mac build failed.
And it is related to my code? I can't find any details.
@dickreuter you need to rebase on master
git rebase -i origin/master thengit push -f
dickreuter added 5 commits
… meta parameter that don't always occur in every item of the list
Added documentation and test for issue #14505 Added keyword errors {'raise'|'ignore} Shortened what's new Removed commas in dictionary for linting compatibility Updated doc
Now all checked have passed.
ischurov pushed a commit to ischurov/pandas that referenced this pull request
…on_normalize
Author: dickreuter dickreuter@yahoo.com
Closes pandas-dev#14583 from dickreuter/json_normalize_enhancement and squashes the following commits:
701c140 [dickreuter] adjusted formatting 3c94206 [dickreuter] shortened lines to pass linting 2028924 [dickreuter] doc changes d298588 [dickreuter] Fixed as instructed in pull request page bcfbf18 [dickreuter] Avoids exception when pandas.io.json.json_normalize