[DRAFT] Deep nesting check by pjfanning · Pull Request #926 · FasterXML/jackson-core (original) (raw)
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Conversation20 Commits16 Checks0 Files changed
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
[ Show hidden characters]({{ revealButtonHref }})
- a proposal for how we might police the depth f input docs
- if this approach is generally ok, I can add more javadoc and refactor the new test
@@ -1547,4 +1549,19 @@ protected void loadMoreGuaranteed() throws IOException { |
---|
// Can't declare as deprecated, for now, but shouldn't be needed |
protected void _finishString() throws IOException { } |
protected final void createChildArrayContext(final int lineNr, final int colNr) { |
_streamReadConstraints.validateDepth(++_depth); |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be good to measure performance impact; if there is some, we could instead read max depth into local setting, inline check (that is, method local to parser).
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes - I'll set up some benchmark testing
Ah. So far I had been thinking of storing the depth layer in actual parsing context object (since it's always +1 of parent).
But it is true that we could instead just hold on to dynamic nesting in parser/generator, and avoid extra storage.
There is slightly higher risk of missed updates (everywhere where child context created/parent referenced we MUST update depth) but maybe it's worth it.
I think this approach makes sense; added some comments.
* |
---|
* @since 2.15 |
*/ |
public Builder maxDepth(final int maxDepth) { |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maxDepth
could work; another possibility maxNesting
. Not sure what terminology is most commonly used with JSON parsers.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed this to maxNestingDepth
@cowtowncoder I've done a little benchmark testing using https://github.com/pjfanning/jackson-nest-bench - the results are a little erratic. I don't really have dedicated hardware for doing accurate benchmark testing. I'll do some more runs.
One side issue is that I think I made mistake with using IllegalStateException in StreamReadConstraints. I used it in the validateString method and am using for consistency on the method for validating the nesting depth. Would it be ok to use a checked exception (one that does extend RuntimeException)? Maybe, JsonParseException. validateString is new in jackson 2.15. Changing the exception type will break some code in jackson-databind and the jackson-dataformat* jars but can be quickly fixed.
@cowtowncoder I've done a little benchmark testing using https://github.com/pjfanning/jackson-nest-bench - the results are a little erratic. I don't really have dedicated hardware for doing accurate benchmark testing. I'll do some more runs.
Ok. I should probably try building and running on my end too, on my desktop, to get another datapoint.
One side issue is that I think I made mistake with using IllegalStateException in StreamReadConstraints. I used it in the validateString method and am using for consistency on the method for validating the nesting depth. Would it be ok to use a checked exception (one that does extend RuntimeException)? Maybe, JsonParseException. validateString is new in jackson 2.15. Changing the exception type will break some code in jackson-databind and the jackson-dataformat* jars but can be quickly fixed.
Ah yes, I have been thinking about this. I think that IllegalStateException
is not optimal and we should throw a JsonParseException
instead, yes.
For all violations.
Looks like it would be difficult in Jackson v2 to use a checked exception, like JsonParseException, for the string length checks. This is because the exceptions are thrown in TextBuffer class and that class does not have declared exceptions - on methods like getContentsAsString() for instance. Maybe, we can change this in jackson-core v3?
@cowtowncoder in https://github.com/pjfanning/jackson-nest-bench, there seems no benefit in moving the validate method so that the limit is stored locally (the V2 implemention in jackson-nest-bench). For some reason, there does appear to be a little bit of a slow down in the Reader based JsonParser but not much proportionally. The byte stream parsers seem to not slow down at all.
Ok, one non-trivial concern: not all format implementations extend ParserBase
-- some use MinimalParserBase
.
For those, I think it's necessary to duplicate much of handling at layer above.
Not sure if there is any reasonable way around that, just noting the concern.
Good to know performance might not be affected: I should really try building and running jackson-benchmarks
for 2.15 with and without modifications.
The parsing context implementations seem quite different - so different that what I have put in ParserBase won't work for the use cases where ParserMinimalBase is used instead. I think I'll end up having to write specific implementations of the validations for each class that extends ParserMinimalBase.
I have moved the _nestingDepth
to ParserMinimalBase.
@pjfanning Correct, different implementations are needed for various reasons, for better or worse.
One thing I managed to do is to run preliminary checks on jackson-benchmarks
vs 2.14.2 and 2.15.0-SNAPSHOT (last full runs are from year ago for 2.13.3), and it does not look like performance of 2.15.0-SNAPSHOT before these changes is measurably slower for the benchmark(s) in question. For what that's worth.
But at least it is possible to isolate effects of changes; I can build from this branch/PR locally, see if I spot any immediate differences.
EDIT: no significant performance degradation with changes. Might have 2-3% overall reduction from 2.14 to 2.15 for json, but hard to say -- and definitely not due to nesting change, even if there is one.
2 participants