Add configurable processing limits for JSON parser (StreamReadConstraints) (FasterXML/jackson-core#637)
(note: related to/inspired by FasterXML/jackson-databind#2816)
Some aspects of JSON input documents are prone to abuse: a malicious sender can craft documents specifically designed to overload a server. This includes things like:
- Ridiculously deeply nested documents: it only takes 2 characters to open an Array context, so an 8 kB document can have 4,000 levels of nesting -- implemented via "Add StreamReadConstraints.maxNestingDepth() to constraint max nesting depth (default: 1000)" #943
- Very long documents (in general): while the streaming decoder has no problem with length per se, higher-level databind can easily exhaust memory even with fairly normal amplification (that is, 10 megabytes of input content can result in a 100-megabyte data structure) -- implemented via "Add configurable limit for the maximum number of bytes/chars of content to parse before failing" #1046
- Very long names: Jackson's name decoding is optimized for short names, and while it can handle any length (within the bounds of available memory), performance characteristics are not great beyond, say, a couple of thousand characters -- implemented via "Add configurable limit for the maximum length of Object property names to parse before failing (default max: 50,000 chars)" #1047
- Very long numbers: numbers whose textual representation runs to tens of thousands of characters can be problematic for some uses -- implemented via "Add numeric value size limits via StreamReadConstraints (fixes sonatype-2022-6438) -- default 1000 chars" #827
- (possibly?) Huge number of properties per JSON Object: for some use cases, constructing data structures with tens or hundreds of thousands of distinct keys can be problematic -- not implemented, no immediate plans; a new issue can be created if this is still desired (ditto for big Arrays)
- Extra-long text segments (megabytes of content) can similarly become problematic -- this could also result from broken encoding (a missing closing quote) -- implemented via "Add StreamReadConstraints limit for longest textual value to allow (default: 5M in 2.15.0; 20M in 2.15.1)" #863
Although the streaming parser can typically handle many of these cases quite well, they can be very problematic for higher-level processing -- and even for streaming, when processing is highly parallel.
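The linked issues above eventually shipped (in jackson-core 2.15/2.16) as a builder-style StreamReadConstraints API. A minimal sketch of setting each of the limits mentioned above, using the method names as implemented; the values shown are the documented defaults:

```java
import com.fasterxml.jackson.core.StreamReadConstraints;

public class ConstraintsSketch {
    public static void main(String[] args) {
        // One builder setting per abuse vector above; values are the
        // documented defaults (a negative maxDocumentLength means no limit).
        StreamReadConstraints constraints = StreamReadConstraints.builder()
                .maxNestingDepth(1000)        // #943
                .maxDocumentLength(-1L)       // #1046 (unlimited by default)
                .maxNameLength(50_000)        // #1047
                .maxNumberLength(1000)        // #827
                .maxStringLength(20_000_000)  // #863 (5M in 2.15.0; 20M since 2.15.1)
                .build();
        System.out.println(constraints);
    }
}
```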
So. It would be good to create a configurable set of options that:
- Default to a safe set of limits for likely problematic cases (like limiting nesting to what is known to typically fit within stack frames; limiting maximum property name length)
- Leave more speculative limits (like text length) unlimited (or very high)
- Offer a simple way to configure limits (possibly only per JsonFactory, although it'd be really nice if per-parser overrides were possible) -- a per-factory sketch follows below
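For that last point, the API as eventually implemented attaches constraints per JsonFactory, so every parser the factory creates inherits them. A short sketch (method names per jackson-core 2.15+; the tightened depth limit is chosen here purely for illustration):

```java
import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.StreamReadConstraints;

public class FactorySketch {
    public static void main(String[] args) throws Exception {
        // Constraints attach to the factory; all parsers it creates inherit
        // them. Exceeding a limit fails the parse with a
        // StreamConstraintsException instead of exhausting stack or memory.
        JsonFactory factory = JsonFactory.builder()
                .streamReadConstraints(StreamReadConstraints.builder()
                        .maxNestingDepth(100) // tighter than the 1000 default
                        .build())
                .build();
        try (JsonParser p = factory.createParser("[[[[[]]]]]")) {
            while (p.nextToken() != null) { } // depth 5, well under the limit
        }
    }
}
```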
Further reading: some relevant links on related material:
- https://github.com/zio/zio-json#security outlines some general problem areas (overlapping with some of the points above)