Add optional orjson support for faster json reading and writing by ashleysommer · Pull Request #2854 · RDFLib/rdflib
This adds optional support for orjson, which can be enabled by installing rdflib with pip extras syntax like rdflib[orjson], or poetry extras syntax like --extras orjson. Alternatively, it will be detected and used if you simply install orjson>=3.9.14 in your Python environment.
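The "detected and used if installed" behaviour can be pictured with the usual optional-import pattern. This is only a minimal sketch, not the code in this PR: the helper names `loads` and `dumps_bytes` and the `_HAS_ORJSON` flag are hypothetical, and no version check against 3.9.14 is attempted here.

```python
import json

try:
    # Optional accelerator: only used when it is importable.
    import orjson
    _HAS_ORJSON = True
except ImportError:
    orjson = None
    _HAS_ORJSON = False


def loads(data):
    """Parse JSON from str or bytes, preferring orjson when present."""
    if _HAS_ORJSON:
        return orjson.loads(data)
    return json.loads(data)


def dumps_bytes(obj) -> bytes:
    """Serialize obj to UTF-8 encoded JSON bytes with either backend."""
    if _HAS_ORJSON:
        # orjson.dumps already returns bytes.
        return orjson.dumps(obj)
    # The stdlib returns str, so encode it explicitly.
    return json.dumps(obj, ensure_ascii=False).encode("utf-8")
```

Callers go through the two helpers and never need to know which backend is active, which is why the feature can stay optional.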
This PR touches a lot of files, because JSON is used in a surprising number of places in rdflib:
- JSON-LD Graph Serializer
- JSON-LD Graph Parser
- rdf:JSON literal support in JSON-LD documents
- Hextuples Graph Serializer (that uses a special newline-delimited JSON)
- Hextuples Graph Parser (parses JSON line by line from an ND-JSON file)
- sparql-results-json SPARQLResults Parser
- sparql-results-json SPARQLResults Serializer
- The (yet-to-be-released) GeoJSON-LD serializer
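For readers unfamiliar with newline-delimited JSON as used by the Hextuples items above, here is a rough sketch of line-by-line parsing. It is an illustration only, not rdflib's parser: the function name `iter_ndjson` is hypothetical, the sample rows are made up, and the stdlib `json` module stands in for whichever backend is active.

```python
import io
import json
from typing import Any, BinaryIO, Iterator


def iter_ndjson(stream: BinaryIO) -> Iterator[Any]:
    """Yield one parsed JSON value per non-empty line of a binary stream."""
    for raw_line in stream:
        line = raw_line.strip()
        if not line:
            # Tolerate blank lines between records.
            continue
        # json.loads accepts bytes directly, so no decode step is needed.
        yield json.loads(line)


# Made-up sample data: one JSON array per line.
sample = io.BytesIO(b'["s1", "p1", "o1"]\n\n["s2", "p2", "o2"]\n')
rows = list(iter_ndjson(sample))
```

Because each line is a complete JSON document, the file never has to be held in memory as a whole.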
There are also some tangential, non-JSON-related changes to stream handling in several other SPARQLResult serializers. While implementing orjson support for the sparql-results-json serializer, I found errors in the way the different Sparql-Results-Serializers treat TextIO and BinaryIO streams. These were causing 7 errors to be thrown by the rdflib serializer tests, but the affected tests were marked as ignored in the test suite.
These additional changes include much better typing for the Sparql-Results-Serializer subclasses, which exposed where the problems were (hooray for typed Python exposing actual errors). Fixes were made to all of the failing Sparql-Results-Serializer subclasses, and there are no skipped tests now. This also allowed the removal of a bunch of mypy type: ignore patches that were in place to silence the complaining type checker.
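To illustrate the kind of TextIO versus BinaryIO mismatch described above: a serializer that produces str cannot write to a binary stream, and one that produces bytes (as orjson does) cannot write to a text stream. The sketch below shows one way to handle both; the `write_json` helper is hypothetical and is not the fix applied in this PR.

```python
import io
import json
from typing import Any, BinaryIO, TextIO, Union


def write_json(obj: Any, stream: Union[TextIO, BinaryIO]) -> None:
    """Write obj as JSON to either a text stream or a binary stream."""
    payload: str = json.dumps(obj, ensure_ascii=False)
    if isinstance(stream, io.TextIOBase):
        # Text streams accept str only.
        stream.write(payload)
    else:
        # Binary streams accept bytes only, so encode explicitly.
        stream.write(payload.encode("utf-8"))


text_out = io.StringIO()
binary_out = io.BytesIO()
write_json({"head": {"vars": ["s"]}}, text_out)
write_json({"head": {"vars": ["s"]}}, binary_out)
```

With the `Union[TextIO, BinaryIO]` annotation in place, a type checker can flag a `write` call that passes the wrong payload type, which is the class of error the improved typing exposed.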
I know it would be great to move all those additional changes to a different PR, but there are two reasons I didn't:
- The addition of the orjson feature relies on those typing changes and BinaryIO stream fixes.
- The sparql-results-serializer fixes for the sparql-results-json (SparqlResultsJson) subclass specifically are too entangled with the orjson feature to be extracted easily.
Fixes #2784