Add optional orjson support for faster json reading and writing by ashleysommer · Pull Request #2854 · RDFLib/rdflib

This adds optional support for orjson. It can be enabled by installing rdflib with the pip extras syntax (rdflib[orjson]) or the poetry extras syntax (--extras orjson). Alternatively, it will be detected and used automatically if you simply install orjson>=3.9.14 in your Python environment.
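As a rough illustration of the "detected and used if installed" behaviour, here is a minimal sketch of an optional-import fallback. The names HAS_ORJSON, dumps_bytes and loads_any are hypothetical and are not taken from this PR; the sketch also skips the >=3.9.14 version check.

```python
import json
from typing import Any

try:
    import orjson  # optional dependency, only used when installed

    HAS_ORJSON = True
except ImportError:
    orjson = None
    HAS_ORJSON = False


def dumps_bytes(obj: Any) -> bytes:
    """Serialize obj to UTF-8 JSON bytes, preferring orjson when available."""
    if HAS_ORJSON:
        # orjson.dumps returns bytes directly, with compact separators.
        return orjson.dumps(obj)
    # The stdlib json module returns str, so encode it explicitly.
    return json.dumps(obj).encode("utf-8")


def loads_any(data: Any) -> Any:
    """Parse JSON from str or bytes with whichever library is available."""
    if HAS_ORJSON:
        return orjson.loads(data)
    return json.loads(data)


payload = dumps_bytes({"head": {"vars": ["s"]}})
roundtrip = loads_any(payload)
```

Note that the two backends can differ in whitespace (orjson emits compact output), so callers should compare parsed values rather than raw bytes.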

This PR touches a lot of files, because JSON is used in a surprising number of places in rdflib.

There are also some tangential, non-JSON-related changes to stream handling in a number of other SPARQLResult serializers. While implementing orjson support for the sparql-results-json serializer, I found errors in the way all of the different Sparql-Results-Serializers treat TextIO and BinaryIO streams. These were causing 7 errors to be thrown by the rdflib serializer tests, but the errors were marked as ignored in the test suite.

These additional changes include much better typing for the Sparql-Results-Serializer subclasses, which exposed where the problems were (hooray for typed Python exposing actual errors). Fixes were made to all of the failing Sparql-Results-Serializer subclasses, and there are no skipped tests now. This also allowed the removal of a number of mypy type: ignore comments that were in place to silence the complaining type checker.
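To make the TextIO versus BinaryIO distinction concrete, here is a minimal sketch of a writer that handles both kinds of stream. The helper write_json is hypothetical and is not the code from this PR; it only shows the general idea that text streams take str while binary streams take encoded bytes.

```python
import io
import json
from typing import IO, Any


def write_json(stream: IO[Any], obj: Any, encoding: str = "utf-8") -> None:
    """Hypothetical helper: write obj as JSON to a text or binary stream."""
    text = json.dumps(obj)
    if isinstance(stream, io.TextIOBase):
        # Text streams (e.g. io.StringIO, files opened with "w") accept str.
        stream.write(text)
    else:
        # Binary streams (e.g. io.BytesIO, files opened with "wb") need bytes.
        stream.write(text.encode(encoding))


text_buf = io.StringIO()
write_json(text_buf, {"a": 1})

bytes_buf = io.BytesIO()
write_json(bytes_buf, {"a": 1})
```

Writing a str to a binary stream, or bytes to a text stream, raises a TypeError at runtime. That is the class of mistake that stricter type annotations on the stream parameter can surface earlier.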

I know it would be great to move all those additional changes to a different PR, but there are two reasons I didn't:

  1. The addition of the orjson feature relies on those typing changes and BinaryIO stream fixes.
  2. The sparql-results-serializer fixes for the sparql-results-json (SparqlResultsJson) subclass specifically are too entangled with the orjson feature to be extracted easily.

Fixes #2784