Make serialize() on a CONSTRUCT result act like normal g.serialize() · RDFLib/rdflib · Discussion #1612

I have done some digging to make sense of the IO types in Python, and thought a bit about how to actually deal with this situation.

Some related inquiries and issues, mostly related to typing:

I think that with a clean slate, the best option would have been to only accept BinaryIO-like buffers (i.e. io.RawIOBase or io.BufferedIOBase) with an optional encoding. If no encoding is supplied and the serializer supports multiple encodings, it would default to the system preferred encoding (similar to what TextIOWrapper does).

However, given that some ResultSerializers work with BinaryIO and some with TextIO, this would break compatibility. The best compromise that keeps the existing interface is probably for ResultSerializer.serialize() to accept either a BinaryIO with an optional encoding, or a TextIO without any encoding. Then, when ResultSerializer.serialize() defers to Graph.serialize(), it can take TextIO.buffer and TextIO.encoding and pass those to Graph.serialize().
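To make the compromise concrete, here is a rough sketch of the stream handling using only the standard library. The helper name `resolve_binary_stream` is hypothetical and not part of rdflib, and the utf-8 fallback is the default I am proposing rather than current behaviour. Note that not every text stream has a `buffer` (io.StringIO does not), so that case would need separate handling.

```python
import io


def resolve_binary_stream(stream, encoding=None):
    """Return a (binary_stream, encoding) pair for either kind of stream.

    Hypothetical helper, not an rdflib API.
    """
    if isinstance(stream, io.TextIOBase):
        if encoding is not None:
            # A text stream already carries its own encoding.
            raise ValueError("encoding must not be combined with a text stream")
        buffer = getattr(stream, "buffer", None)
        if buffer is None:
            # e.g. io.StringIO, which has no underlying binary buffer
            raise TypeError("text stream has no underlying binary buffer")
        # Flush pending text so later binary writes land in the right order.
        stream.flush()
        return buffer, stream.encoding or "utf-8"
    # Binary stream: use the requested encoding, falling back to utf-8 (assumed default).
    return stream, encoding or "utf-8"
```

The returned pair is what would then be handed on to Graph.serialize() as destination and encoding.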

Further, I will also try to ensure the default encoding is utf-8 throughout all encoders.

For formats that only allow one encoding (turtle), I think we should reconsider the behaviour when an unsupported encoding is requested. Either we should always support arbitrary encodings, or we should always raise an exception if an unsupported encoding is requested. The current approach is to warn if an unsupported encoding is requested, and this may result in unexpected behaviour. I will however defer changing this for now, as it can be dealt with in another PR.
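For illustration, the two behaviours could look roughly like this. This is only a sketch: `check_encoding` and its `strict` flag are hypothetical names, not existing rdflib code, and I am assuming the non-strict path falls back to the format's own encoding after warning.

```python
import codecs
import warnings


def check_encoding(requested, supported=("utf-8",), strict=True):
    """Return the normalized encoding to use, or reject the request.

    Hypothetical helper, not an rdflib API.
    """
    # Normalize aliases such as "UTF8" and "utf_8" to one canonical name.
    supported_names = [codecs.lookup(name).name for name in supported]
    requested_name = codecs.lookup(requested or supported[0]).name
    if requested_name in supported_names:
        return requested_name
    message = f"unsupported encoding {requested!r}, supported: {supported_names}"
    if strict:
        # The "always raise" option.
        raise ValueError(message)
    # The warn-and-continue option (assumed fallback: first supported encoding).
    warnings.warn(message, stacklevel=2)
    return supported_names[0]
```

With `strict=True` the caller finds out immediately that the request cannot be honoured, whereas the warning path silently produces output in a different encoding than was asked for.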