rdflib.Literals need a well-formed / ill-formed status flag · Issue #848 · RDFLib/rdflib

I encountered this issue while working on pySHACL.
Specifically, this bug causes several tests in the standard data-shapes-test-suite to fail, including or-datatypes-001.ttl and datatype-ill-formed.ttl.
These tests rely on the validator treating literals such as "none"^^xsd:boolean and "300"^^xsd:byte as ill-formed. When a rule such as sh:datatype checks that a property value is a well-formed Literal of type xsd:boolean or xsd:byte (respectively), the check should fail.
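To make "ill-formed" concrete, here is a minimal sketch of a lexical-space check for the two datatypes mentioned above. It is plain Python and does not use rdflib; the function name `is_well_formed` is hypothetical, and whitespace collapsing is deliberately not modelled.

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"

# Lexical forms XSD allows for xsd:boolean.
_BOOLEAN_FORMS = {"true", "false", "1", "0"}
# Optional sign followed by ASCII digits, as for xsd:integer and its subtypes.
_INTEGER_RE = re.compile(r"[+-]?[0-9]+")


def is_well_formed(lexical: str, datatype: str) -> bool:
    """Return True if `lexical` is in the lexical space of `datatype`.

    Hypothetical helper: only xsd:boolean and xsd:byte are handled.
    """
    if datatype == XSD + "boolean":
        return lexical in _BOOLEAN_FORMS
    if datatype == XSD + "byte":
        # xsd:byte is an integer restricted to the range -128..127.
        if _INTEGER_RE.fullmatch(lexical) is None:
            return False
        return -128 <= int(lexical) <= 127
    raise NotImplementedError(datatype)


print(is_well_formed("none", XSD + "boolean"))  # False
print(is_well_formed("300", XSD + "byte"))      # False
print(is_well_formed("true", XSD + "boolean"))  # True
print(is_well_formed("-128", XSD + "byte"))     # True
```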

Currently, "none"^^xsd:boolean is parsed to a Literal with value=False and datatype=xsd:boolean, and "300"^^xsd:byte is parsed to a Literal with value=int(300) and datatype=xsd:byte, so the validation checks that should fail actually pass.

An ideal solution would be to check, at Literal-creation time and before converting the lexical value to a Python value, whether the literal is ill-formed, store the result as an ill_formed flag on the Literal itself, and then do the conversion as normal.
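A rough sketch of that idea, in plain Python rather than rdflib's actual code: `SketchLiteral`, `_RULES`, and `conforms_to_datatype` are all hypothetical names, and the lenient converters are assumptions chosen only to mirror the behaviour described above (value=False for "none", value=300 for "300").

```python
import re
from typing import Any, Callable, Dict, Tuple

XSD_BOOLEAN = "http://www.w3.org/2001/XMLSchema#boolean"
XSD_BYTE = "http://www.w3.org/2001/XMLSchema#byte"

_INT = re.compile(r"[+-]?[0-9]+")

# datatype IRI -> (strict lexical check, lenient lexical-to-Python converter)
_RULES: Dict[str, Tuple[Callable[[str], bool], Callable[[str], Any]]] = {
    XSD_BOOLEAN: (
        lambda s: s in {"true", "false", "1", "0"},
        lambda s: s.lower() in {"true", "1"},
    ),
    XSD_BYTE: (
        lambda s: _INT.fullmatch(s) is not None and -128 <= int(s) <= 127,
        int,
    ),
}


class SketchLiteral:
    """Hypothetical stand-in for a Literal that records well-formedness."""

    def __init__(self, lexical: str, datatype: str) -> None:
        check, convert = _RULES[datatype]
        self.lexical = lexical
        self.datatype = datatype
        # Step 1: decide well-formedness from the lexical form alone.
        self.ill_formed = not check(lexical)
        # Step 2: convert as before, regardless of the flag.
        try:
            self.value = convert(lexical)
        except ValueError:
            self.value = None


def conforms_to_datatype(lit: SketchLiteral, datatype: str) -> bool:
    """What an sh:datatype-style check could look like with the flag."""
    return lit.datatype == datatype and not lit.ill_formed


lit = SketchLiteral("300", XSD_BYTE)
print(lit.value, lit.ill_formed, conforms_to_datatype(lit, XSD_BYTE))
```

With the flag stored on the literal, a validator can keep using the converted value where convenient while still rejecting the literal in datatype checks.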