The trailing ? in <?xml version="1.0"?> emits an error

After upgrading jsoup from v1.17.1 to v1.19.1, our unit tests started to fail when parsing XML files.

Valid XML file (minimal reproducible example, not the entire file):

<?xml version="1.0"?>
<catalogs xmlns="http://acalog.com/catalog/1.0" xmlns:h="http://www.w3.org/1999/xhtml"
          xmlns:a="http://www.w3.org/2005/Atom" xmlns:xi="http://www.w3.org/2001/XInclude">
</catalogs>

The error is

Unexpected character '?' in input state [AfterAttributeValue_quoted]

Please note that there is no whitespace between " and ?> on the first line of the XML. Once whitespace is added there, no parse error is reported: the jsoup XML parser accepts a file that begins with <?xml version="1.0" ?>.
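For reference, the XML 1.0 grammar makes the whitespace before ?> optional (XMLDecl ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'), so both spellings of the declaration should be well-formed. The sketch below is one way to cross-check that with the JDK's built-in parser instead of jsoup. The class name XmlDeclCheck and the helper isWellFormed are hypothetical names used only for this illustration, and the sample document is shortened to a single namespace.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical cross-check: uses the JDK XML parser, not jsoup.
final class XmlDeclCheck {

    // Returns true if the JDK parser accepts the input as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the document.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            // Not expected for an in-memory string with default settings.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String body = "\n<catalogs xmlns=\"http://acalog.com/catalog/1.0\">\n</catalogs>\n";
        // Declaration without whitespace before ?>, as in the failing file.
        System.out.println(isWellFormed("<?xml version=\"1.0\"?>" + body));
        // Declaration with whitespace before ?>, which jsoup accepts.
        System.out.println(isWellFormed("<?xml version=\"1.0\" ?>" + body));
    }
}
```

If both calls report the input as well-formed, the declaration without the space is acceptable to a conforming parser, which suggests the error comes from how jsoup's XML parser handles the declaration.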

Usage

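import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.parser.Parser;

// XML parser that records at most one parse error
Parser parser = Parser.xmlParser().setTrackErrors(1).newInstance();
Document httpDoc = Jsoup.parse(fileContent, "", parser);
// Treat any recorded parse error as invalid XML
if (!parser.getErrors().isEmpty()) {
    throw new IllegalArgumentException(String.format("Not a valid XML. Error: %s", parser.getErrors()));
}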