[Python-Dev] XML codec?
"Martin v. Löwis" martin at v.loewis.de
Sun Nov 11 14:40:44 CET 2007
I don't know. Is an XML document ill-formed if it doesn't contain an XML declaration, is not in UTF-8 or UTF-16, but there's external encoding info?
If there is external encoding info, matching the actual encoding, it would be well-formed. Of course, preserving that information would be up to the application.
This looks good. Now we would have to extend the code to detect and replace the encoding in the XML declaration too.
I'm still opposed to making this a codec. Right - for a pure Python solution, the processing of the XML declaration would still need to be implemented.
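For illustration only, here is a rough sketch of what replacing the encoding in the XML declaration could look like as a plain function rather than a codec. The name fix_encoding and the regular expressions are assumptions made for this example, not code from the patch under discussion, and the parsing is deliberately simplified (it only looks at a declaration at the very start of the text).

```python
import re

# Simplified patterns (assumptions of this sketch, not a full XML parser).
_DECL_RE = re.compile(r"<\?xml\s[^>]*?\?>")
_VERSION_RE = re.compile(r"""version\s*=\s*(["'])[0-9.]+\1""")
_ENC_RE = re.compile(r"""(\sencoding\s*=\s*)(["'])[A-Za-z][A-Za-z0-9._\-]*\2""")


def fix_encoding(text, encoding):
    """Return text with the encoding in its XML declaration set to `encoding`.

    Hypothetical helper: text without a leading XML declaration is
    returned unchanged.
    """
    decl = _DECL_RE.match(text)
    if decl is None:
        return text
    old = decl.group(0)
    # Replace an existing encoding pseudo-attribute, keeping its quote style.
    new, count = _ENC_RE.subn(
        lambda m: m.group(1) + m.group(2) + encoding + m.group(2),
        old,
        count=1,
    )
    if count == 0:
        # No encoding given: add one right after the version pseudo-attribute.
        new = _VERSION_RE.sub(
            lambda m: m.group(0) + ' encoding="%s"' % encoding,
            old,
            count=1,
        )
    return new + text[decl.end():]
```

An application would call this on the decoded text before re-encoding it, e.g. fix_encoding(doc, "utf-8").encode("utf-8").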
I think there could be a much simpler routine to have the same effect. - if it's less than 4 bytes, answer "need more data". Can there be an XML document that is less than 4 bytes? I guess not.
No, the smallest document has exactly 4 characters (e.g. "<a/>"). However, external entities may be smaller, such as "x".
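A minimal sketch of such a detection routine, assuming the autodetection heuristics from Appendix F of the XML 1.0 recommendation. The name detect_encoding, the final flag (so that short external entities such as "x" can still be handled at end of input), and the returned codec names are assumptions of this example. It returns None for "need more data" and does not cover EBCDIC or other exotic cases.

```python
import codecs
import re

# Encoding pseudo-attribute inside an ASCII-compatible XML declaration.
_ENCODING_RE = re.compile(
    rb"""encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._\-]*)["']"""
)


def detect_encoding(data, final=False):
    """Guess the encoding of the XML bytes in `data`.

    Returns a codec name, or None if more data is needed.
    Pass final=True when no more input will arrive.
    """
    if len(data) < 4 and not final:
        return None  # need more data
    # Byte order marks; test UTF-32 first, since the UTF-32-LE BOM
    # begins with the UTF-16-LE BOM.
    if data.startswith((codecs.BOM_UTF32_BE, codecs.BOM_UTF32_LE)):
        return "utf-32"
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith((codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)):
        return "utf-16"
    # No BOM: look at how "<?" (or "<") is laid out in the first bytes.
    if data.startswith(b"\x00\x00\x00<"):
        return "utf-32-be"
    if data.startswith(b"<\x00\x00\x00"):
        return "utf-32-le"
    if data.startswith(b"\x00<\x00?"):
        return "utf-16-be"
    if data.startswith(b"<\x00?\x00"):
        return "utf-16-le"
    if data.startswith(b"<?xm"):
        # ASCII-compatible encoding: read the name from the declaration,
        # which must be complete before we can decide.
        end = data.find(b"?>")
        if end < 0:
            return "utf-8" if final else None
        match = _ENCODING_RE.search(data, 0, end)
        if match:
            return match.group(1).decode("ascii")
    # No declaration and no BOM: the default is UTF-8.
    return "utf-8"
```

External encoding info, where present, would be checked by the application before falling back to a routine like this one.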
But anyway: would a Python implementation of these two functions (detectencoding()/fixencoding()) be accepted?
I could agree to a Python implementation of this algorithm as long as it's not packaged as a codec.
Regards, Martin