[Python-Dev] Unicode entities in XML cause problems :-(

Paul Prescod paul@prescod.net
Sat, 27 Apr 2002 13:32:58 -0700


Matthias Urlichs wrote:

Playing around with xml.dom.minidom, I noticed that this beast is perfectly able to read HTML which it can't print:

>>> import xml.dom.minidom as md
>>> d=md.parseString("bߐ"))
>>> d.writexml(sys.stdout)
...
UnicodeError: ASCII encoding error: ordinal not in range(128)

"sys.stdout" doesn't know what to do with Unicode. Wrap it in an encoder (usually UTF-8) using the codecs module.
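
As a rough sketch of that suggestion, the snippet below wraps a byte stream in a UTF-8 writer from the codecs module and hands the wrapper to writexml. The input document is an assumption (a hypothetical <foo> element holding the character reference &#2000;), since the string in the quoted message did not survive intact, and an in-memory io.BytesIO stands in for the real output stream so the result can be inspected.

```python
import codecs
import io
import xml.dom.minidom as md

# Hypothetical sample input: the element name and the character
# reference are assumptions, not the poster's original string.
doc = md.parseString("<foo>b&#2000;</foo>")

# A byte-oriented stream standing in for standard output.
raw = io.BytesIO()

# codecs.getwriter returns a StreamWriter class for the named encoding;
# instantiating it wraps the byte stream so that text written to it is
# encoded as UTF-8 on the way out.
out = codecs.getwriter("utf-8")(raw)

doc.writexml(out)

data = raw.getvalue()  # the serialized document as UTF-8 bytes
```

To write to the terminal instead, the same wrapper can be built around the underlying byte stream of standard output (sys.stdout.buffer on Python 3; on the Python 2 releases current when this thread was written, sys.stdout itself was the byte stream).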

I agree that this is a usability problem but it isn't a bug and I think you've mischaracterized the source of the problem.

Paul Prescod