[Python-Dev] Quick sum up about open() + BOM

Lennart Regebro regebro at gmail.com
Sat Jan 9 06:48:36 CET 2010


It seems to me that for the typical case of opening a file, the following is the only flow that makes sense:

    if encoding is not None:
        use encoding
    elif file has BOM:
        use BOM
    else:
        use system default
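As a rough illustration of that flow, here is a minimal sketch in Python. The function name choose_read_encoding and the idea of passing in the first few bytes of the file are hypothetical and not part of the proposal; the BOM constants come from the standard codecs module. The UTF-32 BOMs are checked before the UTF-16 ones because the UTF-32-LE BOM starts with the same two bytes as the UTF-16-LE BOM.

```python
import codecs
import locale

# Longest BOMs first: the UTF-32-LE BOM begins with the UTF-16-LE BOM.
# The codec names on the right consume the BOM while decoding.
_BOMS = (
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
)


def choose_read_encoding(head, encoding=None):
    """Pick an encoding for reading (hypothetical helper).

    head is the first few bytes of the file; encoding is what the
    caller passed explicitly, or None.
    """
    if encoding is not None:
        # An explicit encoding always wins.
        return encoding
    for bom, name in _BOMS:
        if head.startswith(bom):
            # The file announces its own encoding through the BOM.
            return name
    # No explicit encoding and no BOM: fall back to the system default.
    return locale.getpreferredencoding(False)
```

With this sketch, an explicit encoding="latin-1" is returned unchanged even if the data starts with a BOM, which matches the first branch of the flow.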

And hence an encoding='BOM' isn't needed there. Although I'm trying to come up with use cases that don't work with this, I can't. :)

BUT

When writing, though, things are not so easy. Apparently some encodings require a BOM to be written, others do not require it but allow it, and some have no byte order mark at all. So there you have to be able to choose whether to write the BOM or not. That needs either a new parameter, because you can't use encoding='BOM' since you need to specify the encoding as well, or a new method.

I would suggest a BOM parameter, and maybe a method as well.

BOM=None|True|False

Where "None" means a sane default behaviour, that is, write a BOM if the encoding requires it. "True" means write a BOM if the encoding supports it. "False" means don't write a BOM even if the encoding requires it (because I know what I'm doing).

    if 'w' in mode:  # But not 'r' or 'a'
        if BOM == True and encoding in (ENCODINGS THAT ALLOW BOM):
            write_bom = True
        elif BOM == False:
            write_bom = False
        elif BOM == None and encoding in (ENCODINGS THAT REQUIRE BOM):
            write_bom = True
        else:
            write_bom = False
    else:
        write_bom = False
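To make the decision table concrete, here is a runnable sketch of the same logic. The function name should_write_bom and the two sets ALLOW_BOM and REQUIRE_BOM are assumptions made for illustration: this message does not enumerate which encodings fall into which group, and the sketch expects encoding names spelled exactly as listed in the sets.

```python
# Assumed, illustrative sets; the post does not say which encodings
# belong in each group.
ALLOW_BOM = frozenset({
    "utf-8",
    "utf-16", "utf-16-le", "utf-16-be",
    "utf-32", "utf-32-le", "utf-32-be",
})
# Assumption: without an explicit byte order, a BOM is needed to
# tell the reader which order was used.
REQUIRE_BOM = frozenset({"utf-16", "utf-32"})


def should_write_bom(mode, encoding, bom=None):
    """Decide whether a BOM is written when opening a file (hypothetical)."""
    if "w" not in mode:
        # Only a fresh write gets a BOM, not 'r' or 'a'.
        return False
    if bom is True:
        # Write a BOM whenever the encoding supports one.
        return encoding in ALLOW_BOM
    if bom is False:
        # The caller explicitly opted out.
        return False
    # bom is None: write a BOM only if the encoding requires it.
    return encoding in REQUIRE_BOM
```

For example, should_write_bom("w", "utf-8") gives False under these assumed sets, while should_write_bom("w", "utf-8", bom=True) gives True.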

For reading, this parameter could either be a no-op, or possibly change the behavior somehow, if a use case where that makes sense can be imagined.

--
Lennart Regebro: http://regebro.wordpress.com/
Python 3 Porting: http://python-incompatibility.googlecode.com/
+33 661 58 14 64


