Issue 22802: On Windows, if you try and use ccs=UTF-8 (or other variants) the U is removed

As you can see below, the code in fileobject.c removes the U from UTF-8 (or UNICODE) when it strips the U flag for universal newline mode out of the mode string.

Python 2.7.8 (default, Jun 30 2014, 16:03:49) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> f = open('newfile.txt', 'rt+,ccs=UNICODE')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Invalid mode ('rbt+,ccs=NICODE')
>>> f = open('newfile.txt', 'rt+,ccs=UTF-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Invalid mode ('rbt+,ccs=TF-8')

It looks to be an issue with the code around:

https://hg.python.org/cpython/file/ee879c0ffa11/Objects/fileobject.c#l283
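For illustration, here is a rough Python model of what that clean-up step appears to do, inferred only from the error messages above. It is not the actual C code: the function name sanitize_mode_sketch is hypothetical, and the real routine may do further checks that are left out here.

```python
def sanitize_mode_sketch(mode):
    """Hypothetical model of the mode clean-up, inferred from the errors above."""
    upos = mode.find('U')
    if upos == -1:
        return mode  # no 'U' anywhere, so the mode is left untouched
    # Drop the first 'U' found, even when it sits inside "ccs=UTF-8".
    mode = mode[:upos] + mode[upos + 1:]
    # Universal newline mode implies reading, so make sure 'r' leads.
    if not mode.startswith('r'):
        mode = 'r' + mode
    # Force binary mode by inserting 'b' right after the first character.
    if 'b' not in mode:
        mode = mode[0] + 'b' + mode[1:]
    return mode


mangled_utf8 = sanitize_mode_sketch('rt+,ccs=UTF-8')
mangled_unicode = sanitize_mode_sketch('rt+,ccs=UNICODE')
```

Under these assumptions the two results match the mode strings shown in the ValueError messages, which suggests the search for 'U' is not limited to the part of the mode before the comma.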

open() does not support arbitrary platform flags in its mode argument. To open encoded files and transparently decode them to Unicode strings, please use io.open() on Python 2, and pass the correct "encoding" argument. On Python 3, the builtin open() is the same as io.open().
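A minimal sketch of the suggested approach, written to run on both Python 2.7 and Python 3. The temporary directory and the sample text are assumptions made for the example; only io.open() and its encoding argument come from the advice above.

```python
import io
import os
import tempfile

# Hypothetical location: the file name from the report, in a temp directory.
path = os.path.join(tempfile.mkdtemp(), 'newfile.txt')

# Write text; io.open() encodes it to UTF-8 on the way out.
with io.open(path, 'w', encoding='utf-8') as f:
    f.write(u'caf\u00e9\n')

# Read it back; io.open() decodes the bytes to a Unicode string.
with io.open(path, 'r', encoding='utf-8') as f:
    text = f.read()
```

Here the encoding is handled by Python's io layer rather than by a ccs= flag passed through to the C runtime, so the mode string only needs the usual 'r', 'w', 'a', '+', 'b' and 't' characters.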