[Python-Dev] Removal of Win32 ANSI API

Victor Stinner victor.stinner at haypocalc.com
Sun Nov 14 01:06:55 CET 2010


On Saturday 13 November 2010 17:21:37 you wrote:

> On 2010/11/12 4:26, Victor Stinner wrote:
> > On Thursday 11 November 2010 17:07:28 Hirokazu Yamamoto wrote:
> >> Hello. Is it possible to remove Win32 ANSI API (ie: GetFileAttributesA)
> >> and only use Win32 WIDE API (ie: GetFileAttributesW)?
> >> Mainly in posixmodule.c.
> >
> > Even if I hate the MBCS encoding, because it replaces undecodable characters
> > by similar glyphs by default, I'm not certain that it is a good idea to drop
> > the bytes API.
>
> On 2010/11/12 21:08, Victor Stinner wrote:
> > On Thursday 11 November 2010 23:01:32 you wrote:
> >>> Sure, it will divide the number of lines, of the code specific to
> >>> Windows, by two.
> >>
> >> Can we get most of the code cleanup benefit without the backwards
> >> compatibility risk by doing the decode from 'mbcs' on our side of the
> >> fence?
> >
> > I created PyUnicode_FSDecoder, a ParseTuple converter used to work on
> > unicode paths, instead of bytes paths. On Windows, this converter uses
> > mbcs encoding in strict mode, whereas Windows converter uses replace
> > error handler to decode, and ignore to encode. So I don't think that we
> > should use this converter on Windows.
> >
> >> That is, have code that was the C equivalent of:
> >>
> >> arg_is_bytes = not isinstance(arg, str)
> >>
> >> if arg_is_bytes:
> >>     val = decode_mbcs(arg)
> >>     # Decoding error checking here
> >> else:
> >>     val = arg
> >>
> >> # Common processing using WIDE API
> >>
> >> if arg_is_bytes:
> >>     result = encode_mbcs(wide_result)
> >>     # Encoding error checking here
> >> else:
> >>     result = wide_result
> >
> > This doesn't make the code shorter, it may be longer than the actual
> > code, and it is less compliant with the Windows native API...
>
> Is it possible to implement new PyArg_ParseTuple converter to use
>
> PyUnicode_Decode(const char *s, Py_ssize_t size,
>                  const char *encoding, /* mbcs */
>                  const char *errors)   /* replace */
>
> and use it?

Yes, but how do you check if the input argument is a bytes or a str object with your PyArg_Parse converter? You would have to use the "O" format, convert the argument to unicode manually, and then convert the result back to bytes (if the input was bytes). I don't think that it makes the code shorter.
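
To make the round trip concrete, here is a rough Python sketch of what each function would have to do. It is only an illustration, not code from posixmodule.c: wide_only(), normalize() and ANSI_ENCODING are hypothetical names, and outside Windows the sketch falls back to the filesystem encoding because the mbcs codec only exists on Windows. The error handlers (replace to decode, ignore to encode) follow the Windows behaviour described in the quoted message.

```python
import sys

# Assumption: "mbcs" is only available on Windows; fall back to the
# filesystem encoding elsewhere so the sketch stays runnable.
ANSI_ENCODING = "mbcs" if sys.platform == "win32" else sys.getfilesystemencoding()


def wide_only(func):
    """Hypothetical wrapper: let a str-only function accept bytes or str."""
    def wrapper(path):
        # Remember the input type so the result can be converted back.
        arg_is_bytes = isinstance(path, bytes)
        if arg_is_bytes:
            # Decode with "replace", as the Windows conversion does.
            path = path.decode(ANSI_ENCODING, "replace")
        result = func(path)  # common processing on str only
        if arg_is_bytes:
            # Encode with "ignore", as the Windows conversion does.
            result = result.encode(ANSI_ENCODING, "ignore")
        return result
    return wrapper


@wide_only
def normalize(path):
    # Stand-in for a call into the wide (W) API: it only ever sees str.
    return path.replace("/", "\\")
```

Both normalize("a/b") and normalize(b"a/b") go through the same str code path, but the type check, the decode and the encode still have to be written somewhere, which is why I doubt the C version would end up shorter.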

The code currently works. The question is whether we should drop the ANSI API now, later or never. It looks like the decision is moving toward "later" (deprecate in 3.2, remove in 3.3). I still think that dropping it now wouldn't really hurt.

Victor


