Issue 672132: registry functions don't handle null characters

As determined via http://mail.python.org/pipermail/python-win32/2003-January/000745.html

Registry value names (not just the data values) can contain embedded NULL characters, and they appear to be Unicode: the WinNT registry uses Unicode natively, and the "A" versions of these API functions return the value names MBCS-encoded.

_winreg.EnumValue and _winreg.EnumKey (and their counterparts in the win32api module) are affected.

I wonder whether a fix for this should actually return Unicode objects when a name contains a high byte (a non-ASCII character). At the very least, we must use the length of the value name that the API returns, rather than assuming null termination.
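
To illustrate the difference between trusting the reported length and assuming null termination, here is a minimal, platform-independent sketch. It does not call the registry API. The buffer contents, the reported length, and the helper names name_by_nul and name_by_length are all hypothetical, chosen only to simulate what an enumeration call might hand back.

```python
def name_by_nul(buf: bytes) -> bytes:
    """Stop at the first NUL byte, as C string handling would (the bug)."""
    return buf.split(b"\x00", 1)[0]


def name_by_length(buf: bytes, length: int) -> bytes:
    """Take exactly the number of characters the API reported."""
    return buf[:length]


# Simulated output buffer (an assumption, not real registry data):
# a 7-character name with an embedded NUL, then the terminator and
# unused buffer space.
buffer = b"abc\x00def" + b"\x00" * 9
reported_length = 7

truncated = name_by_nul(buffer)                   # loses everything after the NUL
full = name_by_length(buffer, reported_length)    # keeps the embedded NUL
```

With this input, the NUL-terminated reading yields only the first three bytes, while the length-based reading preserves the whole name.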

Logged In: YES user_id=21627

The "traditional" Python approach is to return a byte string if possible, for compatibility, and a Unicode object otherwise.

"If possible" often means "if the system default encoding permits", or, as in _tkinter, "if the string is plain ASCII".

I expect one day people will complain that they can't access certain registry keys, because those use characters not supported in CP_ACP.

So it might be reasonable to use the *W functions throughout, and convert to byte strings if they are ASCII, and to Unicode objects otherwise. For incoming byte strings, you probably have to assume they are CP_ACP encoded, for compatibility with earlier Python releases.
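
The conversion policy suggested above could be sketched roughly as follows, written in present-day Python where bytes plays the role of the byte string and str the role of the Unicode object. The function names from_wide and to_wide are hypothetical, and cp1252 is only a stand-in for CP_ACP, which on a real system depends on the configured ANSI code page.

```python
def from_wide(name: str):
    """Result of a *W call: return bytes if plain ASCII, else keep Unicode."""
    try:
        return name.encode("ascii")
    except UnicodeEncodeError:
        return name


def to_wide(name, acp: str = "cp1252") -> str:
    """Argument for a *W call: decode incoming byte strings as CP_ACP.

    `acp` is an assumption standing in for the system ANSI code page.
    """
    if isinstance(name, bytes):
        return name.decode(acp)
    return name


ascii_name = from_wide("Path")               # comes back as a byte string
unicode_name = from_wide("Gr\u00f6\u00dfe")  # stays a Unicode string
legacy_arg = to_wide(b"Gr\xf6\xdfe")         # old-style byte string input
```

Note that an embedded NUL is itself ASCII, so a name containing one would still come back as a byte string under this policy, provided the length is honored.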