msg337967 - (view) |
Author: Inada Naoki (methane) *  |
Date: 2019-03-15 05:50 |
The doc says:

> 'u' will be removed together with the rest of the Py_UNICODE API.
> Deprecated since version 3.3, will be removed in version 4.0.
> https://docs.python.org/3/library/array.html

But DeprecationWarning is not raised yet. Let's raise it.

* 3.8 -- PendingDeprecationWarning
* 3.9 -- DeprecationWarning
* 4.0 or 3.10 -- Remove it. |
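For illustration only, the first stage of the schedule proposed above could be sketched in pure Python. The helper `check_typecode` and the constant `U_TYPECODE_WARNING` are hypothetical names; the real change would presumably be made in the C implementation of the array module.

```python
import warnings

# Hypothetical: the warning class for the 3.8 stage of the proposed schedule.
# The 3.9 stage would switch this to DeprecationWarning.
U_TYPECODE_WARNING = PendingDeprecationWarning


def check_typecode(typecode):
    """Warn if the caller asks for the 'u' type code (illustrative helper)."""
    if typecode == "u":
        warnings.warn(
            "array type code 'u' is deprecated",
            U_TYPECODE_WARNING,
            stacklevel=2,  # attribute the warning to the caller
        )
    return typecode
```

PendingDeprecationWarning is hidden by the default warning filters, so a staged rollout like this would stay quiet for most users until the class is switched to DeprecationWarning.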
|
|
msg338031 - (view) |
Author: Terry J. Reedy (terry.reedy) *  |
Date: 2019-03-15 20:45 |
'4.0' is a stand-in for 'sometime after 2.7.final', scheduled for Jan 2020. A Pending... for 3.8.0, scheduled for Oct 2019, seems reasonable to me. Perhaps we should have a pydev discussion for the general issue of post 2.7 removals of already deprecated items. |
|
|
msg338595 - (view) |
Author: Inada Naoki (methane) *  |
Date: 2019-03-22 09:13 |
https://mail.python.org/pipermail/python-dev/2019-March/156807.html We may be able to convert 'u' from wchar_t to int32_t and un-deprecate it. |
|
|
msg338598 - (view) |
Author: Inada Naoki (methane) *  |
Date: 2019-03-22 10:49 |
I found that the conversion from Py_UNICODE to Py_UCS4 was already done once and then reverted. ref: https://bugs.python.org/issue13072 |
|
|
msg338607 - (view) |
Author: Stefan Krah (skrah) *  |
Date: 2019-03-22 14:44 |
I think the problem is still whether to use 'u' == UCS2 and 'w' == UCS4 like in PEP-3118. For the project I'm currently working on I'd need these for buffer exports:

>>> from xnd import *
>>> x = xnd(["abc", "xyz"], dtype="fixed_string(10, 'utf16')")
>>> y = xnd(["abc", "xyz"], dtype="fixed_string(10, 'utf32')")
>>>
>>> memoryview(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: type is not supported by the buffer protocol

The use case is not an array that represents a single utf16 string, but an array *of* fixed strings with different encodings. So x would be exported with format 'u' and y with format 'w'. |
|
|
msg338608 - (view) |
Author: Stefan Krah (skrah) *  |
Date: 2019-03-22 15:01 |
Just to demonstrate what the format would look like, this is working for an array of fixed bytes:

>>> x = xnd([b"123", b"23456"], dtype="fixed_bytes(size=10)")
>>> memoryview(x).format
'10s'

So the formats in the previous message would be '10u' and '10w'. |
|
|
msg338609 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2019-03-22 15:03 |
array('u') is not tied to the legacy Unicode C API. It is possible to use the modern wchar_t-based Unicode C API for it. See . There are benefits from getting rid of the legacy Unicode C API, but not from getting rid of array('u'). |
|
|
msg338610 - (view) |
Author: Stefan Krah (skrah) *  |
Date: 2019-03-22 15:10 |
array() uses struct module characters except for 'u'. PEP-3118 was supposed to be implemented in the struct module. If array() continues to use 'u', the only sensible thing would be to remove (or rename) 'a', 'u' and 'w' from PEP-3118. |
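The mismatch described above can be checked from Python. This is a small sketch using only `array.typecodes` and `struct.calcsize`; the helper name `typecodes_unknown_to_struct` is made up for this example, and the exact result depends on the Python version in use.

```python
import array
import struct


def typecodes_unknown_to_struct():
    """Return the array type codes that the struct module rejects."""
    unknown = []
    for code in array.typecodes:
        try:
            struct.calcsize(code)  # raises struct.error for unknown codes
        except struct.error:
            unknown.append(code)
    return unknown


print(typecodes_unknown_to_struct())
```

Only the Unicode character type codes should show up in the result, since every numeric array type code is also a struct format character.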
|
|
msg338611 - (view) |
Author: Stefan Krah (skrah) *  |
Date: 2019-03-22 15:25 |
The funny thing is that array() already knows this:

>>> import array
>>> a = array.array("u", "123")
>>> memoryview(a).format
'w' |
|
|
msg367000 - (view) |
Author: Inada Naoki (methane) *  |
Date: 2020-04-22 13:16 |
I closed GH-12497 (Py_UNICODE -> Py_UCS4). I created GH-19653 (Py_UNICODE -> wchar_t) instead. |
|
|
msg367044 - (view) |
Author: Terry J. Reedy (terry.reedy) *  |
Date: 2020-04-22 19:15 |
Should this issue be closed, possibly as superseded by #36346, the issue for the new PR-19653? |
|
|
msg367065 - (view) |
Author: Inada Naoki (methane) *  |
Date: 2020-04-23 00:47 |
While array('u') doesn't use the deprecated API with GH-19653, I still don't like 'u' because:

* I don't have any reason to use the platform-dependent wchar_t. [1]
* It is not consistent with PEP-3118.

[1]: https://mail.python.org/pipermail/python-dev/2019-March/156807.html

How about this plan?

* Add 'w' for Py_UCS4.
* Deprecate 'u', and remove it in the future. |
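To make the platform dependence concrete, here is a rough sketch that reports the size of wchar_t through ctypes and contrasts it with a fixed 4-byte-per-code-point encoding. The helper names `to_ucs4_bytes` and `from_ucs4_bytes` are invented for this example; they stand in for what a Py_UCS4-based 'w' type code would store, and are not part of the array module.

```python
import ctypes

# Size of the C wchar_t type on this platform: typically 2 on Windows
# and 4 on most Unix-like systems.
WCHAR_SIZE = ctypes.sizeof(ctypes.c_wchar)


def to_ucs4_bytes(text):
    """Encode text with exactly 4 bytes per code point (little-endian)."""
    return text.encode("utf-32-le")


def from_ucs4_bytes(data):
    """Decode bytes produced by to_ucs4_bytes back into a str."""
    return data.decode("utf-32-le")


print(WCHAR_SIZE, len(to_ucs4_bytes("abc")))
```

With a 2-byte wchar_t, code points outside the Basic Multilingual Plane need surrogate pairs, whereas the UCS4 layout keeps one fixed-width item per code point on every platform.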
|
|