Issue 36299: array: Deprecate 'u' type in array module

Created on 2019-03-15 05:50 by methane, last changed 2022-04-11 14:59 by admin.

Pull Requests
URL Status Linked Edit
PR 12497 closed methane,2019-03-22 10:43
Messages (12)
msg337967 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2019-03-15 05:50
The doc says:

> 'u' will be removed together with the rest of the Py_UNICODE API.
> Deprecated since version 3.3, will be removed in version 4.0.
> https://docs.python.org/3/library/array.html

But DeprecationWarning is not raised yet. Let's raise it.

* 3.8 -- PendingDeprecationWarning
* 3.9 -- DeprecationWarning
* 4.0 or 3.10 -- Remove it.
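Whether a given interpreter already warns about the 'u' type code can be checked by recording warnings while such an array is created. This is only a minimal sketch: the helper name `warnings_for_typecode` is hypothetical, and the result depends on the Python version in use. A ValueError is treated here as "type code not available".

```python
import array
import warnings


def warnings_for_typecode(typecode):
    """Return the warning categories raised while creating array(typecode).

    Returns None if the interpreter rejects the type code entirely.
    (Hypothetical helper, for illustration only.)
    """
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure no filter hides a warning
        try:
            array.array(typecode)
        except ValueError:
            return None  # type code unknown or removed in this version
    return [w.category for w in caught]


if __name__ == "__main__":
    # Prints an empty list, a list of warning categories, or None,
    # depending on the interpreter version.
    print(warnings_for_typecode("u"))
```

An empty list means no warning is raised yet, which is the situation this issue describes.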
msg338031 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2019-03-15 20:45
'4.0' is a stand-in for 'sometime after 2.7.final', scheduled for Jan 2020. A Pending... for 3.8.0, scheduled for Oct 2019, seems reasonable to me. Perhaps we should have a pydev discussion for the general issue of post 2.7 removals of already deprecated items.
msg338595 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2019-03-22 09:13
https://mail.python.org/pipermail/python-dev/2019-March/156807.html

We may be able to convert 'u' from wchar_t to int32_t and un-deprecate it.
msg338598 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2019-03-22 10:49
I found that converting Py_UNICODE to Py_UCS4 was done once and then reverted.

ref: https://bugs.python.org/issue13072
msg338607 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2019-03-22 14:44
I think the problem is still whether to use 'u' == UCS2 and 'w' == UCS4 like in PEP-3118.

For the project I'm currently working on I'd need these for buffer exports:

>>> from xnd import *
>>> x = xnd(["abc", "xyz"], dtype="fixed_string(10, 'utf16')")
>>> y = xnd(["abc", "xyz"], dtype="fixed_string(10, 'utf32')")
>>>
>>> memoryview(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: type is not supported by the buffer protocol

The use case is not an array that represents a single utf16 string, but an array *of* fixed strings with different encodings.

So x would be exported with format 'u' and y with format 'w'.
msg338608 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2019-03-22 15:01
Just to demonstrate what the format would look like, this is working for an array of fixed bytes:

>>> x = xnd([b"123", b"23456"], dtype="fixed_bytes(size=10)")
>>> memoryview(x).format
'10s'

So the formats in the previous message would be '10u' and '10w'.
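For readers without xnd installed, the meaning of a '10s' format can be illustrated with the standard struct module alone. This is a sketch of the fixed-size field layout, not of how xnd exports its buffers; the names `FIXED_BYTES` and `pack_fixed` are made up for this example.

```python
import struct

# '10s' describes one fixed-size field of 10 bytes.
FIXED_BYTES = struct.Struct("10s")


def pack_fixed(values):
    """Pack each bytes value into its own 10-byte, NUL-padded field."""
    return b"".join(FIXED_BYTES.pack(v) for v in values)


buf = pack_fixed([b"123", b"23456"])

# Every element occupies exactly 10 bytes, regardless of its content.
first, second = buf[:10], buf[10:]
print(FIXED_BYTES.size, len(buf))
print(first.rstrip(b"\x00"), second.rstrip(b"\x00"))
```

The '10u' and '10w' formats mentioned above would describe the same layout with 10 two-byte or four-byte code units per element instead of 10 bytes.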
msg338609 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-03-22 15:03
array('u') is not tied to the legacy Unicode C API. It is possible to use the modern wchar_t based Unicode C API for it. See . There are benefits from getting rid of the legacy Unicode C API, but not from array('u').
msg338610 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2019-03-22 15:10
array() uses struct module characters except for 'u'. PEP-3118 was supposed to be implemented in the struct module. If array() continues to use 'u', the only sensible thing would be to remove (or rename) 'a', 'u' and 'w' from PEP-3118.
msg338611 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2019-03-22 15:25
The funny thing is that array() already knows this:

>>> import array
>>> a = array.array("u", "123")
>>> memoryview(a).format
'w'
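The same check can be written as a small script that also reports the item size, which is where the platform dependence of wchar_t shows up. This is a hedged sketch: `describe_u` is a hypothetical helper, the values it returns vary by platform and Python version, and a ValueError is assumed to mean that the 'u' type code is not available.

```python
import array
import warnings


def describe_u():
    """Return (itemsize, memoryview format) for array('u'), or None.

    None means this interpreter does not accept the 'u' type code.
    """
    with warnings.catch_warnings():
        # Newer versions may warn about 'u'; silence that for the probe.
        warnings.simplefilter("ignore", DeprecationWarning)
        try:
            a = array.array("u", "123")
        except ValueError:
            return None
        return a.itemsize, memoryview(a).format


if __name__ == "__main__":
    print(describe_u())
```

An item size of 2 versus 4 reflects the width of wchar_t on the platform.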
msg367000 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2020-04-22 13:16
I closed GH-12497 (Py_UNICODE -> Py_UCS4). I created GH-19653 (Py_UNICODE -> wchar_t) instead.
msg367044 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2020-04-22 19:15
Should this issue be closed, possibly as superseded by #36346, the issue for the new PR-19653?
msg367065 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2020-04-23 00:47
While array('u') doesn't use deprecated API with GH-19653, I still don't like 'u' because:

* I don't have any reason to use platform-dependent wchar_t. [1]
* It is not consistent with PEP-3118.

[1]: https://mail.python.org/pipermail/python-dev/2019-March/156807.html

How about this plan?

* Add 'w' for Py_UCS4.
* Deprecate 'u', and remove it in the future.
History
Date User Action Args
2022-04-11 14:59:12 admin set github: 80480
2020-04-23 00:47:43 methane set messages: +
2020-04-22 19:15:06 terry.reedy set messages: +
2020-04-22 13:16:26 methane set messages: +
2019-03-22 15:56:45 vstinner set nosy: - vstinner
2019-03-22 15:25:27 skrah set messages: +
2019-03-22 15:10:07 skrah set messages: +
2019-03-22 15:03:02 serhiy.storchaka set nosy: + serhiy.storchaka; messages: +
2019-03-22 15:01:08 skrah set messages: +
2019-03-22 14:44:21 skrah set messages: +
2019-03-22 11:26:51 methane set nosy: + ncoghlan, vstinner, skrah; stage: patch review -> ; title: Deprecate 'u' type in array module -> array: Deprecate 'u' type in array module
2019-03-22 10:49:09 methane set messages: +
2019-03-22 10:43:34 methane set keywords: + patch; stage: patch review; pull_requests: + pull_request12447
2019-03-22 09:13:15 methane set messages: +
2019-03-15 20:45:47 terry.reedy set nosy: + terry.reedy; messages: +
2019-03-15 05:50:02 methane create