Issue 35194: A typo in a constant in cp932 codec (original) (raw)

Created on 2018-11-08 22:55 by izbyshev, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 10420 merged izbyshev,2018-11-08 22:57
PR 10423 merged miss-islington,2018-11-09 07:12
PR 10424 merged miss-islington,2018-11-09 07:12
PR 10432 merged izbyshev,2018-11-09 13:39
PR 10433 merged izbyshev,2018-11-09 14:07
Messages (13)
msg329489 - (view) Author: Alexey Izbyshev (izbyshev) * (Python triager) Date: 2018-11-08 22:55
UBSan with -fsanitize=implicit-integer-truncation found a suspicious one: /scratch2/izbyshev/cpython/Modules/cjkcodecs/_codecs_jp.c:43:17: runtime error: implicit conversion from type 'unsigned int' of value 4294966013 (32-bit, unsigned) to type 'unsigned char' changed the value to 253 (8-bit, unsigned) Indeed, the wrong constant was used (the correct one is used in corresponding decoder code at https://github.com/python/cpython/blob/fd512d76456b65c529a5bc58d8cfe73e4a10de7a/Modules/cjkcodecs/_codecs_jp.c#L105). In this case the truncation was harmless because only the lowest byte of the wrong result was used, and it was correct. But it probably makes sense to fix it if only to reduce noise from UBSan. All Python versions are affected, but I've marked 3.8 only since I'm not sure what the policy for backporting such changes is.
msg329500 - (view) Author: miss-islington (miss-islington) Date: 2018-11-09 07:12
New changeset 7a69cf47a9bbc95f95fd67c982bff121b2a903cb by Miss Islington (bot) (Alexey Izbyshev) in branch 'master': bpo-35194: Fix a wrong constant in cp932 codec (GH-10420) https://github.com/python/cpython/commit/7a69cf47a9bbc95f95fd67c982bff121b2a903cb
msg329501 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-11-09 07:23
Maybe add asserts in OUTBYTE1() and similar macros to prevent similar errors in future?
msg329503 - (view) Author: miss-islington (miss-islington) Date: 2018-11-09 07:33
New changeset 49ee41f1c3934aa095e32fa751cdf3ba641ae34b by Miss Islington (bot) in branch '3.6': bpo-35194: Fix a wrong constant in cp932 codec (GH-10420) https://github.com/python/cpython/commit/49ee41f1c3934aa095e32fa751cdf3ba641ae34b
msg329504 - (view) Author: miss-islington (miss-islington) Date: 2018-11-09 07:35
New changeset 22234f1375d28803074405497ea61315fb37240d by Miss Islington (bot) in branch '3.7': bpo-35194: Fix a wrong constant in cp932 codec (GH-10420) https://github.com/python/cpython/commit/22234f1375d28803074405497ea61315fb37240d
msg329508 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-11-09 10:21
> Maybe add asserts in OUTBYTE1() and similar macros to prevent similar errors in future? I like the idea. Make sure that: 0 <= ch <= 255?
msg329514 - (view) Author: Alexey Izbyshev (izbyshev) * (Python triager) Date: 2018-11-09 13:44
I've added 'assert' to macros. Since 'typeof' seems to be disallowed in Python, I've used 'unsigned int' as the type of an intermediate variable. Another alternative is 'assert(0 <= (c) && (c) <= 255)', but 'c' will be evaluated several times.
msg329515 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-11-09 13:50
Converting to unsigned int can unnecessary widen the value or truncate higher bits. On other side, testing "0 <= (c)" can emit a compiler warning if c is unsigned. Maybe test "Py_CHARMASK(c) == (c)"?
msg329516 - (view) Author: Alexey Izbyshev (izbyshev) * (Python triager) Date: 2018-11-09 13:58
> Maybe test "Py_CHARMASK(c) == (c)"? This is a good alternative if multiple evaluation of 'c' is acceptable. Though I'd prefer '(unsigned char)c == c' for this style of fixing because it is bit closer to what happens in '((*outbuf)[i]) = c'.
msg329520 - (view) Author: Alexey Izbyshev (izbyshev) * (Python triager) Date: 2018-11-09 14:27
I've checked than other macros in Modules/cjkcodecs/cjkcodecs.h don't avoid multiple argument evaluation (e.g. OUTCHAR2, _TRYMAP_ENC), so I've changed 'assert' to a variant of what Serhiy suggested.
msg329521 - (view) Author: Alexey Izbyshev (izbyshev) * (Python triager) Date: 2018-11-09 14:35
OUTCHAR2 is a wrong example. Other examples are NEXT_IN, NEXT_OUT.
msg329595 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-11-10 05:47
New changeset 0d165262d949440e5aea6533b10e19e4cd5cf12d by Serhiy Storchaka (Alexey Izbyshev) in branch '2.7': [2.7] bpo-35194: Fix a wrong constant in cp932 codec. (GH-10420) (GH-10433) https://github.com/python/cpython/commit/0d165262d949440e5aea6533b10e19e4cd5cf12d
msg339100 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2019-03-29 07:49
New changeset 5f45979b63300338b68709bfe817ddae563b93fd by Inada Naoki (Alexey Izbyshev) in branch 'master': bpo-35194: cjkcodec: check the encoded value is not truncated (GH-10432) https://github.com/python/cpython/commit/5f45979b63300338b68709bfe817ddae563b93fd
History
Date User Action Args
2022-04-11 14:59:07 admin set github: 79375
2019-03-29 07:49:13 methane set nosy: + methanemessages: +
2019-03-29 07:48:59 methane set status: open -> closedresolution: fixedstage: patch review -> resolved
2018-11-10 05:47:15 serhiy.storchaka set messages: +
2018-11-09 14:35:21 izbyshev set messages: +
2018-11-09 14:27:12 izbyshev set messages: +
2018-11-09 14:07:58 izbyshev set pull_requests: + <pull%5Frequest9707>
2018-11-09 13:58:03 izbyshev set messages: +
2018-11-09 13:50:06 serhiy.storchaka set messages: +
2018-11-09 13:44:40 izbyshev set messages: +
2018-11-09 13:39:51 izbyshev set pull_requests: + <pull%5Frequest9706>
2018-11-09 10:21:57 vstinner set messages: +
2018-11-09 07:35:09 miss-islington set messages: +
2018-11-09 07:33:13 miss-islington set messages: +
2018-11-09 07:23:19 serhiy.storchaka set messages: + versions: + Python 2.7, Python 3.6, Python 3.7
2018-11-09 07:12:41 miss-islington set pull_requests: + <pull%5Frequest9705>
2018-11-09 07:12:34 miss-islington set pull_requests: + <pull%5Frequest9704>
2018-11-09 07:12:10 miss-islington set nosy: + miss-islingtonmessages: +
2018-11-08 22:57:37 izbyshev set keywords: + patchstage: patch reviewpull_requests: + <pull%5Frequest9700>
2018-11-08 22:55:24 izbyshev create