msg329489 - (view) |
Author: Alexey Izbyshev (izbyshev) *  |
Date: 2018-11-08 22:55 |
UBSan with -fsanitize=implicit-integer-truncation found a suspicious one: /scratch2/izbyshev/cpython/Modules/cjkcodecs/_codecs_jp.c:43:17: runtime error: implicit conversion from type 'unsigned int' of value 4294966013 (32-bit, unsigned) to type 'unsigned char' changed the value to 253 (8-bit, unsigned) Indeed, the wrong constant was used (the correct one is used in corresponding decoder code at https://github.com/python/cpython/blob/fd512d76456b65c529a5bc58d8cfe73e4a10de7a/Modules/cjkcodecs/_codecs_jp.c#L105). In this case the truncation was harmless because only the lowest byte of the wrong result was used, and it was correct. But it probably makes sense to fix it if only to reduce noise from UBSan. All Python versions are affected, but I've marked 3.8 only since I'm not sure what the policy for backporting such changes is. |
|
|
msg329500 - (view) |
Author: miss-islington (miss-islington) |
Date: 2018-11-09 07:12 |
New changeset 7a69cf47a9bbc95f95fd67c982bff121b2a903cb by Miss Islington (bot) (Alexey Izbyshev) in branch 'master': bpo-35194: Fix a wrong constant in cp932 codec (GH-10420) https://github.com/python/cpython/commit/7a69cf47a9bbc95f95fd67c982bff121b2a903cb |
|
|
msg329501 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2018-11-09 07:23 |
Maybe add asserts in OUTBYTE1() and similar macros to prevent similar errors in future? |
|
|
msg329503 - (view) |
Author: miss-islington (miss-islington) |
Date: 2018-11-09 07:33 |
New changeset 49ee41f1c3934aa095e32fa751cdf3ba641ae34b by Miss Islington (bot) in branch '3.6': bpo-35194: Fix a wrong constant in cp932 codec (GH-10420) https://github.com/python/cpython/commit/49ee41f1c3934aa095e32fa751cdf3ba641ae34b |
|
|
msg329504 - (view) |
Author: miss-islington (miss-islington) |
Date: 2018-11-09 07:35 |
New changeset 22234f1375d28803074405497ea61315fb37240d by Miss Islington (bot) in branch '3.7': bpo-35194: Fix a wrong constant in cp932 codec (GH-10420) https://github.com/python/cpython/commit/22234f1375d28803074405497ea61315fb37240d |
|
|
msg329508 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2018-11-09 10:21 |
> Maybe add asserts in OUTBYTE1() and similar macros to prevent similar errors in future? I like the idea. Make sure that: 0 <= ch <= 255? |
|
|
msg329514 - (view) |
Author: Alexey Izbyshev (izbyshev) *  |
Date: 2018-11-09 13:44 |
I've added 'assert' to macros. Since 'typeof' seems to be disallowed in Python, I've used 'unsigned int' as the type of an intermediate variable. Another alternative is 'assert(0 <= (c) && (c) <= 255)', but 'c' will be evaluated several times. |
|
|
msg329515 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2018-11-09 13:50 |
Converting to unsigned int can unnecessary widen the value or truncate higher bits. On other side, testing "0 <= (c)" can emit a compiler warning if c is unsigned. Maybe test "Py_CHARMASK(c) == (c)"? |
|
|
msg329516 - (view) |
Author: Alexey Izbyshev (izbyshev) *  |
Date: 2018-11-09 13:58 |
> Maybe test "Py_CHARMASK(c) == (c)"? This is a good alternative if multiple evaluation of 'c' is acceptable. Though I'd prefer '(unsigned char)c == c' for this style of fixing because it is bit closer to what happens in '((*outbuf)[i]) = c'. |
|
|
msg329520 - (view) |
Author: Alexey Izbyshev (izbyshev) *  |
Date: 2018-11-09 14:27 |
I've checked than other macros in Modules/cjkcodecs/cjkcodecs.h don't avoid multiple argument evaluation (e.g. OUTCHAR2, _TRYMAP_ENC), so I've changed 'assert' to a variant of what Serhiy suggested. |
|
|
msg329521 - (view) |
Author: Alexey Izbyshev (izbyshev) *  |
Date: 2018-11-09 14:35 |
OUTCHAR2 is a wrong example. Other examples are NEXT_IN, NEXT_OUT. |
|
|
msg329595 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2018-11-10 05:47 |
New changeset 0d165262d949440e5aea6533b10e19e4cd5cf12d by Serhiy Storchaka (Alexey Izbyshev) in branch '2.7': [2.7] bpo-35194: Fix a wrong constant in cp932 codec. (GH-10420) (GH-10433) https://github.com/python/cpython/commit/0d165262d949440e5aea6533b10e19e4cd5cf12d |
|
|
msg339100 - (view) |
Author: Inada Naoki (methane) *  |
Date: 2019-03-29 07:49 |
New changeset 5f45979b63300338b68709bfe817ddae563b93fd by Inada Naoki (Alexey Izbyshev) in branch 'master': bpo-35194: cjkcodec: check the encoded value is not truncated (GH-10432) https://github.com/python/cpython/commit/5f45979b63300338b68709bfe817ddae563b93fd |
|
|