Issue 13155: Optimize finding the max character width (original) (raw)

Created on 2011-10-11 23:14 by pitrou, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name	Uploaded	Description	Edit
find_max_char.patch	pitrou,2011-10-11 23:14	review
find_max_char2.patch	pitrou,2011-10-11 23:22	review
find_max_char3.patch	pitrou,2011-10-11 23:32	review
find_max_char4.patch	pitrou,2011-10-11 23:50	review
find_max_char5.patch	vstinner,2011-10-12 21:10

Messages (12)
msg145372 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2011-10-11 23:14
This patch optimizes scanning for the max character width in an unicode buffer. Micro-benchmarking some worst case situations: $ ./python -m timeit -s "x='é'+'x'100000" "x[1:]" -> before: 10000 loops, best of 3: 74.9 usec per loop -> after: 100000 loops, best of 3: 15.5 usec per loop $ ./python -m timeit -s "x='\U00010000'+'x'100000" "x[1:]" -> before: 10000 loops, best of 3: 138 usec per loop -> after: 10000 loops, best of 3: 71.3 usec per loop
msg145373 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2011-10-11 23:22
I hadn't noticed the STRINGLIB_SIZEOF_CHAR constant. Reuse it instead of adding STRINGLIB_CHAR_SIZE.
msg145374 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2011-10-11 23:32
Slightly cleaned up patch after Victor's comments in private.
msg145375 - (view)	Author: STINNER Victor (vstinner) *	Date: 2011-10-11 23:42
find_max_char() returns 0x10000 instead of 0x10FFFF, which may be wrong (or at least, surprising). You may add a max_char variable using other macros like MAX_CHAR_ASCII, MAX_CHAR_UCS1, ..., which will be set at the same time than mask. Or restore your if (ret >= 0x10000) return 0x10ffff; else return ret;. Constants look inconsistent: + const STRINGLIB_CHAR *unrolled_end = begin + (n & ~ (Py_ssize_t) 7); + p += 4; You may use STRINGLIB_CHAR in: + Py_UCS4 bits = p[0] \| p[1]	p[2]	p[3];
msg145376 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2011-10-11 23:50
Ok, updated patch.
msg145377 - (view)	Author: STINNER Victor (vstinner) *	Date: 2011-10-12 00:11
> Ok, updated patch. "ret = ~mask + 1;" looks wrong: (~0xFFFFFF80+1) gives 128, not 127. I don't see why you need: + if (ret < 128) + return 127; + if (ret < 256) + return 255; #undef ASCII_CHAR_MASK should be #undef UCS1_ASCII_CHAR_MASK #error Invalid STRINGLIB_SIZEOF_CHAR (must be 1, 2 or 4) should be #error Invalid STRINGLIB_SIZEOF_CHAR (must be 2 or 4) Why do you need these forward declarations? It's maybe related to another patch? static PyObject * +unicode_fromascii(const unsigned char s, Py_ssize_t size); +static PyObject +_PyUnicode_FromUCS1(const unsigned char s, Py_ssize_t size); +static PyObject +_PyUnicode_FromUCS2(const Py_UCS2 s, Py_ssize_t size); +static PyObject +_PyUnicode_FromUCS4(const Py_UCS4 *s, Py_ssize_t size); (You kept Py_UCS4 for "Py_UCS4 bits", but Py_UCS4 is maybe just fine.) By the way, your function rocks :-) Use bit masks is a great idea, especially your "bits = p[0] \| p[1]	p[2]	p[3]" "hack".
msg145378 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2011-10-12 00:24
> > Ok, updated patch. > > "ret = ~mask + 1;" looks wrong: (~0xFFFFFF80+1) gives 128, not 127. That's on purpose, since the mask has just matched. If 0xFFFFFF80 matches, then the max char can't be 127, it has to be at least 128. > I don't see why you need: > > + if (ret < 128) > + return 127; > + if (ret < 256) > + return 255; Because otherwise you're returning 0xFFFF in the following line: if (ret < 0x10000) return 0xFFFF; > #undef ASCII_CHAR_MASK should be #undef UCS1_ASCII_CHAR_MASK Ha, funny that gcc doesn't say anything. I also forgot to #under MASK_*. > #error Invalid STRINGLIB_SIZEOF_CHAR (must be 1, 2 or 4) > should be > #error Invalid STRINGLIB_SIZEOF_CHAR (must be 2 or 4) Well, no. 1 is a legal value, even though it's handled elsewhere, so the error message is right. > Why do you need these forward declarations? It's maybe related to another patch? It's because I'm moving the stringlib #includes up in unicodeobject.c, and stringlib calls these functions.
msg145427 - (view)	Author: STINNER Victor (vstinner) *	Date: 2011-10-12 20:21
Without the patch: python3.2 -m timeit 'x="é"+"x"10000' 'x[1:]' 100000 loops, best of 3: 2.18 usec per loop python3.3 -m timeit 'x="é"+"x"10000' 'x[1:]' 100000 loops, best of 3: 6.68 usec per loop
msg145428 - (view)	Author: STINNER Victor (vstinner) *	Date: 2011-10-12 20:22
With find_max_char4.patch: python3.3 -m timeit 'x="é"+"x"*10000' 'x[1:]' 100000 loops, best of 3: 1.96 usec per loop
msg145432 - (view)	Author: STINNER Victor (vstinner) *	Date: 2011-10-12 21:10
find_max_char5.patch: - don't use adjusted ~mask+1: "precompute" the right max_char - rename findwidth.h to find_max_char.h - add some #undef
msg145437 - (view)	Author: Roundup Robot (python-dev)	Date: 2011-10-12 22:08
New changeset 9c7d3207fc15 by Antoine Pitrou in branch 'default': Issue #13155: Optimize finding the optimal character width of an unicode string http://hg.python.org/cpython/rev/9c7d3207fc15
msg145438 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2011-10-12 22:08
> find_max_char5.patch: > - don't use adjusted ~mask+1: "precompute" the right max_char > - rename findwidth.h to find_max_char.h > - add some #undef Thank you, I've committed this version.

History
Date	User	Action	Args
2022-04-11 14:57:22	admin	set	github: 57364
2011-10-12 22:08:58	pitrou	set	status: open -> closedresolution: fixedmessages: + stage: resolved
2011-10-12 22:08:07	python-dev	set	nosy: + python-devmessages: +
2011-10-12 21:10:12	vstinner	set	files: + find_max_char5.patchmessages: +
2011-10-12 20:22:36	vstinner	set	messages: +
2011-10-12 20:21:30	vstinner	set	messages: +
2011-10-12 00:24:31	pitrou	set	messages: +
2011-10-12 00:11:44	vstinner	set	messages: +
2011-10-11 23:50:38	pitrou	set	files: + find_max_char4.patchmessages: +
2011-10-11 23:42:17	vstinner	set	nosy: + vstinnermessages: +
2011-10-11 23:32:19	pitrou	set	files: + find_max_char3.patchmessages: +
2011-10-11 23:22:36	pitrou	set	files: + find_max_char2.patchmessages: +
2011-10-11 23:14:18	pitrou	create