[Python-Dev] memcmp performance
Victor Stinner victor.stinner at haypocalc.com
Tue Oct 25 11:27:36 CEST 2011
On Tuesday, 25 October 2011 10:44:16, Stefan Behnel wrote:
> Richard Saunders, 25.10.2011 01:17:
> > -On [20111024 09:22], Stefan Behnel wrote:
> >> I agree. Given that the analysis shows that the libc memcmp() is
> >> particularly fast on many Linux systems, it should be up to the
> >> Python package maintainers for these systems to set that option
> >> externally through the optimisation CFLAGS.
> >
> > Indeed, this is how I constructed my Python 3.3 and Python 2.7:
> >     setenv CFLAGS '-fno-builtin-memcmp'
> > just before I configured.
> >
> > I would like to revisit changing unicode_compare: adding a
> > special arm for using memcmp when the "unicode kinds" are the
> > same will only work in two specific instances:
> >
> > (1) the strings are the same kind, the char size is 1
> >     * We could add THIS to unicode_compare, but it seems extremely
> >       specialized by itself
>
> But also extremely likely to happen. This means that the strings are pure
> ASCII, which is highly likely and one of the main reasons why the unicode
> string layout was rewritten for CPython 3.3. It allows CPython to save a
> lot of memory (thus clearly proving how likely this case is!), and it
> would also allow it to do faster comparisons for these strings.
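As a rough sketch only (this is not the actual unicode_compare() code, and
the helper name compare_1byte is made up for illustration), such a special
arm for two 1-byte-kind strings boils down to a memcmp() over the shared
prefix plus a length tie-break:

    #include <string.h>
    #include <stddef.h>

    /* Sketch of a memcmp() fast path for two 1-byte (latin1/ASCII) strings,
       returning -1, 0 or 1 like a three-way comparison. */
    static int
    compare_1byte(const unsigned char *s1, size_t len1,
                  const unsigned char *s2, size_t len2)
    {
        size_t min_len = (len1 < len2) ? len1 : len2;
        int cmp = memcmp(s1, s2, min_len);  /* valid: each code point is one byte */
        if (cmp != 0)
            return (cmp < 0) ? -1 : 1;
        if (len1 == len2)
            return 0;
        return (len1 < len2) ? -1 : 1;      /* equal prefix: shorter string sorts first */
    }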
Python 3.3 already has some optimizations for latin1: the CPU and the C language are more efficient at processing char* strings than Py_UCS2 and Py_UCS4 strings. For example, we are using memchr() to search for a single character in a latin1 string.
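As an illustration of that kind of memchr()-based search (a sketch, not the
exact CPython code; find_char_1byte is a hypothetical name), finding a
character in a latin1 buffer is essentially a one-liner around memchr():

    #include <string.h>
    #include <stddef.h>

    /* Return the index of ch in the 1-byte string s of length len,
       or -1 if it does not occur. Delegating to memchr() lets libc use
       its optimized (often vectorized) implementation. */
    static ptrdiff_t
    find_char_1byte(const unsigned char *s, size_t len, unsigned char ch)
    {
        const unsigned char *p = memchr(s, ch, len);
        return (p != NULL) ? (p - s) : -1;
    }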
Victor