[Python-Dev] Python3 regret about deleting list.sort(cmp=...)
Terry Reedy tjreedy at udel.edu
Sat Mar 12 23:09:39 CET 2011
On 3/12/2011 3:44 PM, Guido van Rossum wrote:
> I was just reminded that in Python 3, list.sort() and sorted() no longer support the cmp (comparator) function argument. The reason is that the key function argument is always better. But now I have a nagging doubt about this:
>
> I recently advised a Googler who was sorting a large dataset and running out of memory. My analysis of the situation was that he was sorting a huge list of short lines of the form "shortstring,integer" with a key function that returned a tuple of the form ("shortstring", integer).
I believe that if the integer field were padded with leading blanks as needed so that all are the same length, then no key would be needed.
ll = ['a,11111', 'ab,    3', 'a,    1', 'a,  111']
ll.sort()
print(ll)
['a,    1', 'a,  111', 'a,11111', 'ab,    3']
If most ints are near the max len, this would add little space, and be even faster than with the key.
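A minimal sketch of how such padding might be produced, assuming non-negative integers and names containing no character that sorts below the comma. The helper name pad_line is hypothetical, not something from the original report:

```python
def pad_line(line, width):
    """Right-justify the integer field of 'name,integer' to a fixed width."""
    # rpartition splits on the last comma, so commas in the name survive.
    name, _, number = line.rpartition(",")
    return name + "," + number.strip().rjust(width)


lines = ["a,11111", "ab,3", "a,1", "a,111"]

# The widest integer field decides how much blank padding is needed.
width = max(len(line.rpartition(",")[2].strip()) for line in lines)

padded = [pad_line(line, width) for line in lines]
padded.sort()  # plain string comparison, no key function
print(padded)
```

Negative integers, or names containing characters such as '!' that compare lower than ',', would sort differently from the tuple key, so this only fits data that meets those assumptions.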
> Using the key function argument, in addition to N short string objects, this creates N tuples of length 2, N more slightly shorter string objects, and N integer objects. (Not to count a parallel array of N more pointers.) Given the object overhead, this dramatically increased the memory usage. It so happens that in this particular Googler's situation, memory is constrained but CPU time is not, and it would be better to parse the strings over and over again in a comparator function.
Was 3.2 used? It has a patch that reduces the extra memory used when sorting with a key, and that patch might not be in the last 3.1 release.
> But in Python 3 this solution is no longer available. How bad is that? I'm not sure. But I'd like to at least get the issue out in the open.
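For reference, a comparator can still be approximated in Python 3.2 with functools.cmp_to_key. The sketch below assumes the "shortstring,integer" line format described above; parse and compare_lines are illustrative names, and no memory measurements are claimed. It re-parses both lines on every comparison, but cmp_to_key still allocates one small wrapper object per element, so it may reduce the overhead without removing it:

```python
from functools import cmp_to_key


def parse(line):
    """Split 'shortstring,integer' into a (str, int) pair."""
    name, _, number = line.rpartition(",")
    return name, int(number)


def compare_lines(a, b):
    """Old-style comparator: negative, zero, or positive."""
    ka, kb = parse(a), parse(b)
    # The parsed tuples are temporary and discarded after each comparison.
    return (ka > kb) - (ka < kb)


lines = ["a,11111", "ab,3", "a,1", "a,111"]
lines.sort(key=cmp_to_key(compare_lines))
print(lines)
```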
This removal has been one of the more contentious issues about (not) using 3.x. I believe Raymond had been more involved in the defense of the decision than I. However, the discussion/complaint has mostly been about the relative difficulty of writing a key function versus a compare function.
-- Terry Jan Reedy