msg150927 - (view) |
Author: Simon Sapin (ssapin) |
Date: 2012-01-09 10:35 |
Hi, The attached patch adds a 'key' optional parameter to the heapq.merge function that behaves as in sorted(). Related discussion: http://mail.python.org/pipermail/python-ideas/2012-January/013295.html This is my first contribution to CPython. |
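For readers unfamiliar with the request, a minimal sketch of what a sorted()-style key on heapq.merge looks like in use. It assumes Python 3.5 or later, where the parameter eventually landed; the input lists are made up for illustration, and each must already be sorted by the same key.

```python
import heapq

# Hypothetical inputs, each already sorted by the key (string length).
short_first_a = ["a", "ccc", "eeeee"]
short_first_b = ["bb", "dddd"]

# key works as in sorted(): it is applied to each element for comparison only,
# and the original elements are what the merge yields.
merged = list(heapq.merge(short_first_a, short_first_b, key=len))
print(merged)  # ['a', 'bb', 'ccc', 'dddd', 'eeeee']
```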
|
|
msg150928 - (view) |
Author: Simon Sapin (ssapin) |
Date: 2012-01-09 10:43 |
The attached script benchmarks the baseline (current implementation) against 3 new implementations, as suggested on http://mail.python.org/pipermail/python-ideas/2012-January/013296.html
On my machine, the output is:
merge_baseline per run, min of 3 = 7.527 ms
merge_1 per run, min of 3 = 9.894 ms 131.449 % of baseline
merge_2 per run, min of 3 = 7.948 ms 105.594 % of baseline
merge_3 per run, min of 3 = 7.581 ms 100.716 % of baseline
On this particular input, merge_2 adds 6% overhead when the key parameter is not used. While merge_3 only adds 1% overhead, it almost doubles the amount of code. (Which was admittedly not that long to begin with.) The patch in the previous message uses the merge_2 implementation, which seemed like the best compromise to me. |
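The attached benchmark script is not reproduced in this thread. As a rough sketch of how a "per run, min of 3" figure could be measured, the following uses timeit.repeat. The helper names (make_inputs, bench), the input sizes, and the identity key are all assumptions for illustration, and it compares the current heapq.merge (Python 3.5+) with and without a key rather than the merge_1/merge_2/merge_3 variants.

```python
import heapq
import random
import timeit


def make_inputs(n_iterables=10, length=1000, seed=0):
    """Build several sorted lists of random floats (hypothetical test data)."""
    rng = random.Random(seed)
    return [sorted(rng.random() for _ in range(length)) for _ in range(n_iterables)]


def bench(func, inputs, repeat=3, number=10):
    """Return the best per-run time in milliseconds (min of `repeat` rounds)."""
    timings = timeit.repeat(lambda: list(func(*inputs)), repeat=repeat, number=number)
    return min(timings) / number * 1000


inputs = make_inputs()
baseline_ms = bench(heapq.merge, inputs)
# An identity key forces the key code path without changing the result.
keyed_ms = bench(lambda *its: heapq.merge(*its, key=lambda x: x), inputs)

print(f"no key    per run, min of 3 = {baseline_ms:.3f} ms")
print(f"with key  per run, min of 3 = {keyed_ms:.3f} ms "
      f"{keyed_ms / baseline_ms * 100:.3f} % of no-key time")
```

Taking the minimum of several rounds, as the original output does, reduces the influence of unrelated system load on the reported time.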
|
|
msg150931 - (view) |
Author: Simon Sapin (ssapin) |
Date: 2012-01-09 11:10 |
Oops, the patch to the documentation would also need 'New in 3.3: the key parameter', with the right Sphinx directive. But that depends on whether this change ends up in 3.3 or 3.4. Does 3.3 still get new features? |
|
|
msg150954 - (view) |
Author: Éric Araujo (eric.araujo) *  |
Date: 2012-01-09 16:53 |
Yes, 3.3 is still in the early development stage, and new features will be accepted until the first beta (in June, see PEP 398). “.. versionadded:: 3.3 The *key* parameter” will do. |
|
|
msg150969 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2012-01-09 19:33 |
Simon, please keep the original version fast by creating two code paths:
    if key is None:
        original_code
    else:
        new_code using the key_function |
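The two-path structure requested above can be sketched as follows. This is a simplified illustration, not the patch attached to this issue: the name merge_sketch and the entry layout are assumptions. The no-key branch never calls a key function, and the key branch calls it exactly once per element.

```python
import heapq


def merge_sketch(*iterables, key=None):
    """Merge already-sorted iterables into one sorted stream (illustrative only)."""
    heap = []
    if key is None:
        # Fast path: entries are [value, order, next_fn]; no key calls at all.
        # `order` is unique per input, so ties never compare the bound methods.
        for order, it in enumerate(map(iter, iterables)):
            try:
                nxt = it.__next__
                heap.append([nxt(), order, nxt])
            except StopIteration:
                pass  # empty input contributes nothing
        heapq.heapify(heap)
        while heap:
            value, order, nxt = entry = heap[0]
            yield value
            try:
                entry[0] = nxt()                # refill from the same input
                heapq.heapreplace(heap, entry)  # restore the heap invariant
            except StopIteration:
                heapq.heappop(heap)             # this input is exhausted
        return

    # Key path: entries are [key(value), order, value, next_fn].
    for order, it in enumerate(map(iter, iterables)):
        try:
            nxt = it.__next__
            value = nxt()
            heap.append([key(value), order, value, nxt])
        except StopIteration:
            pass
    heapq.heapify(heap)
    while heap:
        key_value, order, value, nxt = entry = heap[0]
        yield value
        try:
            value = nxt()
            entry[0] = key(value)  # key is computed once per element
            entry[2] = value
            heapq.heapreplace(heap, entry)
        except StopIteration:
            heapq.heappop(heap)
```

The duplication between the two loops is the trade-off under discussion: the no-key branch pays nothing for the new feature, at the cost of maintaining two nearly identical loops.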
|
|
msg150983 - (view) |
Author: Simon Sapin (ssapin) |
Date: 2012-01-09 22:13 |
Raymond, please have a look at merge_3 in benchmark_heapq_merge.py. It is implemented as you say. Do you think the speed is worth the code duplication? |
|
|
msg151369 - (view) |
Author: Simon Sapin (ssapin) |
Date: 2012-01-16 14:58 |
heapq_merge_key_duplicate.patch is a new patch with two code paths. It also updates the function’s docstring (which the previous patch did not). Raymond, do you think the speed is worth the DRY violation? |
|
|
msg152802 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2012-02-07 04:11 |
I'll look at this in the next couple of weeks. Hang tight :-) |
|
|
msg152984 - (view) |
Author: Terry J. Reedy (terry.reedy) *  |
Date: 2012-02-09 20:39 |
FWIW, Guido approves of the idea, in #4356 |
|
|
msg168070 - (view) |
Author: Simon Sapin (ssapin) |
Date: 2012-08-13 07:51 |
I just remembered about this. I suppose it is too late for 3.3? |
|
|
msg168116 - (view) |
Author: Éric Araujo (eric.araujo) *  |
Date: 2012-08-13 15:20 |
Yes, 3.3 is already in beta. |
|
|
msg185259 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2013-03-26 06:54 |
Attaching a rough draft implementation for a fully encapsulated Heap() class that is thread-safe, supports minheaps and maxheaps, and efficiently implements key-functions (called no more than once per key). |
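The attached draft is not shown in this thread. Purely as an illustration of the properties listed above (a lock around heap operations, min- or max-heap ordering, and a key function called once per item), a sketch might look like the following. Every name here (Heap, _Reversed, push, pop, pushpop, the maxheap flag) is an assumption, not the attached code.

```python
import heapq
import itertools
import threading


class _Reversed:
    """Wrapper that inverts ordering so heapq's min-heap behaves as a max-heap."""
    __slots__ = ("value",)

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        return other.value < self.value

    def __eq__(self, other):
        return self.value == other.value


class Heap:
    """Hypothetical sketch; not the class attached to this issue."""

    def __init__(self, iterable=(), *, key=None, maxheap=False):
        self._key = key
        self._maxheap = maxheap
        self._lock = threading.Lock()
        self._counter = itertools.count()  # tie-breaker; items are never compared
        self._data = [(self._sort_key(x), next(self._counter), x) for x in iterable]
        heapq.heapify(self._data)

    def _sort_key(self, item):
        # The key function runs exactly once per item, here.
        k = item if self._key is None else self._key(item)
        return _Reversed(k) if self._maxheap else k

    def push(self, item):
        k = self._sort_key(item)  # computed outside the lock
        with self._lock:
            heapq.heappush(self._data, (k, next(self._counter), item))

    def pop(self):
        with self._lock:
            return heapq.heappop(self._data)[2]

    def pushpop(self, item):
        k = self._sort_key(item)
        with self._lock:
            return heapq.heappushpop(self._data, (k, next(self._counter), item))[2]

    def __len__(self):
        with self._lock:
            return len(self._data)


longest_first = Heap(["aa", "b", "cccc"], key=len, maxheap=True)
print(longest_first.pop())  # cccc
```

Storing the computed key next to each item is what keeps the key function from being called again during sifting; the counter keeps entries with equal keys from falling back to comparing the items themselves.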
|
|
msg188066 - (view) |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2013-04-29 12:14 |
heap2.diff contains only a single line's change. Wrong file attached? |
|
|
msg188067 - (view) |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2013-04-29 12:14 |
Ah, I see the new file now (I'd failed to refresh my browser); sorry for the noise. |
|
|
msg188080 - (view) |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2013-04-29 17:51 |
Looks pretty good to me.
- There's a bonus print call in the diff.
- Should the "len(self._data)" call be protected by the lock? I can't immediately think of any reason why that would be necessary (e.g., pushpop and poppush never change the size of self._data, so there's no risk of getting a bogus length there), but the lack of the lock makes me nervous.
- Support for iter() seems a bit out of place to me. What are the use-cases for this? Would it make sense to leave this out (for now)? |
|
|
msg188085 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2013-04-29 18:55 |
There is already one heap class in the stdlib: queue.PriorityQueue. Why create a duplicate instead of extending queue.PriorityQueue with the desired features? Maybe name the maxheap parameter reverse? |
|
|
msg219380 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2014-05-30 09:28 |
New changeset f5521f5dec4a by Raymond Hettinger in branch 'default': Issue #13742: Add key and reverse parameters to heapq.merge() http://hg.python.org/cpython/rev/f5521f5dec4a |
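A short sketch of the reverse parameter added by this changeset, alone and combined with key, assuming Python 3.5 or later. The data is invented for illustration; with reverse=True each input must already be sorted from largest to smallest.

```python
import heapq
from operator import itemgetter

# Inputs sorted in descending order, as reverse=True requires.
descending_a = [9, 6, 1]
descending_b = [8, 7, 2]
print(list(heapq.merge(descending_a, descending_b, reverse=True)))
# [9, 8, 7, 6, 2, 1]

# key and reverse together: records sorted by score, highest first.
scores_a = [("x", 5), ("y", 3)]
scores_b = [("z", 4)]
by_score = list(heapq.merge(scores_a, scores_b, key=itemgetter(1), reverse=True))
print(by_score)  # [('x', 5), ('z', 4), ('y', 3)]
```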
|
|
msg233792 - (view) |
Author: Tommy Carstensen (Tommy.Carstensen) |
Date: 2015-01-10 01:19 |
I noticed 3.5 alpha1 is not released until February 1st. Is there any way I can get my hands on this new functionality? |
|
|
msg233793 - (view) |
Author: Berker Peksag (berker.peksag) *  |
Date: 2015-01-10 01:24 |
Hi Tommy, the patch is already committed to Python 3.5. See https://docs.python.org/3.5/library/heapq.html#heapq.merge |
|
|
msg233795 - (view) |
Author: Tommy Carstensen (Tommy.Carstensen) |
Date: 2015-01-10 01:55 |
Yes, but 3.5 has not been pre-released yet. |
|
|
msg233804 - (view) |
Author: Terry J. Reedy (terry.reedy) *  |
Date: 2015-01-10 08:47 |
You can set up Mercurial on your machine, make a read-only clone of the cpython repository, and compile it just as other people do, whether core developers or otherwise. See docs.python.org/devguide for details. |
|
|