Make max heap functions public in heapq (original) (raw)
The heapq module contains some private max-heap variants of its heap functions: _heapify_max
, _heappop_max
, _heapreplace_max
. This exist to support the higher-level functions like merge()
. I’d like the _max
variants to be made public (remove the underscore prefix), and documented. This will make it easy for users to create and manipulate max heaps as well.
The current public heapq functions only work with min-heaps. Using them to create a max-heap requires either storing inverted values (i.e. -x
, which only works for numbers), or using some kind of wrapper class to invert comparisons.
Several such solutions are discussed in the 12 year old Stack Overflow question What do I use for a max-heap implementation in Python?. The second most popular solution is to use the private *_max
variant functions, which I would like to make public.
(inspired by looking at the MinHeap problem on @trey 's Python Morsels)
guido (Guido van Rossum) June 30, 2022, 4:10pm 2
@rhettinger What do you think? Looks like a sensible request. If there’s a good reason not to do this it would be nice to have it written up for posterity.
storchaka (Serhiy Storchaka) July 1, 2022, 5:07am 3
It was proposed and rejected several times.
Two of those (non-)discussions were just dismissed on the basis that it has been rejected in the past. The first request was rejected on the basis that people haven’t needed it, but if you do, just negate the values.
If it is as simple as just negating your (numeric only) values, then why do the heapq have max-heap code for internal use? Why doesn’t it just negate the values?
The issue keeps coming up on the bug tracker, and on Stackoverflow:
and on PyPI, blog posts and more:
https://www.bing.com/search?q=python+maxheap
If the max heap code didn’t already exist, I could understand the reluctance to add it, but it does exist and it is already in use.
CAM-Gerlach (C.A.M. Gerlach) July 2, 2022, 2:35pm 5
Indeed. I note it was mentioned by both the reporter and the responding developer on subsequent reports that this was brought up many times in the past and rejected, but the rationale given on the very first issue for rejecting it was that there hadn’t been “significant demonstrated need”, to which the original user did reply but was never responded to.
Certainly, there may be a compelling reason for rejecting this, but if so it should at least be stated and publicly documented, given the amount of interest and requests.
guido (Guido van Rossum) July 2, 2022, 7:10pm 6
The next steps are to create an issue and a PR. Please link to both here.
rhettinger (Raymond Hettinger) July 9, 2022, 6:32am 7
We can do this. Much of the code is already there.
Mostly when this has come up before, it was usually a curiosity question, “I see a misheap, why isn’t there a maxheap?”. There wasn’t much in the way of actual needs other than being given this as a homework problem or coding challenge. That said, I’ve had some use cases and would like it if the functionality were exposed.
I’ll work on it soonish.
coopers (Chris Cooper) September 29, 2023, 12:58am 8
Does anyone want to see a max heap version of heappush?
def _heappush_max(heap, item):
"""Maxheap version of a heappush."""
heap.append(item)
_siftdown_max(heap, 0, len(heap)-1)
rhettinger (Raymond Hettinger) September 29, 2023, 2:39am 9
I have a PR in process for making public maxheap functions across the board. It will go in for the next version of Python. It’s too late for 3.12.
tczajka (Tomek Czajka) February 11, 2024, 12:27pm 10
Instead of creating a max version, wouldn’t it make more sense to add a reverse=
and key=
optional arguments, like you have with sort
? This would allow you to have heaps based on any ordering you want.
blhsing (Ben Hsing) February 20, 2024, 4:26am 11
A good use case for a max heap is mentioned in heapq
’s documentation itself:
As far as I know, in-place sorting with a max heap is the only way to sort an arbitrary list with O(n log n) time complexity and O(1) space complexity.
I recently posted a fairly lengthy answer to an old StackOverflow question with all the hoop-jumping workarounds only because the max heap functions aren’t exposed as public functions from heapq
:
But on second thought, if in-place sorting is the only good use case for a max heap, then maybe instead of making max heap functions public we can make in-place sorting itself a public function in heapq
.
EklipZgit (Travis Drake) March 28, 2024, 8:51pm 12
Hey, I just want to point out that this seems like a big gap. “Just invert the values negatively” isn’t a great answer when you’re sorting complex objects with custom comparitors. What, I’m just supposed to invert my gt and lt methods in an expensive wrapper class? What if I’m trying to use the objects I put in the heap for other purposes? Other algorithms use heaps. What if I’m writing an A*-like heuristic function (actually, I have a lot of these, hundreds and hundreds of lines of code of tens of implementations of variations on these for game AI) and need to maximize a heuristic? I have to take my heuristic val and invert it before putting it in the heap, resulting in a ton of overly complex (and computationally wasteful) conversion code, just to get the value back out and invert it again before doing all the other value comparisons that need to be done. I’m surprised more people aren’t complaining, I guess not that many people write complex heuristic searches in python?
I got sick of the overly complex and obtuse negation in my value functions (over 2x the amount of code is necessary because of missing max-heap, I need a heuristic func that produces the actual values for other things to use, and then a separate heuristic func that chains the output from the value func and inverts each property individually for the heap-based search), and started switching to use heapq_max · PyPI instead for much, much cleaner code for my (many) use cases where the search heuristic and value heuristic are the same, but it is so old (or inefficiently compiled?) that the performance is about 100% worse than the min-heap in heapq. I’ve tried heap-class · PyPI as well but its performance is even worse than that, for both min and max heaps.
I’m using python because I dont want to write C (nor figure out how to get my C compiled as fast as heapq is, as well as figuring out all the interop stuff…)
Please, just give us high performance max-heap methods from heapq it seems so strange for such an important core data structure from core algorithms to be completely missing.
dumbpotato (Shashwat) August 1, 2024, 1:33pm 13
So has there been progress on this? Tried searching for the PR but only found this which was closed by @rhettinger saying he’ll raise another PR.
Also, perhaps I missed the link in the thread, but could someone link me to the Github issue for this(if it exists). Thanks!
prashanthbanda (Prashanth Banda) October 16, 2024, 6:50pm 14
Python 3.13 is released but there is no mention of max heap in it.