Issue 4680: deque class should include high-water mark (original) (raw)

Created on 2008-12-17 03:06 by roysmith, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (9)
msg77955 - (view) Author: Roy Smith (roysmith) Date: 2008-12-17 03:06
It would be nice if Queue.Queue included a way to access the high-water mark, i.e. the largest value which qsize() has ever reached. This is often useful when assessing application performance. I am assuming this is cheap, i.e. O(1), to provide.
msg77976 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-12-17 17:47
It is probably easy enough to do so in a custom Queue subclass, and it's too specialized to put in the standard library IMHO. I'm closing the bug, another developer can reopen it if he thinks it is truely useful.
msg77979 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-12-17 17:59
I concur with Antoine.
msg77981 - (view) Author: Roy Smith (roysmith) Date: 2008-12-17 19:49
I'm suppose you could implement this in a subclass, but it would be inefficient. You'd have to over-ride put() and get(), call qsize(), then delegate to Base.put() and Base.get(). A cleaner solution would be in the C implementation of deque, in Modules/collectionsmodule.c. There's just a couple of places where queue->len gets incremented. All that needs to happen is add: if (queue->len > queue->high_water_mark) { queue->high_water_mark = queue->len; } after each one and then add the appropriate accessor functions in deque and Queue to expose it to users. The run-time cost is a couple of machine instructions for each item added to the deque. If I were to write the code and submit it, would you be willing to accept it?
msg77986 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-12-17 21:33
Will think about it for a bit. My initial inclination is against because there have been no other such requests to instrument all the fundamental data types, because it's not hard to add instrumentation using your own pure python additions around size-mutating calls, because it clutters the API, and because it's unclear whether a maximum-size statistic has much utility. For deques, should the clear() method zero-out the high-water mark? How about the equivalent code where the contents are all popped-off (down to zero size) before the structure is re-used?
msg77989 - (view) Author: Roy Smith (roysmith) Date: 2008-12-17 22:05
I'm not actually sure what the use case is for clear(). It's easy enough to just create a new deque. If you can do that, why do you need clear()? Since I don't ever see a reason anybody would want to call clear(), I'm not 100% if it should reset the high-water mark or not. I don't *think* it should, but I'm willing to believe that a use case could exist which would make me change my mind about that. Popping all the elements off the deque certainly should *not* reset the high water mark. The whole point of tracking the high water mark is to know if the consumer thread ever fell behind the producer thread (yes, I realize queues could be used for non-threading purposes). It's perfectly normal for the queue to drain completely at times, and there's absolutely no reason this should affect the high-water mark. You do raise a good question about whether all of the standard containers should be instrumented. I don't know the answer to that. Queues and, say, dicts are different beasts. It's common to use queues where the putting-in and taking-out are asynchronous, and thus the high-water mark is an indicator of overall application health. I don't see that for dicts, or lists (well, maybe if used as a stack) or most other kinds of containers.
msg77998 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-12-17 23:53
FWIW, here's a trivial Queue subclass with instrumentation: class HighWaterQueue(Queue): def _init(self, maxsize): Queue.__init__(self, maxsize) self.highwater = 0 def _put(self, item): Queue._put(self, item) self.highwater = max(self.highwater, self._qsize())
msg78008 - (view) Author: Roy Smith (roysmith) Date: 2008-12-18 01:53
And, FWIW, I did figure out a use case for clear(). I create a queue and pass it to two threads. One side or the other decides to abandon processing of the events currently in the queue. I can't just create a new queue, because you have no way to tell the other thread about it. You need to have clear() to do this. And, no, it should not clear the high water mark. As I see it, it comes down to this: If you bury this in the C code inside deque(), it's very efficient compared to the Python wrapper class. The downside is it makes the API larger than it would otherwise be, to satisfy a use case with limited demand. If you feel the efficiency gain doesn't justify the added complexity in the API, I'm OK with that. I just didn't want this shot down on the basis of, "He's asking us to invest the effort to write the code for something we don't see a need for", hence the offer to write it myself. But, it's your call if you want it or not.
msg78009 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2008-12-18 02:04
There's no need to keep asking -- this report was already rejected ;-) Seriously, the efficiency argument carries no weight with me -- in 15 years of using Queue in a variety of applications, the time it takes to .put and .get "here's a piece of work to do" descriptors is trivial compared to the cost of /doing/ the "piece of work". A Python-coded subclass would have been plenty fast enough in any of those apps (assuming I ever wanted to track a highwater mark, which to date I never have).
History
Date User Action Args
2022-04-11 14:56:42 admin set github: 48930
2008-12-18 02:04:40 tim.peters set nosy: + tim.petersmessages: +
2008-12-18 01:53:09 roysmith set messages: +
2008-12-17 23:53:33 rhettinger set messages: +
2008-12-17 22:05:45 roysmith set messages: +
2008-12-17 21:33:42 rhettinger set assignee: rhettingermessages: + title: Queue class should include high-water mark -> deque class should include high-water mark
2008-12-17 19:49:51 roysmith set messages: +
2008-12-17 17:59:12 rhettinger set nosy: + rhettingermessages: +
2008-12-17 17:47:59 pitrou set status: open -> closedresolution: rejectedmessages: + nosy: + pitrou
2008-12-17 06:04:25 loewis set versions: + Python 2.7, - Python 2.5.3
2008-12-17 03:29:38 LambertDW set nosy: + LambertDW
2008-12-17 03:06:39 roysmith create