Issue 14206: multiprocessing.Queue documentation is lacking important details
If cancel_join_thread() is called, data may be lost. This is not explicitly stated. I had multiple writers put() data in a Queue, and wanted to have the workers finish before I began consuming the data. This caused a deadlock because my Queue was not empty, and it seemed like the way to force my workers to finish was to use cancel_join_thread(). This caused data loss.
multiprocessing.Queue states "The Queue class is a near clone of Queue.Queue."
Queue.Queue states "If maxsize is less than or equal to zero, the queue size is infinite."
mp.Queue provides no information on queue size. It is reasonable to assume then that it inherits the property of Queue.Queue.
After discussion on IRC, it seems that mp.Queue maximum size is implementation-dependent and likely relies on how much data Pipes can hold on your platform. If this is the case there should be some mention of the fact that mp.Queue does NOT function like Queue.Queue does for maximum size.
What you were told on IRC was wrong. By default the queue does have infinite size.
When a process puts an item on the queue for the first time, a background thread is started which is responsible for writing items to the underlying pipe. This does mean that, on exit, the process should wait for the background thread to flush all the data to the pipe. This happens automatically unless you specifically prevent it by calling the cancel_join_thread() method.
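As a rough illustration of the buffering described above, the sketch below fills a default mp.Queue with more data than a typical OS pipe can hold before anything reads from it. The function name fill_then_drain and the item counts are made up for this example. put() should return without blocking because items go into an in-process buffer first; the background thread moves them to the pipe as space becomes available.

```python
import multiprocessing as mp


def fill_then_drain(n_items=2000, item_size=1024):
    """Put n_items payloads on a default mp.Queue, then read them all back.

    Hypothetical helper for illustration only.
    """
    q = mp.Queue()  # no maxsize given, so the queue is unbounded
    payload = b"x" * item_size

    # Nothing is consuming yet. put() only appends to an internal buffer;
    # the background (feeder) thread writes to the pipe and is the only
    # thing that waits when the pipe is full.
    for _ in range(n_items):
        q.put(payload)

    # Draining the queue lets the feeder thread finish flushing, so the
    # process can exit without waiting on unread data.
    received = 0
    for _ in range(n_items):
        q.get(timeout=10)
        received += 1
    return received


if __name__ == "__main__":
    print(fill_then_drain())
```

With the defaults this queues about 2 MB, which is assumed to exceed the pipe capacity on common platforms; the exact capacity is platform-dependent.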
If you stick to those methods supported by standard queue objects, then things should work correctly.
(Maybe cancel_join_thread() would be better named allow_exit_without_flush().)
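For the deadlock described at the top of this issue, a minimal sketch of the usual ordering is shown below: read everything from the queue first, then join the workers. It uses only put() and get(). The names worker, collect and run, and the worker and item counts, are invented for this example; it also assumes the consumer knows how many items to expect.

```python
import multiprocessing as mp


def worker(q, worker_id, n_items):
    # Each writer puts its items; on exit the process waits for its
    # feeder thread to flush them to the pipe.
    for i in range(n_items):
        q.put((worker_id, i))


def collect(q, expected):
    # Read exactly `expected` items; the timeout avoids hanging forever
    # if a writer died before producing everything.
    return [q.get(timeout=10) for _ in range(expected)]


def run(n_workers=4, n_items=1000):
    q = mp.Queue()
    procs = [
        mp.Process(target=worker, args=(q, w, n_items))
        for w in range(n_workers)
    ]
    for p in procs:
        p.start()

    # Consume BEFORE joining. Joining first can deadlock: a worker cannot
    # exit until its buffered items have been written to the pipe, and the
    # pipe only drains when someone calls get().
    results = collect(q, n_workers * n_items)

    for p in procs:
        p.join()
    return results


if __name__ == "__main__":
    print(len(run()))
```

Because nothing calls cancel_join_thread(), no data is discarded, and because the queue is empty by the time join() is called, the workers are free to exit.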