Issue 11875: OrderedDict.reduce not threadsafe (original) (raw)

Created on 2011-04-19 08:28 by mjuric, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (10)
msg134023 - (view) Author: Mario Juric (mjuric) Date: 2011-04-19 08:28
The implementation of OrderedDict.__reduce__() in Python 2.7.1 is not thread safe because of the following four lines: tmp = self.__map, self.__root del self.__map, self.__root inst_dict = vars(self).copy() self.__map, self.__root = tmp If one thread is pickling an OrderedDict, while another accesses it, a race condition occurs if the accessing thread accesses the dict after self.__map and self.__root have been delated, and before they've been set again (above). This leads to an extremely difficult bug to diagnose when using multiprocessing.Queue to exchange OrderedDicts between multiple processes (because Queue uses a separate feeder thread to do the pickling). The fix seems relatively easy -- use: inst_dict = vars(self).copy() del inst_dict['_OrderedDict__map'], inst_dict['_OrderedDict__root'] instead of the above four lines. PS: This issue+fix may also apply to Python 3.x, although I haven't tested it there.
msg134027 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2011-04-19 09:34
To avoid hardcoding the mangled names: @@ -155,10 +155,9 @@ def __reduce__(self): 'Return state information for pickling' items = [[k, self[k]] for k in self] - tmp = self.__map, self.__root, self.__hardroot - del self.__map, self.__root, self.__hardroot inst_dict = vars(self).copy() - self.__map, self.__root, self.__hardroot = tmp + for k in vars(OrderedDict()): + inst_dict.pop(k, None) if inst_dict: return (self.__class__, (items,), inst_dict) return self.__class__, (items,)
msg134079 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2011-04-19 16:54
New changeset 3150b6939731 by Raymond Hettinger in branch '2.7': Issue 11875: Keep OrderedDict's __reduce__ from temporarily mutating the object. http://hg.python.org/cpython/rev/3150b6939731
msg134085 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2011-04-19 18:15
New changeset 50a89739836f by Raymond Hettinger in branch '3.1': Issue 11875: Keep OrderedDict's __reduce__ from temporarily mutating the object. http://hg.python.org/cpython/rev/50a89739836f New changeset a7ac7a7c8c78 by Raymond Hettinger in branch '3.2': Issue 11875: Keep OrderedDict's __reduce__ from temporarily mutating the object. http://hg.python.org/cpython/rev/a7ac7a7c8c78 New changeset 928f17923b0d by Raymond Hettinger in branch 'default': Issue 11875: Keep OrderedDict's __reduce__ from temporarily mutating the object. http://hg.python.org/cpython/rev/928f17923b0d New changeset 937385024601 by Raymond Hettinger in branch '3.2': Issue 11875: Keep OrderedDict's __reduce__ from temporarily mutating the object. http://hg.python.org/cpython/rev/937385024601
msg134089 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2011-04-19 19:08
Mario, I removed the unnecessary mutation, but remember that OrderedDict's are not thread-safe in general. Anything that updates an OrderedDict will temporarily leave it in an inconsistent state while the links and mappings are being updated. That being said, it isn't hard to create a subclass with a lock around each of the mutating methods: setitem, delitem, clear, popitem, pop, move_to_end, setdefault, init, and update.
msg134096 - (view) Author: Mario Juric (mjuric) Date: 2011-04-19 20:26
Hi Raymond, Excellent! Many thanks for such a quick fix, this has been bugging us for months!
msg134099 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2011-04-19 20:58
The call to self.__class__() can break subclasses of OrderedDict for two reasons: - The subclass constructor may have a different signature - Attributes set by the subclass.__init__ are removed from the pickle:: import collections, pickle class Mydict(collections.OrderedDict): def __init__(self, *args, name=None, **kwargs): super().__init__(*args, **kwargs) self.name = name a = Mydict(name='foo') b = pickle.loads(pickle.dumps(a)) print(a.name, b.name) Previously, the 'name' would be correctly copied, now it is reset to None.
msg134109 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2011-04-20 00:01
New changeset db66eaf353a6 by Raymond Hettinger in branch '2.7': Issue #11875: Alter the previous fix to work better with subclasses http://hg.python.org/cpython/rev/db66eaf353a6
msg134110 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2011-04-20 00:19
New changeset 408f23b6cec5 by Raymond Hettinger in branch '3.1': Issue #11875: Alter the previous fix to work better with subclasses http://hg.python.org/cpython/rev/408f23b6cec5 New changeset 4e6840477d96 by Raymond Hettinger in branch '3.2': Issue #11875: Alter the previous fix to work better with subclasses http://hg.python.org/cpython/rev/4e6840477d96 New changeset 64968d226b61 by Raymond Hettinger in branch 'default': Issue #11875: Alter the previous fix to work better with subclasses http://hg.python.org/cpython/rev/64968d226b61
msg134125 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2011-04-20 06:39
Mario, thanks for the bug report.
History
Date User Action Args
2022-04-11 14:57:16 admin set github: 56084
2011-04-20 06:39:51 rhettinger set status: open -> closedmessages: +
2011-04-20 00:19:26 python-dev set messages: +
2011-04-20 00:01:12 python-dev set messages: +
2011-04-19 21:08:45 daniel.urban set nosy: + daniel.urban
2011-04-19 20:58:09 amaury.forgeotdarc set status: closed -> opennosy: + amaury.forgeotdarcmessages: +
2011-04-19 20:26:37 mjuric set messages: +
2011-04-19 19:08:53 rhettinger set status: open -> closedresolution: fixedmessages: +
2011-04-19 18:15:34 python-dev set messages: +
2011-04-19 16:55:00 python-dev set nosy: + python-devmessages: +
2011-04-19 09:34:04 rhettinger set messages: +
2011-04-19 08:46:24 rhettinger set assignee: rhettingernosy: + rhettinger
2011-04-19 08:28:24 mjuric create