Issue 22955: Pickling of methodcaller, attrgetter, and itemgetter

Created on 2014-11-27 06:33 by Antony.Lee, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description
pickle_getter_and_caller.patch josh.r, 2014-11-29 04:06
pickle_getter_and_caller2.patch josh.r, 2014-11-29 04:32
issue22955.diff zach.ware, 2014-11-29 22:05 josh.r's patch with itemgetter and attrgetter reimplementations
pickle_getter_and_caller3.patch serhiy.storchaka, 2014-12-14 16:49
pickle_getter_and_caller4.patch serhiy.storchaka, 2015-05-16 20:12
Messages (23)
msg231752 - (view) Author: Antony Lee (Antony.Lee) * Date: 2014-11-27 06:33
methodcaller and attrgetter objects seem to be picklable, but in fact the pickling is erroneous:
>>> import operator, pickle
>>> pickle.loads(pickle.dumps(operator.methodcaller("foo")))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: methodcaller needs at least one argument, the method name
>>> pickle.loads(pickle.dumps(operator.attrgetter("foo")))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: attrgetter expected 1 arguments, got 0
When looking at the pickle disassembly, it seems that the argument to the constructor is indeed not pickled.
>>> import pickletools; pickletools.dis(pickle.dumps(operator.methodcaller("foo")))
    0: \x80 PROTO      3
    2: c    GLOBAL     'operator methodcaller'
   25: q    BINPUT     0
   27: )    EMPTY_TUPLE
   28: \x81 NEWOBJ
   29: q    BINPUT     1
   31: .    STOP
highest protocol among opcodes = 2
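For comparison, here is a minimal sketch of the same inspection on an interpreter where this issue is resolved. It assumes Python 3.5 or later; the variable names are only illustrative.

```python
import io
import operator
import pickle
import pickletools

# Assumes Python 3.5+, where methodcaller has explicit pickling support.
data = pickle.dumps(operator.methodcaller("foo"))

# Disassemble into a string buffer instead of stdout so it can be searched.
buf = io.StringIO()
pickletools.dis(data, out=buf)
disassembly = buf.getvalue()

# The constructor argument now shows up in the stream; in the report above
# the stream only contained an empty argument tuple.
name_is_pickled = "foo" in disassembly

# Unpickling no longer raises TypeError on such an interpreter.
restored = pickle.loads(data)
```

Printing `disassembly` shows the opcodes in the same layout as the report above.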
msg231768 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-11-27 16:19
I think this issue needs different solutions for 3.5 and maintained releases. We can implement the pickling of methodcaller, attrgetter and itemgetter in 3.5 (I agree this is a good idea). And it would be good if pickling of these types raised an exception in maintained releases.
msg231831 - (view) Author: Zachary Ware (zach.ware) * (Python committer) Date: 2014-11-28 21:33
Note that pickling of the pure Python version of methodcaller works as expected:
Python 3.4.2 (default, Nov 20 2014, 12:40:10)
[GCC 4.8.3] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.modules['_operator'] = None
>>> import operator
>>> import pickle
>>> pickle.loads(pickle.dumps(operator.methodcaller('foo')))
<operator.methodcaller object at 0x7ff869945898>
The pure Python attrgetter and itemgetter don't work due to using functions defined in __init__(). 2.7 already raises TypeError on attempts to pickle any of the three.
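The point about functions defined in __init__() can be reproduced in isolation. The sketch below uses a hypothetical `make_getter` helper as a stand-in for such a nested function; it is not the code from operator.py.

```python
import pickle


def make_getter(name):
    # Hypothetical stand-in for a function defined inside __init__():
    # `func` exists only as a local object, not as a module-level name,
    # so pickle has nothing importable to refer to.
    def func(obj):
        return getattr(obj, name)
    return func


getter = make_getter("real")

try:
    pickle.dumps(getter)
    pickled_ok = True
except (pickle.PicklingError, AttributeError):
    # The exact exception type differs between Python versions.
    pickled_ok = False
```

The getter itself still works when called; only pickling it fails.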
msg231841 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2014-11-29 00:12
+1 for adding pickling support to Python 3.5. I don't see much of a need for any revision to 3.4.
msg231848 - (view) Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2014-11-29 02:57
I've made a patch that I believe should cover all three cases, including tests. In addition to the pickling behavior, I've made two other changes:
1. methodcaller verifies during construction that the name is a string (PyUnicode), and interns it; attrgetter did this already, and I tweaked methodcaller to match for correctness and performance reasons
2. I added proper repr functionality to all three objects. Partially this is just to make it look nicer, but it was also a decent way to spot verify that the pickle/unpickle sequence behaved correctly
Anyone care to review?
msg231849 - (view) Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2014-11-29 03:26
Don't bother reviewing just yet. There is an issue with attrgetter's pickling (which the unit tests caught), and I need to update the pure Python modules to match.
msg231850 - (view) Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2014-11-29 04:06
Okay, this one passes the tests for the built-in module. I'm not sure what's going wrong with the pure Python module. I'm getting the error: _pickle.PicklingError: Can't pickle <class 'operator.attrgetter'>: it's not the same object as operator.attrgetter once for each of the three objects. Anyone recognize this? Is this some weird artifact of the multiple imports required to test both pure Python and C versions of the module that I need to work around, or did I make a mistake somewhere else?
msg231851 - (view) Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2014-11-29 04:32
Ah, solved it (I think). The bootstrapper used to import the Python and C versions of the module leaves sys.modules unpopulated (does pickle itself populate it when it finds no module of that name?). I added a setUp method to the unittest class for operator that explicitly sets sys.modules['operator'] to whichever version is being tested at the time so pickle's lookup works as expected. Is that the right solution? New patch uploaded with that change.
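A rough sketch of the setUp approach described here, assuming a hypothetical test class with a `module` attribute naming the implementation under test. This is not the code from the patch, and the tearDown is an addition so that the entry in sys.modules is restored afterwards.

```python
import operator
import pickle
import sys
import unittest


class OperatorPickleSketch(unittest.TestCase):
    # Hypothetical test class; `module` stands for whichever implementation
    # (pure Python or C-accelerated) is being tested.
    module = operator

    def setUp(self):
        # Pickle stores classes by module name and resolves that name through
        # sys.modules, so the entry must be the module under test.
        self._saved = sys.modules.get("operator")
        sys.modules["operator"] = self.module

    def tearDown(self):
        # Restore the previous entry so later tests see the original module.
        if self._saved is None:
            sys.modules.pop("operator", None)
        else:
            sys.modules["operator"] = self._saved

    def test_itemgetter_roundtrip(self):
        getter = self.module.itemgetter(1)
        restored = pickle.loads(pickle.dumps(getter))
        self.assertEqual(restored("abc"), "b")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(OperatorPickleSketch)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```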
msg231872 - (view) Author: Zachary Ware (zach.ware) * (Python committer) Date: 2014-11-29 21:52
I'd prefer to just reimplement itemgetter and attrgetter to make them picklable rather than adding pickling methods to them; see attached patch. I also posted a few comments, but I just went ahead and addressed them myself in this patch. I'm not qualified to give the _operator.c changes a proper review, but they look good enough to me if others agree that __reduce__ is the best approach in C.
msg231873 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-11-29 22:10
operator.methodcaller is similar to functools.partial, which is pickleable and can be used as a sample. In the C implementation some code can be shared between the __repr__ and __reduce__ methods. As for tests, different protocols should be tested. Compatibility between the C and Python implementations should also be tested: instances pickled with one implementation should be unpickleable with the other. Move the pickle tests into a new test class. If __repr__ methods are added, they need tests. The restriction on the method name type should be tested too.
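The "different protocols" part of this request can be sketched as a loop over every protocol number. The sample callables and inputs are chosen only for illustration, the sketch assumes Python 3.5 or later, and it does not cover the cross-implementation check.

```python
import operator
import pickle

# (callable, sample input, expected result) triples chosen for illustration.
cases = [
    (operator.itemgetter(0, 2), "abc", ("a", "c")),
    (operator.attrgetter("real", "imag"), 3 + 4j, (3.0, 4.0)),
    (operator.methodcaller("split", ",", maxsplit=1), "a,b,c", ["a", "b,c"]),
]

failures = []
for proto in range(pickle.HIGHEST_PROTOCOL + 1):
    for func, arg, expected in cases:
        # Round-trip through this protocol and compare behavior, since the
        # objects themselves do not define equality.
        restored = pickle.loads(pickle.dumps(func, proto))
        if restored(arg) != expected:
            failures.append((proto, func))
```

Comparing the results of calls rather than the objects keeps the check independent of how each implementation stores its state.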
msg232363 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2014-12-09 10:35
> I'd prefer to just reimplement itemgetter and attrgetter to make
> them picklable rather than adding pickling methods to them;
> see attached patch.
That isn't the usual approach. The pickling methods are there for a reason. I prefer to leave the existing code in a stable state and avoid unnecessary code churn or risk introducing bugs into code that is working correctly and as designed.
msg232364 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2014-12-09 10:40
Please remember that a potential new pickling feature is the least important part of the design of methodcaller, itemgetter, and attrgetter. Pickle support should be driven by the design rather than become a predominant consideration. One other note: the OP's original concern has very little to do with these particular objects. Instead, it is the pickling and unpickling tools themselves that tend to have crummy error messages when presented with objects that weren't specially designed with pickle support.
msg232370 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-12-09 12:01
> Instead, it is the pickling and unpickling tools themselves that tend to have crummy error messages when presented with objects that weren't specially designed with pickle support.
See about this.
msg232512 - (view) Author: Zachary Ware (zach.ware) * (Python committer) Date: 2014-12-12 06:36
Serhiy: functools.partial is a somewhat less than ideal comparison. The pure-Python version is not picklable, and the Python and C versions return different things (the Python version is a function returning a function, the C version is a regular class and returns an instance). Also, both versions make their necessary attributes public anyway, unlike methodcaller.
Raymond: Not necessarily the usual approach, no. However, I think my reimplementations of the pure-Python itemgetter and attrgetter have a few benefits, namely:
- they're somewhat less complex and thus a bit easier to understand
- they're slightly faster
- they don't require extra pickling methods, which to me just seem like clutter when it's so simple to not need them
Note that I have no intention of reimplementing the C versions: those are much more mature than the Python versions, and would likely require pickling methods anyway. All that said, I'm not going to fight about it; if I'm overruled, I'm overruled.
Josh: Serhiy's points about needing more tests stand; would you like to add them? You can use your patch or mine as a base, depending on how you feel about reimplementing the pure-Python (item|attr)getter. If you use yours, please remember to look through my comments on it.
msg232616 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-12-13 17:54
> functools.partial is a somewhat less than ideal comparison. The pure-Python version is not picklable, the Python and C versions return different things (the Python version is a function returning a function, the C version is a regular class and returns an instance).
It looks like the Python version of functools.partial() needs a fix. Reimplementing the pure-Python itemgetter and attrgetter as automatically pickleable Python classes has a disadvantage: it makes the pickling incompatible between the Python and C versions. This means that an itemgetter pickled in CPython will not be unpickleable on a Python implementation that doesn't use the C accelerator, and vice versa.
msg232617 - (view) Author: Zachary Ware (zach.ware) * (Python committer) Date: 2014-12-13 17:58
Serhiy Storchaka added the comment:
> Reimplementations of the pure-Python itemgetter and attrgetter to
> automatically pickleable Python classes have a disadvantage. It makes
> the pickling incompatible between Python and C versions. This means
> that itemgetter pickled in CPython will be not unpickleable on Python
> implementation which don't use C accelerator and vice versa.
That's a very good point that I hadn't thought about. Consider my patch withdrawn.
msg232641 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-12-14 16:49
Here is a revised version of Josh's patch. I added tests for consistency between both implementations and fixed inconsistencies and bugs. I still hesitate about the pickling format of methodcaller. First, there is an asymmetry between positional and keyword arguments. Second, for now methodcaller is not inheritable, but if it becomes so in the future (as functools.partial is), it would be harder to extend the pickling format to support instance attributes.
msg243364 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-05-16 20:12
A methodcaller with keyword arguments pickled with pickle_getter_and_caller3.patch needs Python 3.5 to unpickle. The following patch pickles it in a backward-compatible way.
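One way to get a pickle that needs no new unpickling support is to bind the keyword arguments with functools.partial and pass the positional arguments separately, which is the (callable, args) shape a __reduce__ method returns. The sketch below only illustrates that idea with existing public objects; it is not taken from the patch, and the variable names are hypothetical.

```python
import functools
import operator
import pickle

# Keyword arguments are bound in a partial; positional arguments are kept
# apart, mirroring the (callable, args) pair that __reduce__ would return.
factory = functools.partial(operator.methodcaller, "split", maxsplit=1)
positional_args = (",",)

# The partial refers only to names that already exist in the operator and
# functools modules, so unpickling it needs no methodcaller-specific code.
restored_factory = pickle.loads(pickle.dumps(factory))

# Calling the restored partial with the positional arguments rebuilds
# an equivalent of methodcaller("split", ",", maxsplit=1).
rebuilt = restored_factory(*positional_args)
result = rebuilt("a,b,c")
```

The split between bound keywords and separately passed positionals is also where the asymmetry mentioned in the previous message comes from.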
msg243676 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-05-20 15:29
New changeset 435bc22f39e3 by Serhiy Storchaka in branch 'default': Issue #22955: attrgetter, itemgetter and methodcaller objects in the operator https://hg.python.org/cpython/rev/435bc22f39e3
msg243687 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-05-20 19:03
New changeset c93e5ba1cc20 by Serhiy Storchaka in branch 'default': Issue #22955: Fixed test_operator. It left Python implementation in https://hg.python.org/cpython/rev/c93e5ba1cc20
msg243746 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-05-21 11:20
New changeset 2688655e431a by Serhiy Storchaka in branch 'default': Issue #22955: Fixed reference leak in attrgetter.repr(). https://hg.python.org/cpython/rev/2688655e431a
msg265718 - (view) Author: Jason Curtis (jason.curtis) * Date: 2016-05-16 18:59
This is still an issue with operator.attrgetter in 3.4.3, even after clearing sys.modules['_operator']:
$ python3
Python 3.4.3 (default, Oct 14 2015, 20:28:29)
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.modules['_operator'] = None
>>> import operator
>>> import pickle
>>> pickle.loads(pickle.dumps(operator.attrgetter("foo")))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
_pickle.PicklingError: Can't pickle <function attrgetter.__init__.<locals>.func at 0x7f25728d5bf8>: attribute lookup func on operator failed
msg265727 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-05-16 19:28
This is a new feature in 3.5.
History
Date User Action Args
2022-04-11 14:58:10 admin set github: 67144
2016-05-16 19:28:48 serhiy.storchaka set messages: +; versions: - Python 3.4
2016-05-16 18:59:43 jason.curtis set nosy: + jason.curtis; messages: +; versions: + Python 3.4
2015-05-27 08:49:19 serhiy.storchaka set status: open -> closed; resolution: fixed; stage: patch review -> resolved
2015-05-21 11:20:05 python-dev set messages: +
2015-05-20 19:03:10 python-dev set messages: +
2015-05-20 15:29:47 python-dev set nosy: + python-dev; messages: +
2015-05-16 20:12:07 serhiy.storchaka set files: + pickle_getter_and_caller4.patch; messages: +
2014-12-14 16:49:30 serhiy.storchaka set files: + pickle_getter_and_caller3.patch; messages: +; stage: needs patch -> patch review
2014-12-13 17:58:13 zach.ware set messages: +
2014-12-13 17:54:41 serhiy.storchaka set messages: +
2014-12-12 06:36:09 zach.ware set messages: +
2014-12-09 12:01:46 serhiy.storchaka set messages: +
2014-12-09 10:40:38 rhettinger set messages: +
2014-12-09 10:35:27 rhettinger set messages: +
2014-11-29 22:10:38 serhiy.storchaka set messages: +
2014-11-29 22:05:20 zach.ware set files: - bad-issue22955.diff
2014-11-29 22:05:11 zach.ware set files: + issue22955.diff
2014-11-29 21:52:06 zach.ware set files: + bad-issue22955.diff; messages: +
2014-11-29 04:32:35 josh.r set files: + pickle_getter_and_caller2.patch; messages: +
2014-11-29 04:06:52 josh.r set files: - pickle_getter_and_caller.patch
2014-11-29 04:06:30 josh.r set files: + pickle_getter_and_caller.patch; messages: +
2014-11-29 03:26:12 josh.r set messages: +
2014-11-29 02:58:34 josh.r set versions: - Python 3.4
2014-11-29 02:57:36 josh.r set files: + pickle_getter_and_caller.patch; versions: + Python 3.4; nosy: + josh.r; messages: +; keywords: + patch
2014-11-29 00:12:43 rhettinger set nosy: + rhettinger; messages: +; versions: - Python 3.4
2014-11-28 21:33:51 zach.ware set nosy: + zach.ware; title: Pickling of methodcaller and attrgetter -> Pickling of methodcaller, attrgetter, and itemgetter; messages: +; versions: + Python 3.4
2014-11-27 19:31:46 serhiy.storchaka set assignee: serhiy.storchaka
2014-11-27 16:19:51 serhiy.storchaka set messages: +
2014-11-27 13:59:36 pitrou set nosy: + pitrou, serhiy.storchaka; type: enhancement; versions: + Python 3.5, - Python 3.4
2014-11-27 06:43:37 rhettinger set stage: needs patch
2014-11-27 06:33:03 Antony.Lee create