[Python-Dev] Accepting PEP 3154 for 3.4?
"Martin v. Löwis" martin at v.loewis.de
Wed Nov 20 00:56:13 CET 2013
On 19.11.13 23:50, Antoine Pitrou wrote:
> Ok, thanks. So now that I look at the patch I see the following problems with this idea:
> - "pickle + framing" becomes a different protocol than "pickle" alone, which means we lose the benefit of protocol autodetection. It's as though pickle.load() required you to give the protocol number, instead of inferring it from the pickle bytestream.
Not necessarily. Framing becomes a different protocol, yes. But autodetection would still be possible (it actually is possible in my proposed definition).
> - it is less efficient than framing built inside pickle, since it adds separate buffers and memory copies (while the point of framing is to make buffering more efficient).
Correct. However, if the intent is to reduce the number of system calls, then this is still achieved.
> Your idea is morally similar to saying "we don't need to optimize the size of pickles, since you can gzip them anyway".
Not really. In the case of gzip, properly saving bytes within pickle itself might well give an even larger size reduction. Here, the wire representation and the number of system calls are actually (nearly) identical.
> However, the fact that the pickle module currently goes to lengths to try to optimize buffering, implies to me that it's reasonable to also improve the pickle protocol so as to optimize buffering.
AFAICT, the real driving force is the desire not to read ahead past the end of the pickle. This is what complicates the code. The easiest (and most space-efficient) solution to that problem would be to prefix the entire pickle with a data size field (possibly in a variable-length representation), i.e. to make a single frame.
If that were done, I would guess that Tim's concerns about brittleness would go away (as you couldn't have a length field in the middle of data). IMO, the PEP has nearly the same flaw as HTTP chunked transfer encoding, which also puts length fields in the middle of the payload (except that HTTP makes it worse by making them optional).
Of course, a single length field has other drawbacks, such as having to pickle everything before sending out the first bytes.
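To make the single-frame idea concrete, here is a rough sketch of what a length-prefixed pickle could look like. It is only an illustration under stated assumptions, not the format from the PEP or from any patch: the helper names (encode_varint, read_varint, dump_prefixed, load_prefixed) and the 7-bits-per-byte length encoding are hypothetical choices made for this example.

```python
import io
import pickle


def encode_varint(n):
    """Encode a non-negative int, 7 bits per byte, low-order bits first.

    Hypothetical encoding chosen for this sketch only.
    """
    if n < 0:
        raise ValueError("length must be non-negative")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def read_varint(stream):
    """Read one variable-length integer written by encode_varint."""
    result = 0
    shift = 0
    while True:
        b = stream.read(1)
        if not b:
            raise EOFError("stream ended inside the length prefix")
        result |= (b[0] & 0x7F) << shift
        if not b[0] & 0x80:
            return result
        shift += 7


def dump_prefixed(obj, stream, protocol=2):
    # The drawback mentioned above: the whole pickle has to be built
    # in memory before the first byte can be written.
    payload = pickle.dumps(obj, protocol)
    stream.write(encode_varint(len(payload)) + payload)


def load_prefixed(stream):
    # The reader knows the exact size up front, so it never reads
    # past the end of the pickle.
    size = read_varint(stream)
    payload = stream.read(size)
    if len(payload) != size:
        raise EOFError("stream ended inside the pickle payload")
    return pickle.loads(payload)


# Two pickles back to back on the same stream.
stream = io.BytesIO()
dump_prefixed({"a": 1}, stream)
dump_prefixed([1, 2, 3], stream)
stream.seek(0)
first = load_prefixed(stream)
second = load_prefixed(stream)
```

The sketch assumes a stream whose read(n) returns n bytes unless it hits end of file (io.BytesIO, buffered files); an unbuffered socket would need a loop around the read.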
Regards, Martin