[Python-Dev] Accepting PEP 3154 for 3.4?

Tim Peters tim.peters at gmail.com
Wed Nov 20 06:18:09 CET 2013


[Martin v. Löwis]

> ... AFAICT, the real driving force is the desire to not read-ahead more than the pickle is long. This is what complicates the code. The easiest (and most space-efficient) solution to that problem would be to prefix the entire pickle with a data size field (possibly in a variable-length representation), i.e. to make a single frame.

In a bout of giddy optimism, I suggested that earlier in the thread. It would be sweet :-)
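
To make the single-frame idea concrete, here is a minimal sketch. It is not the PEP 3154 framing format: the fixed 8-byte little-endian size field and the names `dump_single_frame` and `load_single_frame` are assumptions made up for illustration (the suggestion above also allows a variable-length size). The point is only that the reader knows exactly how many bytes belong to the pickle, so it never reads ahead past the end.

```python
import io
import pickle
import struct

# Assumed header layout: one unsigned 64-bit little-endian length.
HEADER = struct.Struct("<Q")

def dump_single_frame(obj, stream, protocol=2):
    """Write a size prefix followed by the complete pickle."""
    # The whole pickle has to exist before its length is known.
    payload = pickle.dumps(obj, protocol)
    stream.write(HEADER.pack(len(payload)))
    stream.write(payload)

def load_single_frame(stream):
    """Read exactly one prefixed pickle and nothing beyond it."""
    header = stream.read(HEADER.size)
    if len(header) != HEADER.size:
        raise EOFError("truncated size field")
    (size,) = HEADER.unpack(header)
    payload = stream.read(size)
    if len(payload) != size:
        raise EOFError("truncated pickle payload")
    return pickle.loads(payload)

# Two pickles back to back, followed by unrelated bytes.
buf = io.BytesIO()
dump_single_frame({"spam": [1, 2, 3]}, buf)
dump_single_frame("eggs", buf)
buf.write(b"trailing")
buf.seek(0)
first = load_single_frame(buf)
second = load_single_frame(buf)
rest = buf.read()  # the unrelated bytes are still unread at this point
```

Note that `dump_single_frame` builds the entire payload in memory before writing anything.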

> If that was done, I would guess that Tim's concerns about brittleness would go away (as you couldn't have a length field in the middle of data). IMO, the PEP has nearly the same flaw as the HTTP chunked transfer, which also puts length fields in the middle of the payload (except that HTTP makes it worse by making them optional).

> Of course, a single length field has other drawbacks, such as having to pickle everything before sending out the first bytes.

And that's the killer. Pickle strings are generally produced incrementally, in smallish pieces. But that may go on for very many iterations, and there's no way to guess the final size in advance. I only see three ways to do it:

  1. Hope the whole string fits in RAM.
  2. Pickle twice, the first time just to get the final size (& throw the pickle pieces away on the first pass while summing their sizes).
  3. Flush the pickle string to disk periodically, then after it's done read it up and copy it to the intended output stream.

All of those really suck :-(
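
For illustration, option 2 might look roughly like the sketch below. The names `CountingWriter`, `pickled_size` and `dump_two_pass` are hypothetical, as is the 8-byte little-endian size field. It assumes the object does not change between the two passes and that pickling it twice yields the same bytes. Whether the pieces really are thrown away as they are produced depends on how the pickler buffers its output, so treat the memory behaviour as an assumption too.

```python
import io
import pickle
import struct

class CountingWriter:
    """File-like sink that discards the data and only sums the sizes."""

    def __init__(self):
        self.size = 0

    def write(self, data):
        self.size += len(data)
        return len(data)

def pickled_size(obj, protocol=2):
    """First pass: pickle into a sink that keeps nothing but a byte count."""
    counter = CountingWriter()
    pickle.Pickler(counter, protocol).dump(obj)
    return counter.size

def dump_two_pass(obj, stream, protocol=2):
    """Write the size found in pass one, then pickle again to the stream."""
    size = pickled_size(obj, protocol)
    stream.write(struct.pack("<Q", size))  # assumed 8-byte size field
    pickle.Pickler(stream, protocol).dump(obj)

obj = {"a": list(range(100)), "b": "some text"}
out = io.BytesIO()
dump_two_pass(obj, out)
data = out.getvalue()
(declared,) = struct.unpack("<Q", data[:8])
```

The price is the one named in the list: every object gets pickled twice.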

BTW, I'm not a web guy: in what way is HTTP chunked transfer mode viewed as being flawed? Everything I ever read about it seemed to think it was A Good Idea.


