[Python-Dev] Accepting PEP 3154 for 3.4?
Antoine Pitrou solipsis at pitrou.net
Tue Nov 19 00:10:14 CET 2013
Ok, how about merging the two sub-threads :-)
On Mon, 18 Nov 2013 16:44:59 -0600 Tim Peters <tim.peters at gmail.com> wrote:
> [Antoine]
> > You can't know how much space the pickle will take until the pickling ends, though, which makes it difficult to decide whether you want to emit a PREFETCH opcode or not.
> Ah, of course. Presumably the outgoing pickle stream is first stored in some memory buffer, right? If pickling completes before the buffer is first flushed, then you know exactly how large the entire pickle is. If "it's small" (say, < 100 bytes), don't write out the PREFETCH part. Else do.
That's true. We could also have a SMALLPREFETCH opcode with a one-byte length to still get the benefits of prefetching.
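The two ideas above (skip the prefix for small pickles, or use a shorter prefix) can be sketched as follows. This is only an illustration under stated assumptions: the opcode byte values, the `frame` helper and the `allow_small_opcode` flag are hypothetical and are not taken from the PEP or the patch; only the "< 100 bytes" threshold and the one-byte versus 8-byte length fields come from the discussion.

```python
import struct

# Hypothetical opcode bytes, chosen only for this sketch; the thread assigns none.
PREFETCH = b"\xf0"       # followed by an 8-byte little-endian length
SMALLPREFETCH = b"\xf1"  # followed by a 1-byte length
SMALL_LIMIT = 100        # the "small" threshold suggested in the thread


def frame(payload: bytes, allow_small_opcode: bool = True) -> bytes:
    """Prefix an already-buffered pickle payload according to its size."""
    size = len(payload)
    if allow_small_opcode and size < 256:
        # One-byte length: small pickles still get a prefetch hint.
        return SMALLPREFETCH + struct.pack("<B", size) + payload
    if size < SMALL_LIMIT:
        # No small opcode available: emit tiny pickles without any prefix.
        return payload
    # Everything else carries the full 8-byte length.
    return PREFETCH + struct.pack("<Q", size) + payload
```

With `allow_small_opcode=False` the function follows the "don't write the PREFETCH part for small pickles" rule; with the default it follows the SMALLPREFETCH variant. Both depend on the whole payload being buffered before the first flush, which is the precondition stated above.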
> > Well, yes: much better memory usage for large pickles. Some people use pickles to store huge data, which was the motivation to add the 8-byte-size opcodes after all.
> We'd have the same advantage if it were feasible to know the entire size up front. I understand now that it's not feasible.
AFAICT, it would only be possible by doing two-pass pickling, which would also slow it down massively.
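To make the cost concrete, here is a minimal sketch of what two-pass pickling could look like. The `CountingSink` class, the `dump_with_size_prefix` helper and the bare 8-byte size header are assumptions made for illustration, not part of the proposed protocol.

```python
import io
import pickle
import struct


class CountingSink:
    """File-like object that discards data and only counts bytes written."""

    def __init__(self):
        self.size = 0

    def write(self, data):
        self.size += len(data)
        return len(data)


def dump_with_size_prefix(obj, out, protocol=3):
    """Write an 8-byte total size followed by the pickle of obj."""
    # Pass 1: pickle into a sink that keeps nothing, just to learn the size.
    sink = CountingSink()
    pickle.dump(obj, sink, protocol)
    # Pass 2: write the size, then pickle again straight to the output.
    out.write(struct.pack("<Q", sink.size))
    pickle.dump(obj, out, protocol)
    return sink.size


buf = io.BytesIO()
total = dump_with_size_prefix({"spam": [1, 2, 3]}, buf)
```

The object graph is traversed and serialized twice, so the pickling work roughly doubles. In exchange, the full pickle never has to be held in memory.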
> A long-running process can legitimately put billions of items on work queues, far more than could ever fit in RAM simultaneously. Comparing this to PyObject overhead makes no sense to me. Neither does the line of argument "there are several kinds of overheads, so making this overhead worse too doesn't matter".
Well, it's a question of cost / benefit: does it make sense to optimize something that will be dwarfed by other factors in real world situations?
> When possible, we should strive not to add overheads that don't repay their costs. For small pickles, an 8-byte size field doesn't appear to buy anything. But I appreciate that it costs implementation effort to avoid producing it in these cases.
I share the concern, although I still don't think the "ocean of tiny pickles" is a reasonable use case :-)
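For a rough sense of scale, the relative cost of a fixed-size header can be measured against the pickle it precedes. This sketch assumes a header of one opcode byte plus an 8-byte length; `HEADER_BYTES` and `relative_overhead` are hypothetical names used only here.

```python
import pickle

HEADER_BYTES = 1 + 8  # assumed: one opcode byte plus an 8-byte length


def relative_overhead(obj, protocol=3):
    """Return the header size as a fraction of the unframed pickle size."""
    payload = len(pickle.dumps(obj, protocol))
    return HEADER_BYTES / payload


tiny = relative_overhead(None)                # a pickle of only a few bytes
large = relative_overhead(list(range(1000)))  # a pickle of a few kilobytes
```

For the tiny object the header is larger than the pickle itself, while for the larger one it is well under one percent.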
That said, assuming you think this is important (do you?), we're left with the following constraints:
- it would be nice to have this PEP in 3.4
- 3.4 beta1 and feature freeze is in approximately one week
- switching to the PREFETCH scheme requires some non-trivial work on the current patch, work done by either Alexandre or me (but I already have pathlib (PEP 428) on my plate, so it'll have to be Alexandre) - unless you want to do it, of course?
What do you think?
Regards
Antoine.