[Python-3000] bytes and dicts (was: PEP 3137: Immutable Bytes and Mutable Buffer)

Jim Jewett jimjjewett at gmail.com
Fri Sep 28 20:33:04 CEST 2007


On 9/28/07, Guido van Rossum <guido at python.org> wrote:

Well, maybe this is a good enough argument to give up.

Not quite yet... I still see two potential solutions, depending on whether or not the exclusion is sticky. Details below.

=========

If the exclusion is sticky, then add (implicit) flags saying "seen a string" and "seen a byte". Similar logic is already there, in that "seen a non-string" replaces the lookdict function.

The most common case (exact unicode in an exact unicode-only dict) would stay the same as today, but the other cases would have some extra type-checking.
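To make the sticky variant concrete, here is a rough Python model of the two flags. It is only a sketch: the real change would live in the C dict implementation, and the class name `StickyGuardDict` and the attributes `seen_str` and `seen_bytes` are made up for illustration. Only item assignment is modelled; `update()` and `setdefault()` are not covered.

```python
class StickyGuardDict(dict):
    """Toy model: once a str key has been seen, bytes keys are refused
    for the life of the dict, and vice versa (hypothetical names)."""

    def __init__(self):
        super().__init__()
        self.seen_str = False    # set on the first str key, never cleared
        self.seen_bytes = False  # set on the first bytes key, never cleared

    def __setitem__(self, key, value):
        if isinstance(key, str):
            if self.seen_bytes:
                raise TypeError("dict has held bytes keys; str keys are excluded")
            self.seen_str = True
        elif isinstance(key, bytes):
            if self.seen_str:
                raise TypeError("dict has held str keys; bytes keys are excluded")
            self.seen_bytes = True
        # Keys of any other type pass through without touching the flags.
        super().__setitem__(key, value)
```

Because the flags are never cleared, deleting the last string key does not reopen the dict to bytes keys; that is what "sticky" means here.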

=========

If the exclusion is based on current contents, then we can add a count; my concern is that keeping this efficient may be too hacky.

It looks like there is room for exactly one more pointer-sized field (the count variable) before small dicts bleed into a third cache line. Because of the exclusion guard, bytes and strings can never appear in the same dict, so at least one of the two counts is always zero and a single variable can serve for both. Because dict entries are 3 pointers long, there can never be more than (maximum Py_ssize_t / 2) entries, so the sign bit can be repurposed to indicate whether the count refers to strings or bytes. (count==0 means no bytes or strings; count==5 means 5 string keys; count==-32 means 32 bytes keys.)
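The signed-count encoding can be modelled the same way. Again this is a sketch in Python rather than the C it would really be written in; `CountingGuardDict` and its `count` attribute are hypothetical names, and only item assignment and deletion are modelled.

```python
class CountingGuardDict(dict):
    """Toy model of the signed count (hypothetical names):
    count > 0 is the number of str keys, count < 0 is minus the
    number of bytes keys, count == 0 means neither kind is present."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def __setitem__(self, key, value):
        if isinstance(key, str):
            if self.count < 0:
                raise TypeError("dict currently holds bytes keys")
            if key not in self:      # only new keys change the count
                self.count += 1
        elif isinstance(key, bytes):
            if self.count > 0:
                raise TypeError("dict currently holds str keys")
            if key not in self:
                self.count -= 1
        super().__setitem__(key, value)

    def __delitem__(self, key):
        super().__delitem__(key)     # raises KeyError before count changes
        if isinstance(key, str):
            self.count -= 1
        elif isinstance(key, bytes):
            self.count += 1
```

Unlike the sticky flags, the exclusion here follows the current contents: once every string key has been deleted the count returns to zero and bytes keys are accepted again.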

-jJ


