[Python-Dev] PEP 509: Add a private version to dict (version 3) (original) (raw)

Victor Stinner victor.stinner at gmail.com
Wed Apr 20 19:34:55 EDT 2016


Hi,

Guido van Rossum and Jim J. Jewett suggested me to not require to always increase the dict version if a dict method does not modify its content. I modified the Changes section to only require that the version is increased when the dictionary content is modified.

I also explained the nice side effect of having an unique identifier for two empty dictionaries: it avoids a strong reference when checking if a namespace (dictionary) was replaced. Yury Selivanov's opcode cache uses this property to avoid a strong refence on builtin and global namespaces. It's also a written reply to Armin Rigo's suggestion to use the version 0 for empty dictionaries (new empty dict or for dict.clear()).

The modified Changes section:

Changes

Add a ma_version_tag field to the PyDictObject structure with the C type PY_UINT64_T, 64-bit unsigned integer. Add also a global dictionary version.

Each time a dictionary is created, the global version is incremented and the dictionary version is initialized to the global version.

Each time the dictionary content is modified, the global version must be incremented and copied to the dictionary version. Dictionary methods which can modify its content:

The choice of increasing or not the version when a dictionary method does not change its content is left to the Python implementation. A Python implementation can decide to not increase the version to avoid dictionary lookups in guards. Examples of cases when dictionary methods don't modify its content:

Setting a key to a new value equals to the old value is also considered as an operation modifying the dictionary content.

Two different empty dictionaries must have a different version to be able to identify a dictionary just by its version. It allows to verify in a guard that a namespace was not replaced without storing a strong reference to the dictionary. Using a borrowed reference does not work: if the old dictionary is destroyed, it is possible that a new dictionary is allocated at the same memory address. By the way, dictionaries don't support weak references.

The version increase must be atomic. In CPython, the Global Interpreter Lock (GIL) already protects dict methods to make changes atomic.

Example using an hypothetical dict_get_version(dict) function::

>>> d = {}
>>> dict_get_version(d)
100
>>> d['key'] = 'value'
>>> dict_get_version(d)
101
>>> d['key'] = 'new value'
>>> dict_get_version(d)
102
>>> del d['key']
>>> dict_get_version(d)
103

The field is called ma_version_tag, rather than ma_version, to suggest to compare it using version_tag == old_version_tag, rather than version <= old_version which becomes wrong after an integer overflow.

-- Victor



More information about the Python-Dev mailing list