[Python-Dev] Proposal: dict.with_values(iterable)
Mark Shannon mark at hotpy.org
Tue Apr 23 17:17:02 EDT 2019
- Previous message (by thread): [Python-Dev] Proposal: dict.with_values(iterable)
- Next message (by thread): [Python-Dev] Proposal: dict.with_values(iterable)
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hi,
On 12/04/2019 2:44 pm, Inada Naoki wrote:
> Hi, all.
> I propose adding a new method: dict.with_values(iterable)
You can already do something like this, if memory saving is the main concern. This should work on all versions from 3.3.
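    def shared_keys_dict_maker(keys):
        class C:
            pass

        # Setting attributes on an instance fills its __dict__, which
        # CPython can store as a key-sharing dict (PEP 412).
        instance = C()
        for key in keys:
            setattr(instance, key, None)
        prototype = instance.__dict__

        def maker(values):
            # Copy the prototype, then overwrite the placeholder values.
            result = prototype.copy()
            result.update(zip(keys, values))
            return result

        return maker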
    >>> import sys
    >>> m = shared_keys_dict_maker(('a', 'b'))

    >>> d1 = {'a':1, 'b':2}
    >>> print(sys.getsizeof(d1))
    248

    >>> d2 = m((1,2))
    >>> print(sys.getsizeof(d2))
    120

    >>> d3 = m((None,"Hi"))
    >>> print(sys.getsizeof(d3))
    120
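
One caveat with the maker above: `zip()` stops at the shorter input, so a short `values` sequence silently leaves the remaining keys set to `None`. A minimal sketch of a stricter variant follows. The length check and the name `make_row` are illustrative additions, not part of the original snippet, and whether the copies actually share keys depends on the CPython version.

```python
def shared_keys_dict_maker(keys):
    # Freeze the keys so an iterator argument can be reused on every call.
    # Keys must be strings, because setattr() only accepts string names.
    keys = tuple(keys)

    class C:
        pass

    instance = C()
    for key in keys:
        setattr(instance, key, None)
    prototype = instance.__dict__

    def maker(values):
        values = tuple(values)
        # Assumption: reject mismatched lengths instead of truncating.
        if len(values) != len(keys):
            raise ValueError(
                f"expected {len(keys)} values, got {len(values)}"
            )
        result = prototype.copy()
        result.update(zip(keys, values))
        return result

    return maker


make_row = shared_keys_dict_maker(("a", "b"))
first = make_row((1, 2))
second = make_row((None, "Hi"))

# Each call returns an independent dict: mutating one leaves the other alone.
first["a"] = 99
```

The returned dicts are ordinary `dict` objects, so they can be passed anywhere a plain dict is expected.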
> # Motivation
>
> Python is used to handle data. While dict is not an efficient way to
> handle many records, it is still a convenient one.
>
> When creating many dicts with the same keys, dict needs to look up its
> internal hash table while inserting each key. This is a costly operation.
> If we can reuse the existing keys of a dict, we can skip this insertion
> cost.
>
> Additionally, we have the "Key-Sharing Dictionary" (PEP 412). When all
> keys are strings, many dicts can share one set of keys, which reduces
> memory consumption.
>
> This might be usable for:
>
> * csv.DictReader
> * namedtuple._asdict()
> * DB-API 2.0 implementations (e.g. DictCursor of mysqlclient-python)
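
For context, the pattern the motivation describes looks roughly like the sketch below: one header row supplies the keys, and every record is built with `dict(zip(...))`, which inserts each key again for each row. The sample data and the names `fieldnames` and `records` are made up for illustration.

```python
import csv
import io

# Hypothetical sample input; any CSV source with a header row would do.
data = io.StringIO("name,qty\napple,3\npear,5\n")

reader = csv.reader(data)
fieldnames = next(reader)  # the keys shared by every record

# Every iteration hashes and inserts the same keys into a fresh dict.
records = [dict(zip(fieldnames, row)) for row in reader]
```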
> # Draft implementation
>
> pull request: https://github.com/python/cpython/pull/12802
>
> with_values(self, iterable, /)
>     Create a new dictionary with keys from this dict and values from
>     iterable.
>
>     When the length of iterable is different from len(self), ValueError
>     is raised. This method does not support dict subclasses.
>
> ## Memory usage (Key-Sharing dict)
>
> >>> import sys
> >>> keys = tuple("abcdefg")
> >>> keys
> ('a', 'b', 'c', 'd', 'e', 'f', 'g')
> >>> d = dict(zip(keys, range(7)))
> >>> d
> {'a': 0, 'b': 1, 'c': 2, 'd': 3, 'e': 4, 'f': 5, 'g': 6}
> >>> sys.getsizeof(d)
> 360
>
> >>> keys = dict.fromkeys("abcdefg")
> >>> d = keys.with_values(range(7))
> >>> d
> {'a': 0, 'b': 1, 'c': 2, 'd': 3, 'e': 4, 'f': 5, 'g': 6}
> >>> sys.getsizeof(d)
> 144
>
> ## Speed
>
> $ ./python -m perf timeit -o zipdict.json -s 'keys = tuple("abcdefg"); values=[*range(7)]' 'dict(zip(keys, values))'
> $ ./python -m perf timeit -o withvalues.json -s 'keys = dict.fromkeys("abcdefg"); values=[*range(7)]' 'keys.with_values(values)'
> $ ./python -m perf compare_to zipdict.json withvalues.json
> Mean +- std dev: [zipdict] 935 ns +- 9 ns -> [withvalues] 109 ns +- 2 ns: 8.59x faster (-88%)
>
> What do you think?
> Any comments are appreciated.
>
> Regards,
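
The behaviour described in the draft docstring can be approximated in pure Python, as a rough sketch only. The free function `with_values` below is a hypothetical stand-in for the proposed method; it reproduces the documented semantics (same keys, new values, `ValueError` on a length mismatch) but none of the speed or memory benefits of the C implementation in the pull request.

```python
def with_values(keys_dict, iterable):
    """Return a new dict with the keys of keys_dict and values from iterable.

    Hypothetical pure-Python stand-in for the proposed dict.with_values().
    """
    values = list(iterable)
    if len(values) != len(keys_dict):
        raise ValueError(
            f"expected {len(keys_dict)} values, got {len(values)}"
        )
    # Iterating a dict yields its keys in insertion order (Python 3.7+).
    return dict(zip(keys_dict, values))


keys = dict.fromkeys("abcdefg")
d = with_values(keys, range(7))
```

Because it goes through `dict(zip(...))`, this sketch still pays the per-key insertion cost that the proposal aims to avoid.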