[Python-Dev] Proposal: dict.with_values(iterable)
Inada Naoki songofacandy at gmail.com
Fri Apr 12 09:44:05 EDT 2019
Hi, all.
I propose adding a new method: dict.with_values(iterable)
Motivation
Python is widely used to handle data. While dict is not an efficient way to handle many records, it is still a convenient one.
When creating many dicts with the same keys, dict needs to look up its internal hash table for each key it inserts.
This is a costly operation. If we can reuse the keys of an existing dict, we can skip this insertion cost.
Additionally, we have the "Key-Sharing Dictionary" (PEP 412). When all keys are strings, many dicts can share one set of keys, which reduces memory consumption.
This might be usable for:
- csv.DictReader
- namedtuple._asdict()
- DB-API 2.0 implementations: (e.g. DictCursor of mysqlclient-python)
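For illustration, the pattern these use cases share looks roughly like the sketch below: every row produces a new dict with identical keys, and each dict(zip(...)) call inserts every key again. The helper name rows_to_dicts and the sample data are hypothetical and not taken from any of the libraries listed above.

```python
def rows_to_dicts(fieldnames, rows):
    """Build one dict per row, all with the same keys (hypothetical helper)."""
    # Each dict(zip(...)) call hashes and inserts every key again,
    # which is the per-row cost the proposal aims to avoid.
    return [dict(zip(fieldnames, row)) for row in rows]


# Made-up sample data, standing in for CSV lines or database rows.
fieldnames = ("id", "name", "email")
rows = [
    (1, "alice", "alice@example.com"),
    (2, "bob", "bob@example.com"),
]
records = rows_to_dicts(fieldnames, rows)
```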
Draft implementation
pull request: https://github.com/python/cpython/pull/12802
with_values(self, iterable, /)
    Create a new dictionary with keys from this dict and values from iterable.

    When the length of iterable differs from len(self), ValueError is raised.
    This method does not support dict subclasses.
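The semantics described above can be approximated in pure Python as follows. This is only a sketch of the behaviour, not the C implementation in the pull request: with_values here is a hypothetical module-level function, the error message text is made up, and the result does not share keys with the source dict, so it has none of the speed or memory benefits.

```python
def with_values(keys, iterable):
    """Return a new dict with the keys of `keys` and values from `iterable`.

    Hypothetical pure-Python stand-in for the proposed dict.with_values().
    """
    values = list(iterable)  # materialize so the length can be checked
    if len(values) != len(keys):
        # Message wording is illustrative, not taken from the pull request.
        raise ValueError(f"expected {len(keys)} values, got {len(values)}")
    # zip() pairs the keys (in insertion order) with the values.
    return dict(zip(keys, values))


keys = dict.fromkeys("abc")
d = with_values(keys, range(3))
```

Passing an iterable of the wrong length, for example with_values(keys, range(2)), raises ValueError in this sketch, mirroring the rule stated above.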
Memory usage (Key-Sharing dict)
>>> import sys
>>> keys = tuple("abcdefg")
>>> keys
('a', 'b', 'c', 'd', 'e', 'f', 'g')
>>> d = dict(zip(keys, range(7)))
>>> d
{'a': 0, 'b': 1, 'c': 2, 'd': 3, 'e': 4, 'f': 5, 'g': 6}
>>> sys.getsizeof(d)
360

>>> keys = dict.fromkeys("abcdefg")
>>> d = keys.with_values(range(7))
>>> d
{'a': 0, 'b': 1, 'c': 2, 'd': 3, 'e': 4, 'f': 5, 'g': 6}
>>> sys.getsizeof(d)
144
Speed
$ ./python -m perf timeit -o zip_dict.json -s 'keys = tuple("abcdefg"); values=[*range(7)]' 'dict(zip(keys, values))'
$ ./python -m perf timeit -o with_values.json -s 'keys = dict.fromkeys("abcdefg"); values=[*range(7)]' 'keys.with_values(values)'
$ ./python -m perf compare_to zip_dict.json with_values.json
Mean +- std dev: [zip_dict] 935 ns +- 9 ns -> [with_values] 109 ns +- 2 ns: 8.59x faster (-88%)
What do you think? Any comments are appreciated.
Regards,
Inada Naoki <songofacandy at gmail.com>