PERF: Unnecessary hash table with RangeIndex · Issue #16685 · pandas-dev/pandas
Description
Example
def log_memory():
    import os
    import gc
    import psutil
    # Run a collection on each GC generation before measuring.
    for i in range(3):
        gc.collect(i)
    process = psutil.Process(os.getpid())
    # Resident set size in MiB.
    mem_usage = process.memory_info().rss / float(2 ** 20)
    print("[Memory usage] {:12.1f} MB".format(mem_usage))
In [20]: df = pd.DataFrame({'a': np.arange(1000000)})

In [23]: log_memory()
[Memory usage] 132.4 MB

In [24]: df.loc[5, :]
Out[24]:
a    5
Name: 5, dtype: int32

In [25]: log_memory()
[Memory usage] 172.2 MB
Rather than materializing the hash table, `loc` should convert labels directly into positions. Low priority in my opinion, since it is atypical to use `loc` with a `RangeIndex`.
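As a rough illustration of what converting labels directly into positions could look like, here is a minimal sketch in plain Python. The helper name `range_label_to_position` and its signature are hypothetical, not pandas' internal API, and it assumes integer labels and a range described by `start`, `stop`, and `step`.

```python
def range_label_to_position(label, start, stop, step):
    """Return the position of `label` in range(start, stop, step).

    Hypothetical helper for illustration only; not part of pandas.
    Assumes `label` is an integer. Raises KeyError if it is not in the range.
    """
    if step == 0:
        raise ValueError("step must be nonzero")
    offset = label - start
    # The label must sit exactly on a step boundary ...
    if offset % step != 0:
        raise KeyError(label)
    position = offset // step
    # ... and fall within the number of elements the range holds.
    if not 0 <= position < len(range(start, stop, step)):
        raise KeyError(label)
    return position


# Mirrors the example above: label 5 in a range of one million integers.
print(range_label_to_position(5, 0, 1000000, 1))
```

The lookup is constant-time arithmetic and allocates nothing proportional to the length of the index, which is the point of avoiding the hash table here.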
pandas 0.20.2