PERF: Unnecessary hash table with RangeIndex · Issue #16685 · pandas-dev/pandas


@chris-b1


Example

def log_memory():
    import os
    import gc
    import psutil
    for i in range(3):
        gc.collect(i)
    process = psutil.Process(os.getpid())
    mem_usage = process.memory_info().rss / float(2 ** 20)
    print("[Memory usage] {:12.1f} MB".format(mem_usage))

In [20]: df = pd.DataFrame({'a': np.arange(1000000)})

In [23]: log_memory()
[Memory usage] 132.4 MB

In [24]: df.loc[5, :]
Out[24]:
a    5
Name: 5, dtype: int32

In [25]: log_memory()
[Memory usage] 172.2 MB

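A single label lookup with loc on a RangeIndex raises memory usage by about 40 MB (132.4 MB to 172.2 MB), because the lookup materializes a hash table over the whole index. Rather than materializing the hash table, the lookup should convert labels directly into positions. Low priority in my opinion, since it is atypical to use loc with a RangeIndex.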
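As a rough illustration of what "directly convert labels into positions" could mean, here is a minimal sketch that uses a plain Python range as a stand-in for a RangeIndex. The helper name range_label_to_position is hypothetical and is not part of pandas; the sketch only shows the arithmetic, not how pandas would wire it into loc.

```python
import operator


def range_label_to_position(rng, label):
    """Hypothetical helper (not pandas API): map a label to its position
    in a range using arithmetic only, without building a hash table."""
    try:
        # Accept anything that behaves like an integer.
        label = operator.index(label)
    except TypeError:
        raise KeyError(label)
    # Membership in a range is checked arithmetically, in constant time.
    if label not in rng:
        raise KeyError(label)
    # The position is the offset from the start, measured in steps.
    return (label - rng.start) // rng.step


# Stand-in for the index of the one-million-row frame in the example.
index = range(0, 1000000)
position = range_label_to_position(index, 5)
```

Under these assumptions the lookup costs constant time and no extra memory, whatever the length of the range.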

pandas 0.20.2