PERF: change impl for Categorical to use smaller dtype arrays · Issue #8453 · pandas-dev/pandas
Description
So it seems that by using a full Int64 array for the codes, plus the categories, we are actually using MORE memory to store a Categorical, because each code is the same size as a pointer in an object array (and we store the categories on top of that).
So we need to change the codes storage to use a smaller integer dtype. Maybe switch this to a plain ndarray and use dtype=uint8. That would provide a lot of benefit.
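To make the size argument concrete, here is a rough NumPy-only sketch; it is not the pandas implementation. The helper `smallest_code_dtype` is a hypothetical name, and the sketch picks a signed dtype on the assumption that a `-1` code should stay available as a missing-value sentinel, whereas the suggestion above is `uint8`.

```python
import numpy as np

def smallest_code_dtype(n_categories):
    """Hypothetical helper: narrowest signed int dtype able to index n_categories."""
    for dtype in (np.int8, np.int16, np.int32, np.int64):
        if n_categories - 1 <= np.iinfo(dtype).max:
            return np.dtype(dtype)
    raise ValueError("too many categories")

# 100,000 values drawn from only three distinct labels.
values = np.array(["a", "b", "c", "a"] * 25_000, dtype=object)

# Factorize: `categories` holds the unique labels, `codes` indexes into them.
categories, codes = np.unique(values, return_inverse=True)

# Codes stored as 64-bit integers: 8 bytes each, same as an object pointer
# on a 64-bit build.
codes64 = codes.astype(np.int64)

# Codes stored in the narrowest dtype that fits the number of categories.
codes_small = codes64.astype(smallest_code_dtype(len(categories)))

print(codes64.nbytes, codes_small.nbytes)
```

With three categories the codes fit in one byte each, so the codes array is an eighth of the size of the 64-bit version, and `categories[codes_small]` still reconstructs the original values.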