HDFStore.select slowed by decode even when using columns= · Issue #5441 · pandas-dev/pandas
While profiling a slow select (about 200% more wall time than a direct PyTables call, plus high memory usage), I realized that most of the time is spent inside bytes.decode, called by _unconvert_strings_array, even when selecting only int64 columns. It appears to spend time and memory decoding strings that are never returned.
I'm using Python 3.3 and the latest pandas (commit 2d2e8b5).
I'll gladly follow up with more details if needed.
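A minimal sketch of how the behavior described above might be reproduced and profiled. It is not taken from the original report: the file name, the key "df", the column names "ints" and "strs", the row count, and the helper functions build_store and profile_select are all hypothetical, and it assumes PyTables is installed. Whether bytes.decode actually shows up in the profile depends on the pandas version in use.

```python
import cProfile
import os
import pstats
import tempfile

import numpy as np
import pandas as pd


def build_store(path, n_rows=200000):
    """Write a table with one int64 column and one string column (hypothetical data)."""
    df = pd.DataFrame({
        "ints": np.arange(n_rows, dtype="int64"),
        "strs": ["row-%d" % i for i in range(n_rows)],
    })
    with pd.HDFStore(path, mode="w") as store:
        # Table format is required for select(..., columns=...).
        store.put("df", df, format="table")


def profile_select(path):
    """Profile a select that requests only the int64 column."""
    profiler = cProfile.Profile()
    with pd.HDFStore(path, mode="r") as store:
        profiler.enable()
        result = store.select("df", columns=["ints"])
        profiler.disable()
    return result, pstats.Stats(profiler)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        h5_path = os.path.join(tmp, "example.h5")
        build_store(h5_path)
        result, stats = profile_select(h5_path)
        print(result.dtypes)
        # Look for decode-related entries among the most expensive calls.
        stats.sort_stats("cumulative").print_stats(15)
```

If decode-related calls dominate the cumulative time even though only "ints" was requested, that would match the symptom reported here.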