DEPR: make_block by jbrockmendel · Pull Request #56422 · pandas-dev/pandas
OK, I see what happened! I tested this before (with a slightly updated version of the example from Brock's post, to interleave the dtypes) and timed it in the default mode (around 10ms instead of 20µs). Based on the profile, this was because of the Block.take_nd in the reindex step. Then I enabled CoW and it also gave around 10ms, but I didn't check the profile (I just assumed it had the same cause, also because I was already passing copy=False to reindex, assuming we would honor that regardless of CoW).
But redoing that example now, the reason it was also slower with CoW is that I was missing a copy=False in the DataFrame(arr) constructor (where we changed the default to True for ndarray input when CoW is enabled), and apparently that gave more or less the same slowdown as the take in the non-CoW mode.
After correcting that, I indeed get a zero-copy construction with the correct column order. The overhead in this case is around 400µs vs 20µs for me.
This overhead (concatenating the index and reindexing) grows with the number of columns. Next week I will also try to check the impact of more dtypes, to see if the overhead stays acceptable.
Adapted example:
In [1]: import pandas as pd
...: import numpy as np
...: from pandas.core.internals.api import make_block
...: from pandas.core.internals import create_block_manager_from_blocks
...:
...: ncols = 10
...: nrows = 10**5
...:
...: arr1 = np.random.randn(nrows*ncols).reshape(nrows, ncols)
...: arr2 = arr1.astype(np.int64)
...:
...: index = pd.Index(range(nrows))
...: cols = pd.Index(range(ncols * 2))
...: axes = [cols, index]
<ipython-input-1-5c31cca232e4>:4: DeprecationWarning: create_block_manager_from_blocks is deprecated and will be removed in a future version. Use public APIs instead.
from pandas.core.internals import create_block_manager_from_blocks
In [2]: def v1():
...:     locs1 = np.arange(arr1.shape[1]*2, step=2)
...:     blk1 = make_block(arr1.T, placement=locs1)
...:     locs2 = np.arange(1, arr1.shape[1]*2, step=2)
...:     blk2 = make_block(arr2.T, placement=locs2)
...:     mgr = create_block_manager_from_blocks([blk1, blk2], axes, verify_integrity=False)
...:     return pd.DataFrame._from_mgr(mgr, axes=axes[::-1])
...:
In [3]: def v2():
...:     left = pd.DataFrame(arr1, columns=cols[0:ncols*2:2], copy=False)
...:     right = pd.DataFrame(arr2, columns=cols[1:ncols*2:2], copy=False)
...:     return pd.concat([left, right], axis=1, copy=False).reindex(columns=cols, copy=False)
...:
In [4]: %timeit v1()
24.6 µs ± 399 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
In [5]: %timeit v2()
16.1 ms ± 1.76 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [6]: df1 = v1()
In [7]: df2 = v2()
In [8]: pd.testing.assert_frame_equal(df1, df2)
In [9]: pd.options.mode.copy_on_write = True
In [10]: %timeit v2()
463 µs ± 8.93 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
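For anyone wanting to double check whether the constructor step is really zero-copy, something like the following could be used. This is only a minimal sketch: the helper name shares_buffer is hypothetical (not a pandas API), and it assumes a single-dtype frame, where to_numpy() hands back a view of the underlying block rather than a consolidated copy.

```python
import numpy as np
import pandas as pd


def shares_buffer(df: pd.DataFrame, arr: np.ndarray) -> bool:
    """Return True if the frame's values overlap the memory of ``arr``.

    Hypothetical helper for illustration only. Assumes ``df`` holds a
    single dtype, so that ``to_numpy()`` returns a view of the block.
    """
    return np.shares_memory(df.to_numpy(), arr)


arr = np.random.randn(1000, 10)

# Explicitly request no copy: the frame should wrap the original buffer.
view_df = pd.DataFrame(arr, copy=False)

# Explicitly request a copy: the frame should own a separate buffer.
copy_df = pd.DataFrame(arr, copy=True)

print("copy=False shares memory:", shares_buffer(view_df, arr))
print("copy=True shares memory:", shares_buffer(copy_df, arr))
```

Running the same check on pd.DataFrame(arr) without the copy keyword should show which default is in effect for the current mode, which is the difference that caused the slowdown described above.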