pandas.DataFrame.combine — pandas 0.24.0rc1 documentation
DataFrame.combine(other, func, fill_value=None, overwrite=True)
Perform column-wise combine with another DataFrame based on a passed function.
Combines a DataFrame with the other DataFrame, using func to combine columns element-wise. The row and column indexes of the resulting DataFrame will be the union of the two.
Parameters:
    other : DataFrame
        The DataFrame to merge column-wise.
    func : function
        Function that takes two Series as inputs and returns a Series or a scalar. Used to merge the two DataFrames column by column.
    fill_value : scalar value, default None
        The value to fill NaNs with prior to passing any column to the merge func.
    overwrite : boolean, default True
        If True, columns in self that do not exist in other will be overwritten with NaNs.

Returns:
    result : DataFrame
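As a minimal sketch of the func contract described above (not part of the original documentation), the following logs each call to show that func is invoked once per column with a pair of Series. The helper record_and_add and the calls list are hypothetical names introduced only for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [0, 0], 'B': [4, 4]})
df2 = pd.DataFrame({'A': [1, 1], 'B': [3, 3]})

# Hypothetical log: one entry per call, holding the column name and
# whether both arguments arrived as Series.
calls = []

def record_and_add(s1, s2):
    """Illustrative merge function: log the call, then add the columns."""
    both_series = isinstance(s1, pd.Series) and isinstance(s2, pd.Series)
    calls.append((s1.name, both_series))
    return s1 + s2

result = df1.combine(df2, record_and_add)

print(calls)   # one entry per column
print(result)  # column-wise sums of df1 and df2
```

Because func sees whole columns rather than individual cells, it can either make a per-column decision (as take_smaller does in the examples) or return an element-wise result such as s1 + s2.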
See also
DataFrame.combine_first : Combine two DataFrame objects and default to non-null values in frame calling the method.
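For comparison, here is a small sketch of the related method DataFrame.combine_first, which takes no merge function: it keeps the calling frame's non-null values and fills its nulls from the other frame. The data and the variable name filled are illustrative assumptions.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [None, 0], 'B': [None, 4]})
df2 = pd.DataFrame({'A': [1, 1], 'B': [3, 3]})

# Non-null values in df1 are kept; its nulls are filled from df2.
filled = df1.combine_first(df2)
print(filled)
```

Reach for combine when the merge rule has to be customized through func, and for combine_first when "prefer my non-null values" is all that is needed.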
Examples
Combine using a simple function that chooses the column with the smaller sum.
>>> import pandas as pd
>>> df1 = pd.DataFrame({'A': [0, 0], 'B': [4, 4]})
>>> df2 = pd.DataFrame({'A': [1, 1], 'B': [3, 3]})
>>> take_smaller = lambda s1, s2: s1 if s1.sum() < s2.sum() else s2
>>> df1.combine(df2, take_smaller)
   A  B
0  0  3
1  0  3
Example using a true element-wise combine function.
>>> import numpy as np
>>> df1 = pd.DataFrame({'A': [5, 0], 'B': [2, 4]})
>>> df2 = pd.DataFrame({'A': [1, 1], 'B': [3, 3]})
>>> df1.combine(df2, np.minimum)
   A  B
0  1  2
1  0  3
Using fill_value fills Nones prior to passing the column to the merge function.
>>> df1 = pd.DataFrame({'A': [0, 0], 'B': [None, 4]})
>>> df2 = pd.DataFrame({'A': [1, 1], 'B': [3, 3]})
>>> df1.combine(df2, take_smaller, fill_value=-5)
   A    B
0  0 -5.0
1  0  4.0
However, if the same element in both DataFrames is None, that None is preserved.
>>> df1 = pd.DataFrame({'A': [0, 0], 'B': [None, 4]})
>>> df2 = pd.DataFrame({'A': [1, 1], 'B': [None, 3]})
>>> df1.combine(df2, take_smaller, fill_value=-5)
   A    B
0  0  NaN
1  0  3.0
Example that demonstrates the use of overwrite and the behavior when the axes differ between the DataFrames.
>>> df1 = pd.DataFrame({'A': [0, 0], 'B': [4, 4]})
>>> df2 = pd.DataFrame({'B': [3, 3], 'C': [-10, 1]}, index=[1, 2])
>>> df1.combine(df2, take_smaller)
     A    B     C
0  NaN  NaN   NaN
1  NaN  3.0 -10.0
2  NaN  3.0   1.0
>>> df1.combine(df2, take_smaller, overwrite=False)
     A    B     C
0  0.0  NaN   NaN
1  0.0  3.0 -10.0
2  NaN  3.0   1.0
Demonstrating the preference given to the passed-in DataFrame.
>>> df2 = pd.DataFrame({'B': [3, 3], 'C': [1, 1]}, index=[1, 2])
>>> df2.combine(df1, take_smaller)
     A    B   C
0  0.0  NaN NaN
1  0.0  3.0 NaN
2  NaN  3.0 NaN
>>> df2.combine(df1, take_smaller, overwrite=False)
     A    B    C
0  0.0  NaN  NaN
1  0.0  3.0  1.0
2  NaN  3.0  1.0