pandas.DataFrame.combine — pandas 2.2.3 documentation (original) (raw)
DataFrame.combine(other, func, fill_value=None, overwrite=True)[source]#
Perform column-wise combine with another DataFrame.
Combines a DataFrame with other DataFrame using functo element-wise combine columns. The row and column indexes of the resulting DataFrame will be the union of the two.
Parameters:
otherDataFrame
The DataFrame to merge column-wise.
funcfunction
Function that takes two series as inputs and return a Series or a scalar. Used to merge the two dataframes column by columns.
fill_valuescalar value, default None
The value to fill NaNs with prior to passing any column to the merge func.
overwritebool, default True
If True, columns in self that do not exist in other will be overwritten with NaNs.
Returns:
DataFrame
Combination of the provided DataFrames.
See also
Combine two DataFrame objects and default to non-null values in frame calling the method.
Examples
Combine using a simple function that chooses the smaller column.
df1 = pd.DataFrame({'A': [0, 0], 'B': [4, 4]}) df2 = pd.DataFrame({'A': [1, 1], 'B': [3, 3]}) take_smaller = lambda s1, s2: s1 if s1.sum() < s2.sum() else s2 df1.combine(df2, take_smaller) A B 0 0 3 1 0 3
Example using a true element-wise combine function.
df1 = pd.DataFrame({'A': [5, 0], 'B': [2, 4]}) df2 = pd.DataFrame({'A': [1, 1], 'B': [3, 3]}) df1.combine(df2, np.minimum) A B 0 1 2 1 0 3
Using fill_value fills Nones prior to passing the column to the merge function.
df1 = pd.DataFrame({'A': [0, 0], 'B': [None, 4]}) df2 = pd.DataFrame({'A': [1, 1], 'B': [3, 3]}) df1.combine(df2, take_smaller, fill_value=-5) A B 0 0 -5.0 1 0 4.0
However, if the same element in both dataframes is None, that None is preserved
df1 = pd.DataFrame({'A': [0, 0], 'B': [None, 4]}) df2 = pd.DataFrame({'A': [1, 1], 'B': [None, 3]}) df1.combine(df2, take_smaller, fill_value=-5) A B 0 0 -5.0 1 0 3.0
Example that demonstrates the use of overwrite and behavior when the axis differ between the dataframes.
df1 = pd.DataFrame({'A': [0, 0], 'B': [4, 4]}) df2 = pd.DataFrame({'B': [3, 3], 'C': [-10, 1], }, index=[1, 2]) df1.combine(df2, take_smaller) A B C 0 NaN NaN NaN 1 NaN 3.0 -10.0 2 NaN 3.0 1.0
df1.combine(df2, take_smaller, overwrite=False) A B C 0 0.0 NaN NaN 1 0.0 3.0 -10.0 2 NaN 3.0 1.0
Demonstrating the preference of the passed in dataframe.
df2 = pd.DataFrame({'B': [3, 3], 'C': [1, 1], }, index=[1, 2]) df2.combine(df1, take_smaller) A B C 0 0.0 NaN NaN 1 0.0 3.0 NaN 2 NaN 3.0 NaN
df2.combine(df1, take_smaller, overwrite=False) A B C 0 0.0 NaN NaN 1 0.0 3.0 1.0 2 NaN 3.0 1.0