pandas.DataFrame.combine — pandas 0.24.0rc1 documentation (original) (raw)

DataFrame. combine(other, func, fill_value=None, overwrite=True)[source]

Perform column-wise combine with another DataFrame based on a passed function.

Combines a DataFrame with other DataFrame using functo element-wise combine columns. The row and column indexes of the resulting DataFrame will be the union of the two.

Parameters: other : DataFrame The DataFrame to merge column-wise. func : function Function that takes two series as inputs and return a Series or a scalar. Used to merge the two dataframes column by columns. fill_value : scalar value, default None The value to fill NaNs with prior to passing any column to the merge func. overwrite : boolean, default True If True, columns in self that do not exist in other will be overwritten with NaNs.
Returns: result : DataFrame

See also

DataFrame.combine_first

Combine two DataFrame objects and default to non-null values in frame calling the method.

Examples

Combine using a simple function that chooses the smaller column.

df1 = pd.DataFrame({'A': [0, 0], 'B': [4, 4]}) df2 = pd.DataFrame({'A': [1, 1], 'B': [3, 3]}) take_smaller = lambda s1, s2: s1 if s1.sum() < s2.sum() else s2 df1.combine(df2, take_smaller) A B 0 0 3 1 0 3

Example using a true element-wise combine function.

df1 = pd.DataFrame({'A': [5, 0], 'B': [2, 4]}) df2 = pd.DataFrame({'A': [1, 1], 'B': [3, 3]}) df1.combine(df2, np.minimum) A B 0 1 2 1 0 3

Using fill_value fills Nones prior to passing the column to the merge function.

df1 = pd.DataFrame({'A': [0, 0], 'B': [None, 4]}) df2 = pd.DataFrame({'A': [1, 1], 'B': [3, 3]}) df1.combine(df2, take_smaller, fill_value=-5) A B 0 0 -5.0 1 0 4.0

However, if the same element in both dataframes is None, that None is preserved

df1 = pd.DataFrame({'A': [0, 0], 'B': [None, 4]}) df2 = pd.DataFrame({'A': [1, 1], 'B': [None, 3]}) df1.combine(df2, take_smaller, fill_value=-5) A B 0 0 NaN 1 0 3.0

Example that demonstrates the use of overwrite and behavior when the axis differ between the dataframes.

df1 = pd.DataFrame({'A': [0, 0], 'B': [4, 4]}) df2 = pd.DataFrame({'B': [3, 3], 'C': [-10, 1],}, index=[1, 2]) df1.combine(df2, take_smaller) A B C 0 NaN NaN NaN 1 NaN 3.0 -10.0 2 NaN 3.0 1.0

df1.combine(df2, take_smaller, overwrite=False) A B C 0 0.0 NaN NaN 1 0.0 3.0 -10.0 2 NaN 3.0 1.0

Demonstrating the preference of the passed in dataframe.

df2 = pd.DataFrame({'B': [3, 3], 'C': [1, 1],}, index=[1, 2]) df2.combine(df1, take_smaller) A B C 0 0.0 NaN NaN 1 0.0 3.0 NaN 2 NaN 3.0 NaN

df2.combine(df1, take_smaller, overwrite=False) A B C 0 0.0 NaN NaN 1 0.0 3.0 1.0 2 NaN 3.0 1.0