pandas.DataFrame.update — pandas 3.0.0.dev0+2100.gf496acffcc documentation
DataFrame.update(other, join='left', overwrite=True, filter_func=None, errors='ignore')
Modify in place using non-NA values from another DataFrame.
Aligns on indices. There is no return value.
Parameters:
other : DataFrame, or object coercible into a DataFrame
Should have at least one matching index/column label with the original DataFrame. If a Series is passed, its name attribute must be set, and that will be used as the column name to align with the original DataFrame.
join : {‘left’}, default ‘left’
Only left join is implemented, keeping the index and columns of the original object.
overwrite : bool, default True
How to handle non-NA values for overlapping keys:
- True: overwrite original DataFrame’s values with values from other.
- False: only update values that are NA in the original DataFrame.
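The two overwrite modes can be compared side by side. This is a minimal sketch: the frames and the names df_true, df_false, and other are made up for illustration, and float columns are used so that NA values can be represented.

```python
import numpy as np
import pandas as pd

other = pd.DataFrame({"A": [10.0, 20.0, 30.0]})

# overwrite=True (default): every non-NA value in `other` replaces the original.
df_true = pd.DataFrame({"A": [1.0, np.nan, 3.0]})
df_true.update(other)

# overwrite=False: only the NA slot in the original (row 1) is filled.
df_false = pd.DataFrame({"A": [1.0, np.nan, 3.0]})
df_false.update(other, overwrite=False)

print(df_true["A"].tolist())   # [10.0, 20.0, 30.0]
print(df_false["A"].tolist())  # [1.0, 20.0, 3.0]
```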
filter_func : callable(1d-array) -> bool 1d-array, optional
Optionally selects values other than NA for replacement. Should return True for values that should be updated.
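A filter_func might look like the sketch below. The helper is_negative and the data are hypothetical. The function receives the values of one column of the original DataFrame and returns a boolean mask of the entries that may be replaced.

```python
import pandas as pd

df = pd.DataFrame({"A": [1.0, -1.0, 3.0]})
other = pd.DataFrame({"A": [10.0, 20.0, 30.0]})


def is_negative(values):
    # Hypothetical filter: True marks entries of the original column
    # that are allowed to be replaced by the value from `other`.
    return values < 0


df.update(other, filter_func=is_negative)
print(df["A"].tolist())  # [1.0, 20.0, 3.0]
```

Only the negative entry in row 1 is replaced. The other rows keep their original values even though other holds non-NA data for them.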
errors : {‘raise’, ‘ignore’}, default ‘ignore’
If ‘raise’, will raise a ValueError if the DataFrame and other both contain non-NA data in the same place.
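The following sketch contrasts a non-overlapping update with an overlapping one under errors='raise'. The names df_ok, df_clash, and overlap_detected are illustrative only.

```python
import numpy as np
import pandas as pd

# Non-NA data never sits in the same place in both frames: the update goes through.
df_ok = pd.DataFrame({"A": [np.nan, 2.0]})
df_ok.update(pd.DataFrame({"A": [1.0, np.nan]}), errors="raise")

# Both frames hold a non-NA value at row 0: a ValueError is expected.
df_clash = pd.DataFrame({"A": [1.0, np.nan]})
try:
    df_clash.update(pd.DataFrame({"A": [10.0, 20.0]}), errors="raise")
    overlap_detected = False
except ValueError:
    overlap_detected = True

print(df_ok["A"].tolist())  # [1.0, 2.0]
print(overlap_detected)     # True
```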
Returns:
None
This method directly changes the calling object.
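Because the method works in place, its return value should not be assigned back. A small sketch with made-up data:

```python
import pandas as pd

df = pd.DataFrame({"A": [1.0, 2.0]})
result = df.update(pd.DataFrame({"A": [10.0, 20.0]}))

print(result)            # None
print(df["A"].tolist())  # [10.0, 20.0]
```

Writing `df = df.update(...)` would therefore rebind df to None and lose the reference to the updated frame.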
Raises:
ValueError
- When errors=’raise’ and there’s overlapping non-NA data.
- When errors is neither ‘ignore’ nor ‘raise’.
NotImplementedError
- If join != ‘left’
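The argument checks listed above can be exercised with a short sketch. The values "warn" and "outer" are arbitrary unsupported inputs chosen for illustration, and the dictionary raised is a hypothetical helper for recording the outcome.

```python
import pandas as pd

df = pd.DataFrame({"A": [1.0, 2.0]})
other = pd.DataFrame({"A": [10.0, 20.0]})

raised = {}
cases = [
    ("bad_errors", {"errors": "warn"}),  # not 'ignore' or 'raise'
    ("bad_join", {"join": "outer"}),     # only 'left' is implemented
]
for label, kwargs in cases:
    try:
        df.update(other, **kwargs)
    except (ValueError, NotImplementedError) as exc:
        # Record which exception type each invalid argument produced.
        raised[label] = type(exc).__name__

print(raised)  # {'bad_errors': 'ValueError', 'bad_join': 'NotImplementedError'}
```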
Notes
- Duplicate indices on other are not supported and raise ValueError.
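The sketch below shows the duplicate-index failure. The workaround that follows it is one possible approach and an assumption about what the caller wants, not a documented recommendation: it keeps only the last row for each duplicated label before calling update.

```python
import pandas as pd

df = pd.DataFrame({"A": [1.0, 2.0]})
dup_other = pd.DataFrame({"A": [10.0, 20.0]}, index=[0, 0])

try:
    df.update(dup_other)
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

# Possible workaround (assumption): drop all but the last row per index label.
deduped = dup_other[~dup_other.index.duplicated(keep="last")]
df.update(deduped)

print(duplicate_rejected)  # True
print(df["A"].tolist())    # [20.0, 2.0]
```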
Examples
>>> df = pd.DataFrame({"A": [1, 2, 3], "B": [400, 500, 600]})
>>> new_df = pd.DataFrame({"B": [4, 5, 6], "C": [7, 8, 9]})
>>> df.update(new_df)
>>> df
   A  B
0  1  4
1  2  5
2  3  6
The DataFrame’s length does not increase as a result of the update; only values at matching index/column labels are updated.
>>> df = pd.DataFrame({"A": ["a", "b", "c"], "B": ["x", "y", "z"]})
>>> new_df = pd.DataFrame({"B": ["d", "e", "f", "g", "h", "i"]})
>>> df.update(new_df)
>>> df
   A  B
0  a  d
1  b  e
2  c  f

>>> df = pd.DataFrame({"A": ["a", "b", "c"], "B": ["x", "y", "z"]})
>>> new_df = pd.DataFrame({"B": ["d", "f"]}, index=[0, 2])
>>> df.update(new_df)
>>> df
   A  B
0  a  d
1  b  y
2  c  f
If other is a Series, its name attribute must be set.
>>> df = pd.DataFrame({"A": ["a", "b", "c"], "B": ["x", "y", "z"]})
>>> new_column = pd.Series(["d", "e", "f"], name="B")
>>> df.update(new_column)
>>> df
   A  B
0  a  d
1  b  e
2  c  f
If other contains NaNs, the corresponding values are not updated in the original DataFrame.
>>> df = pd.DataFrame({"A": [1, 2, 3], "B": [400.0, 500.0, 600.0]})
>>> new_df = pd.DataFrame({"B": [4, np.nan, 6]})
>>> df.update(new_df)
>>> df
   A      B
0  1    4.0
1  2  500.0
2  3    6.0