ENH: more joins for DataFrame.update · Issue #21855 · pandas-dev/pandas
Description
This should be non-controversial, as even the source code for `DataFrame.update` literally says (https://github.com/pandas-dev/pandas/blob/v0.23.3/pandas/core/frame.py#L5054): `# TODO: Support other joins`
I checked whether a previous issue for this exists, but did not find one.
Some thoughts that arise:
- default for `join` should clearly be `'left'`
- `df1.update(df2, join='right', overwrite=some_boolean)` would be the same as `df2.update(df1, join='left', overwrite=not some_boolean)`. IMO this is not a terrible redundancy, as it allows each user to choose the order that more easily fits their thought pattern.
- `df1.combine_first(df2)` would be the same as `df1.update(df2, join='outer', overwrite=False)`, only that `combine_first` has far fewer options and controls (i.e. no `filter_func` or `raise_conflict`). Actually, I'd very much like to deprecate `combine_first` before pandas 1.0. The only difference is that `update` returns `None`, which IMO should be changed as well -- relevant xrefs: ENH: add inplace-kwarg to update #21858 and DEPR: combine_first (replace with update(..., join='outer'); for both Series/DF) #21859
- this should IMO also support a way to control which axes are joined in what way (edit: the below was the original proposal; better variants are discussed in ENH: more joins for DataFrame.update #21855 (comment)).
  - The first way that came to mind would be with an `axis=0|1|None` keyword, like in `DataFrame.align`. However, upon further investigation, I don't believe this to be a good choice, as anything other than `axis=None` would implicitly have to choose a `join` for the other axis to actually decide the index/columns of the result.
  - Since "explicit is better than implicit", I'd like to propose a version with just one kwarg, namely:
join=['left', 'left'] # same as 'left' (and so on for 'inner'|'outer'|'right')
join=['left', 'inner'] | ['left', 'outer'] etc. (for all other 12 combinations)
  - I'd say `list` and `tuple` would be reasonable to allow as containers, but not more.
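
To make the proposed semantics concrete, here is a rough sketch of how they could behave. `update_with_join` is a hypothetical helper written only for illustration: it is not part of pandas, it returns a new frame instead of modifying in place, and it leaves out `filter_func` and `raise_conflict`. The `join` argument is either a single string or a two-element `list`/`tuple` read as (index join, columns join), which is an assumption about how the one-kwarg variant would be interpreted.

```python
import pandas as pd

_JOINS = ("left", "right", "inner", "outer")


def _joined_labels(left, right, how):
    """Combine two Index objects according to the requested join type."""
    if how == "left":
        return left
    if how == "right":
        return right
    if how == "inner":
        return left.intersection(right)
    return left.union(right)  # "outer"


def update_with_join(df, other, join="left", overwrite=True):
    """Hypothetical, out-of-place sketch of update(..., join=...)."""
    if isinstance(join, str):
        join = (join, join)  # one string applies to both axes
    elif not isinstance(join, (list, tuple)) or len(join) != 2:
        raise TypeError("join must be a str or a list/tuple of two strs")
    for how in join:
        if how not in _JOINS:
            raise ValueError(f"unknown join type: {how!r}")

    index = _joined_labels(df.index, other.index, join[0])
    columns = _joined_labels(df.columns, other.columns, join[1])

    # Put both frames on the same labels; missing cells become NaN.
    base = df.reindex(index=index, columns=columns)
    new = other.reindex(index=index, columns=columns)

    if overwrite:
        # Non-NA values of `other` win; fall back to `df` elsewhere.
        return new.where(new.notna(), base)
    # Only fill the holes of `df` with values from `other`.
    return base.where(base.notna(), new)


df1 = pd.DataFrame({"a": [1.0, None], "b": [3.0, 4.0]}, index=["x", "y"])
df2 = pd.DataFrame(
    {"a": [10.0, 20.0, 30.0], "c": [7.0, 8.0, 9.0]}, index=["x", "y", "z"]
)

# Keep the rows of df1, but take the union of the columns.
mixed = update_with_join(df1, df2, join=["left", "outer"])

# The claimed combine_first equivalence, under this sketch's semantics.
outer_fill = update_with_join(df1, df2, join="outer", overwrite=False)
```

Under these assumptions, `outer_fill` matches `df1.combine_first(df2)` on this example, and `join='right'` with a given `overwrite` gives the same frame as swapping the operands with `join='left'` and the negated `overwrite`.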