pandas.Series.replace (pandas 2.2.3 documentation)

Series.replace(to_replace=None, value=<no_default>, *, inplace=False, limit=None, regex=False, method=<no_default>)

Replace values given in to_replace with value.

Values of the Series/DataFrame are replaced with other values dynamically. This differs from updating with .loc or .iloc, which require you to specify a location to update with some value.
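To make the contrast with .loc concrete, the following minimal sketch performs the same update both ways. The sample data and the variable names (`by_value`, `by_location`) are illustrative only and are not taken from the pandas documentation.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 2, 1])

# Value-based: every element equal to 2 is replaced, wherever it occurs.
by_value = s.replace(2, 20)

# Location-based: the caller first has to work out which positions to update.
by_location = s.copy()
by_location.loc[by_location == 2] = 20

print(by_value.tolist())     # [1, 20, 3, 20, 1]
print(by_location.tolist())  # [1, 20, 3, 20, 1]
```

Both approaches give the same result here; replace simply removes the need to build the selection yourself.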

Parameters:

to_replace : str, regex, list, dict, Series, int, float, or None

How to find the values that will be replaced.

See the examples section for examples of each of these.

value : scalar, dict, list, str, regex, default None

Value to replace any values matching to_replace with. For a DataFrame a dict of values can be used to specify which value to use for each column (columns not in the dict will not be filled). Regular expressions, strings and lists or dicts of such objects are also allowed.

inplace : bool, default False

If True, performs the operation in place and returns None.

limit : int, default None

Maximum size gap to forward or backward fill.

Deprecated since version 2.1.0.

regex : bool or same types as to_replace, default False

Whether to interpret to_replace and/or value as regular expressions. Alternatively, this could be a regular expression or a list, dict, or array of regular expressions, in which case to_replace must be None.

method : {‘pad’, ‘ffill’, ‘bfill’}

The method to use for replacement when to_replace is a scalar, list or tuple and value is None.

Deprecated since version 2.1.0.
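The two ways of supplying a pattern described under regex can be compared side by side. This is a small sketch on a made-up Series; the variable names are illustrative.

```python
import pandas as pd

s = pd.Series(['bat', 'foo', 'bait'])

# Form 1: to_replace holds the pattern and regex=True enables regex matching.
via_flag = s.replace(to_replace=r'^ba.$', value='new', regex=True)

# Form 2: the pattern is passed through regex itself; to_replace stays None.
via_regex = s.replace(regex=r'^ba.$', value='new')

# Without regex=True the same string is compared literally, so nothing matches.
literal = s.replace(to_replace=r'^ba.$', value='new')

print(via_flag.tolist())   # ['new', 'foo', 'bait']
print(via_regex.tolist())  # ['new', 'foo', 'bait']
print(literal.tolist())    # ['bat', 'foo', 'bait']
```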

Returns:

Series/DataFrame

Object after replacement.
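As a brief illustration of the return behaviour, the sketch below shows that the default call leaves the caller's object untouched, while inplace=True modifies it. The data is made up, and the return value of the in-place call is deliberately not used.

```python
import pandas as pd

original = pd.Series([1, 2, 3])

# Default: a new object is returned and the original is left as it was.
replaced = original.replace(1, 10)
print(original.tolist())  # [1, 2, 3]
print(replaced.tolist())  # [10, 2, 3]

# inplace=True: the Series itself is modified.
mutable = pd.Series([1, 2, 3])
mutable.replace(1, 10, inplace=True)
print(mutable.tolist())   # [10, 2, 3]
```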

Raises:

AssertionError

TypeError

ValueError
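The exception types are listed above without their triggering conditions. As one example that can be checked directly, passing lists of different lengths for to_replace and value raises ValueError. The sketch below only demonstrates that single case and makes no claim about the other conditions.

```python
import pandas as pd

s = pd.Series([1, 2, 3])

try:
    # Two values to find, but only one replacement supplied.
    s.replace([1, 2], [10])
except ValueError as exc:
    error_message = str(exc)
    print("ValueError:", error_message)
```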

Notes

Examples

Scalar `to_replace` and `value`

>>> s = pd.Series([1, 2, 3, 4, 5])
>>> s.replace(1, 5)
0    5
1    2
2    3
3    4
4    5
dtype: int64

>>> df = pd.DataFrame({'A': [0, 1, 2, 3, 4],
...                    'B': [5, 6, 7, 8, 9],
...                    'C': ['a', 'b', 'c', 'd', 'e']})
>>> df.replace(0, 5)
   A  B  C
0  5  5  a
1  1  6  b
2  2  7  c
3  3  8  d
4  4  9  e

List-like `to_replace`

>>> df.replace([0, 1, 2, 3], 4)
   A  B  C
0  4  5  a
1  4  6  b
2  4  7  c
3  4  8  d
4  4  9  e

>>> df.replace([0, 1, 2, 3], [4, 3, 2, 1])
   A  B  C
0  4  5  a
1  3  6  b
2  2  7  c
3  1  8  d
4  4  9  e
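When both arguments are lists, they are paired element by element, and each value is replaced once based on its original content (so 1 becomes 3 and 3 becomes 1 without chaining). The sketch below, on a reduced version of the frame above, compares the list form with the corresponding dict; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [0, 1, 2, 3, 4],
                   'B': [5, 6, 7, 8, 9]})

# Lists are matched up position by position: 0->4, 1->3, 2->2, 3->1.
paired = df.replace([0, 1, 2, 3], [4, 3, 2, 1])

# The same mapping written as a dict of old value -> new value.
mapped = df.replace({0: 4, 1: 3, 2: 2, 3: 1})

print(paired['A'].tolist())   # [4, 3, 2, 1, 4]
print(paired.equals(mapped))  # True
```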

>>> s.replace([1, 2], method='bfill')
0    3
1    3
2    3
3    4
4    5
dtype: int64

dict-like `to_replace`

>>> df.replace({0: 10, 1: 100})
     A  B  C
0   10  5  a
1  100  6  b
2    2  7  c
3    3  8  d
4    4  9  e

>>> df.replace({'A': 0, 'B': 5}, 100)
     A    B  C
0  100  100  a
1    1    6  b
2    2    7  c
3    3    8  d
4    4    9  e

>>> df.replace({'A': {0: 100, 4: 400}})
     A  B  C
0  100  5  a
1    1  6  b
2    2  7  c
3    3  8  d
4  400  9  e
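The difference between a flat dict and a nested dict is easiest to see when the same value appears in more than one column. The following sketch uses a small made-up frame for that purpose.

```python
import pandas as pd

df = pd.DataFrame({'A': [0, 1, 0],
                   'B': [0, 5, 0]})

# Flat dict: old value -> new value, applied to every column.
flat = df.replace({0: 100})

# Nested dict: column name -> {old value: new value}, applied to that column only.
nested = df.replace({'A': {0: 100}})

print(flat['B'].tolist())    # [100, 5, 100]
print(nested['A'].tolist())  # [100, 1, 100]
print(nested['B'].tolist())  # [0, 5, 0]
```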

Regular expression `to_replace`

>>> df = pd.DataFrame({'A': ['bat', 'foo', 'bait'],
...                    'B': ['abc', 'bar', 'xyz']})
>>> df.replace(to_replace=r'^ba.$', value='new', regex=True)
      A    B
0   new  abc
1   foo  new
2  bait  xyz

>>> df.replace({'A': r'^ba.$'}, {'A': 'new'}, regex=True)
      A    B
0   new  abc
1   foo  bar
2  bait  xyz

>>> df.replace(regex=r'^ba.$', value='new')
      A    B
0   new  abc
1   foo  new
2  bait  xyz

>>> df.replace(regex={r'^ba.$': 'new', 'foo': 'xyz'})
      A    B
0   new  abc
1   xyz  new
2  bait  xyz

>>> df.replace(regex=[r'^ba.$', 'foo'], value='new')
      A    B
0   new  abc
1   new  new
2  bait  xyz
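The regex examples above all use anchored patterns that cover the whole string. As a supplementary sketch (the data is made up), the following shows two consequences of the substitution following `re.sub` semantics: an unanchored pattern rewrites only the matched part of each string, and the replacement string may refer to capture groups.

```python
import pandas as pd

# Unanchored pattern: only the matched substring is substituted.
words = pd.Series(['bat', 'foo'])
partial = words.replace('ba', 'X', regex=True)
print(partial.tolist())  # ['Xt', 'foo']

# Capture groups can be referenced in the replacement, as with re.sub.
dates = pd.Series(['2024-01', '2023-12', 'n/a'])
swapped = dates.replace(r'^(\d{4})-(\d{2})$', r'\2/\1', regex=True)
print(swapped.tolist())  # ['01/2024', '12/2023', 'n/a']
```

Anchoring the pattern with `^` and `$`, as the examples above do, is a simple way to make sure only whole-string matches are replaced.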

Compare the behavior of s.replace({'a': None}) and s.replace('a', None) to understand the peculiarities of the to_replace parameter:

>>> s = pd.Series([10, 'a', 'a', 'b', 'a'])

When a dict is passed as to_replace, the values in the dict play the role of the value parameter. s.replace({'a': None}) is equivalent to s.replace(to_replace={'a': None}, value=None, method=None):

>>> s.replace({'a': None})
0      10
1    None
2    None
3       b
4    None
dtype: object

When value is not explicitly passed and to_replace is a scalar, list or tuple, replace uses the method parameter (default ‘pad’) to do the replacement. This is why, in the example below, the ‘a’ values in rows 1 and 2 are replaced by 10 and the one in row 4 by ‘b’.

>>> s.replace('a')
0    10
1    10
2    10
3     b
4     b
dtype: object

Deprecated since version 2.1.0: The ‘method’ parameter and padding behavior are deprecated.
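Since the padding behavior is deprecated, code that relied on it needs another way to get the same result. One possible substitute, offered here as a sketch rather than an officially recommended migration path, is to blank out the matching values with `Series.mask` and then forward-fill.

```python
import pandas as pd

s = pd.Series([10, 'a', 'a', 'b', 'a'])

# Turn every 'a' into a missing value, then fill each gap from the
# previous surviving value (the equivalent of method='pad').
padded = s.mask(s == 'a').ffill()

print(padded.tolist())  # [10, 10, 10, 'b', 'b']
```

Note that this approach also fills any missing values that were already present in the data, so it is only a drop-in substitute when the input has none.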

On the other hand, if None is explicitly passed for value, it will be respected:

>>> s.replace('a', None)
0      10
1    None
2    None
3       b
4    None
dtype: object

Changed in version 1.4.0: Previously the explicit None was silently ignored.

When regex=True, value is not None and to_replace is a string, the replacement will be applied in all columns of the DataFrame.

>>> df = pd.DataFrame({'A': [0, 1, 2, 3, 4],
...                    'B': ['a', 'b', 'c', 'd', 'e'],
...                    'C': ['f', 'g', 'h', 'i', 'j']})

>>> df.replace(to_replace='^[a-g]', value='e', regex=True)
   A  B  C
0  0  e  e
1  1  e  e
2  2  e  h
3  3  e  i
4  4  e  j

If value is not None and to_replace is a dictionary, the dictionary keys are the DataFrame columns to which the replacement is applied.

>>> df.replace(to_replace={'B': '^[a-c]', 'C': '^[h-j]'}, value='e', regex=True)
   A  B  C
0  0  e  f
1  1  e  g
2  2  e  e
3  3  d  e
4  4  e  e
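The example above applies one shared replacement value to every listed column. If each column needs its own replacement, a nested dict combined with regex=True is one way to express it. The sketch below reuses the same frame; the replacement values 'x' and 'y' are arbitrary.

```python
import pandas as pd

df = pd.DataFrame({'A': [0, 1, 2, 3, 4],
                   'B': ['a', 'b', 'c', 'd', 'e'],
                   'C': ['f', 'g', 'h', 'i', 'j']})

# Column name -> {pattern: replacement}; columns not listed are left alone.
per_column = df.replace({'B': {r'^[a-c]': 'x'},
                         'C': {r'^[h-j]': 'y'}}, regex=True)

print(per_column['A'].tolist())  # [0, 1, 2, 3, 4]
print(per_column['B'].tolist())  # ['x', 'x', 'x', 'd', 'e']
print(per_column['C'].tolist())  # ['f', 'g', 'y', 'y', 'y']
```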