How to Replace Values in Column Based on Condition in Pandas?

Last Updated : 15 Nov, 2024

Let's explore different methods to replace values in a Pandas DataFrame column based on conditions.

Replace Values Using dataframe.loc[] Function

The **dataframe.loc[]** indexer selects a subset of rows and columns by label or by a boolean condition, so we can assign new values to exactly the rows that match.

df.loc[df['column_name'] == 'some_value', 'column_name'] = 'new_value'

Consider a dataset with columns 'Name', 'gender', 'math score', and 'test preparation'. In this example, we will replace all occurrences of 'male' with 1 in the gender column.

```python
import pandas as pd

# Data
Student = {
    'Name': ['John', 'Jay', 'sachin', 'Geetha', 'Amutha', 'ganesh'],
    'gender': ['male', 'male', 'male', 'female', 'female', 'male'],
    'math score': [50, 100, 70, 80, 75, 40],
    'test preparation': ['none', 'completed', 'none', 'completed', 'completed', 'none'],
}

# Creating a DataFrame object
df = pd.DataFrame(Student)

# Replacing 'male' with 1 in the 'gender' column
df.loc[df["gender"] == "male", "gender"] = 1
print(df)
```

**Output:**

         Name  gender  math score test preparation
    0    John       1          50             none
    1     Jay       1         100        completed
    2  sachin       1          70             none
    3  Geetha  female          80        completed
    4  Amutha  female          75        completed
    5  ganesh       1          40             none
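
Conditions passed to loc[] can also be combined. The following is a minimal sketch rather than part of the original example: it reuses two columns of the Student data and assumes a hypothetical rule that students who scored below 60 without completing test preparation should be flagged with the made-up label 'recommended'. Each comparison is wrapped in parentheses and joined with `&`.

```python
import pandas as pd

# Same sample data as above, reduced to the two columns used here
df = pd.DataFrame({
    'math score': [50, 100, 70, 80, 75, 40],
    'test preparation': ['none', 'completed', 'none',
                         'completed', 'completed', 'none'],
})

# Hypothetical rule (an assumption for illustration): low score AND no preparation
needs_prep = (df['math score'] < 60) & (df['test preparation'] == 'none')

# Only rows where the combined condition is True are overwritten
df.loc[needs_prep, 'test preparation'] = 'recommended'
print(df)
```

Storing the boolean mask in a variable such as `needs_prep` keeps the assignment readable and lets the same condition be reused elsewhere.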

Besides loc[], we can replace values in a column based on a condition using the following methods:

Replace Values Using np.where()

The **np.where()** function from the NumPy library is another powerful tool for conditionally replacing values: for every row it picks one value where the condition is true and another where it is false.

df['column_name'] = np.where(df['column_name'] == 'some_value', 'value_if_true', 'value_if_false')

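Here, we will replace 'female' with 0 and 'male' with 1 in the gender column. Note that every value other than 'female' becomes 1, because np.where() always fills in the false branch.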

```python
import numpy as np
import pandas as pd

# Data
Student = {
    'Name': ['John', 'Jay', 'sachin', 'Geetha', 'Amutha', 'ganesh'],
    'gender': ['male', 'male', 'male', 'female', 'female', 'male'],
    'math score': [50, 100, 70, 80, 75, 40],
    'test preparation': ['none', 'completed', 'none', 'completed', 'completed', 'none'],
}

# Creating a DataFrame object
df = pd.DataFrame(Student)

# Replacing 'female' with 0 and 'male' with 1 in the 'gender' column
df["gender"] = np.where(df["gender"] == "female", 0, 1)
print(df)
```

**Output:**

         Name  gender  math score test preparation
    0    John       1          50             none
    1     Jay       1         100        completed
    2  sachin       1          70             none
    3  Geetha       0          80        completed
    4  Amutha       0          75        completed
    5  ganesh       1          40             none
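
Because np.where() supplies a value for every row, rows that do not match are overwritten as well. A minimal sketch, assuming we want to change only the matching rows: pass the original column as the third argument. The labels 'F', 'pass' and 'fail' and the pass mark of 50 are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'gender': ['male', 'male', 'male', 'female', 'female', 'male'],
    'math score': [50, 100, 70, 80, 75, 40],
})

# Replace only 'female'; every other row keeps its original value
df['gender'] = np.where(df['gender'] == 'female', 'F', df['gender'])

# Derive a new column from a numeric condition
# (the pass mark of 50 is an assumption for this sketch)
df['result'] = np.where(df['math score'] >= 50, 'pass', 'fail')
print(df)
```

Writing the result to a new column such as `result` leaves the source column untouched, which can be convenient when the original values are still needed.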

Replace Values Using Masking

Pandas' **mask()** method replaces values where a condition is met. It returns a new Series, so assign the result back to the column:

df['column_name'] = df['column_name'].mask(df['column_name'] == 'some_value', 'new_value')

Avoid calling mask(..., inplace=True) on df['column_name']. That is chained assignment on a temporary selection: recent pandas versions warn about it, and with Copy-on-Write enabled it does not update df.

In this example, we replace 'female' with 0 in the gender column using the **mask()** method.

```python
import pandas as pd

# Data
Student = {
    'Name': ['John', 'Jay', 'sachin', 'Geetha', 'Amutha', 'ganesh'],
    'gender': ['male', 'male', 'male', 'female', 'female', 'male'],
    'math score': [50, 100, 70, 80, 75, 40],
    'test preparation': ['none', 'completed', 'none', 'completed', 'completed', 'none'],
}

# Creating a DataFrame object
df = pd.DataFrame(Student)

# Replacing 'female' with 0 in the 'gender' column
# (assign the result back instead of using inplace=True on a column selection)
df['gender'] = df['gender'].mask(df['gender'] == 'female', 0)
print(df)
```

**Output:**

         Name  gender  math score test preparation
    0    John    male          50             none
    1     Jay    male         100        completed
    2  sachin    male          70             none
    3  Geetha       0          80        completed
    4  Amutha       0          75        completed
    5  ganesh    male          40             none
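
mask() has a counterpart, where(), which keeps values where the condition is true and replaces the rest. As a minimal sketch, assuming a hypothetical rule that no math score should be reported below 50, both methods give the same result when the condition is inverted:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Jay', 'sachin', 'Geetha', 'Amutha', 'ganesh'],
    'math score': [50, 100, 70, 80, 75, 40],
})

# mask(): replace values where the condition is True
# (the floor of 50 is an assumption for this sketch)
floored = df['math score'].mask(df['math score'] < 50, 50)

# where(): keep values where the condition is True, replace the others
floored_alt = df['math score'].where(df['math score'] >= 50, 50)

# Both calls return new Series, so assign one back to update the DataFrame
df['math score'] = floored
print(df)
print(floored.equals(floored_alt))
```

Which of the two to use is mostly a matter of which condition reads more naturally: the rows to change (mask) or the rows to keep (where).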

Replace Values Using apply() and Lambda Functions

The **apply()** function in combination with a lambda function is a flexible method for applying conditional replacements based on more complex logic.

Here, we will replace 'female' with 0 in the gender column using the apply() function and lambda.

```python
import pandas as pd

# Data
Student = {
    'Name': ['John', 'Jay', 'sachin', 'Geetha', 'Amutha', 'ganesh'],
    'gender': ['male', 'male', 'male', 'female', 'female', 'male'],
    'math score': [50, 100, 70, 80, 75, 40],
    'test preparation': ['none', 'completed', 'none', 'completed', 'completed', 'none'],
}

# Creating a DataFrame object
df = pd.DataFrame(Student)

# Replacing 'female' with 0 using apply and lambda
df['gender'] = df['gender'].apply(lambda x: 0 if x == 'female' else x)
print(df)
```

**Output:**

         Name  gender  math score test preparation
    0    John    male          50             none
    1     Jay    male         100        completed
    2  sachin    male          70             none
    3  Geetha       0          80        completed
    4  Amutha       0          75        completed
    5  ganesh    male          40             none
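
When the logic has more than two branches, a named function is often easier to read than a lambda. The sketch below is illustrative only: the helper `to_grade` and its thresholds (80 and 60) are assumptions, not part of the original dataset.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Jay', 'sachin', 'Geetha', 'Amutha', 'ganesh'],
    'math score': [50, 100, 70, 80, 75, 40],
})


def to_grade(score):
    """Map a numeric score to a letter band (thresholds are illustrative)."""
    if score >= 80:
        return 'A'
    if score >= 60:
        return 'B'
    return 'C'


# apply() calls to_grade once for each value in the column
df['grade'] = df['math score'].apply(to_grade)
print(df)
```

Since apply() runs a Python function for each element, vectorized options such as loc[] or np.where() are usually faster on large DataFrames; apply() is most useful when the rule is hard to express as a single condition.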

In this article, we’ve explored four effective methods to replace values in a Pandas DataFrame column based on conditions: using loc[], np.where(), masking, and apply() with a lambda function.