Create a New Column in Pandas DataFrame based on the Existing Columns
Last Updated : 23 Apr, 2025
When working with data in **Pandas**, we often need to reshape it into the format we want. One common task is adding a new column derived from the existing columns of a DataFrame. This article explores different ways to do that.
**Task:** We have a DataFrame containing event data and want to create a new column called ‘Discounted_Price’, calculated by applying a 10% discount to the existing ‘Cost’ column.
Using the apply() Function
The **apply()** function lets us apply a custom function along the rows or columns of a DataFrame. In this example, we’ll use **apply()** with axis=1, so the function receives one row at a time, to create a new column, **Discounted_Price**, which applies a 10% discount to the **Cost** column.
Python `
import pandas as pd
df = pd.DataFrame({'Date':['10/2/2011', '11/2/2011', '12/2/2011', '13/2/2011'], 'Event':['Music', 'Poetry', 'Theatre', 'Comedy'], 'Cost':[10000, 5000, 15000, 2000]})
print(df)
`
**Output:**
Date Event Cost
0 10/2/2011 Music 10000
1 11/2/2011 Poetry 5000
2 12/2/2011 Theatre 15000
3 13/2/2011 Comedy 2000
Now we will create a new column called **Discounted_Price** by applying a 10% discount to the existing **Cost** column.
Python `
df['Discounted_Price'] = df.apply(lambda row: row.Cost - (row.Cost * 0.1), axis = 1)
print(df)
`
**Output:**
Date Event Cost Discounted_Price
0 10/2/2011 Music 10000 9000.0
1 11/2/2011 Poetry 5000 4500.0
2 12/2/2011 Theatre 15000 13500.0
3 13/2/2011 Comedy 2000 1800.0
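Row-wise apply() becomes more useful when the calculation depends on more than one column. The following is a minimal sketch under an assumed rule that is not part of the original task: Poetry events get a 20% discount and every other event gets 10%. The function name `discounted_price` and the rule itself are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    'Event': ['Music', 'Poetry', 'Theatre', 'Comedy'],
    'Cost': [10000, 5000, 15000, 2000],
})

def discounted_price(row):
    # Hypothetical rule for illustration: 20% off Poetry, 10% off everything else.
    rate = 0.2 if row['Event'] == 'Poetry' else 0.1
    return row['Cost'] * (1 - rate)

# axis=1 passes each row to the function as a Series.
df['Discounted_Price'] = df.apply(discounted_price, axis=1)
print(df)
```

A named function is easier to test and reuse than a long lambda once the logic grows beyond a single expression.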
Element-wise Operation on Columns
A simpler approach is to perform an element-wise operation directly on the existing column. Pandas applies arithmetic to every value of a column at once, so no explicit function or loop is needed. Here, we apply the discount calculation directly to the **Cost** column.
Python `
import pandas as pd
df = pd.DataFrame({'Date':['10/2/2011', '11/2/2011', '12/2/2011', '13/2/2011'], 'Event':['Music', 'Poetry', 'Theatre', 'Comedy'], 'Cost':[10000, 5000, 15000, 2000]})
df['Discounted_Price'] = df['Cost'] - (0.1 * df['Cost'])
print(df)
`
**Output:**
Date Event Cost Discounted_Price
0 10/2/2011 Music 10000 9000.0
1 11/2/2011 Poetry 5000 4500.0
2 12/2/2011 Theatre 15000 13500.0
3 13/2/2011 Comedy 2000 1800.0
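If the original DataFrame should stay untouched, the same element-wise calculation can be passed to DataFrame.assign(), which returns a new DataFrame with the extra column. This is a small sketch of that variant; the variable name `discounted` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['10/2/2011', '11/2/2011', '12/2/2011', '13/2/2011'],
    'Event': ['Music', 'Poetry', 'Theatre', 'Comedy'],
    'Cost': [10000, 5000, 15000, 2000],
})

# assign() returns a copy with the new column; df itself is not modified.
discounted = df.assign(Discounted_Price=df['Cost'] - (0.1 * df['Cost']))
print(discounted)

# The original DataFrame still has only its three columns.
print(list(df.columns))
```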
Using map() Function for Mapping Values
The **map()** function is useful when you want to map one set of values to another. In this example, we’ll create a new column called **salary_stats** from the **salary** column by passing a mapping function to map().
Python `
import pandas as pd

# Sample data: employee names and their salaries
data = {
    "name": ["John", "Ted", "Dev", "Brad", "Rex", "Smith", "Samuel", "David"],
    "salary": [10000, 20000, 50000, 45500, 19800, 95000, 5000, 50000]
}
df = pd.DataFrame(data)
print(df.head())
`
**Output:**
name salary
0 John 10000
1 Ted 20000
2 Dev 50000
3 Brad 45500
4 Rex 19800
Now, we will create a mapping function (**salary_stats**) and pass it to **Series.map()**, called on the salary column, to create a new column from an existing one.
Python `
def salary_stats(value):
    # Return a descriptive label for the salary range the value falls into.
    if value < 10000:
        return "very low"
    elif 10000 <= value < 25000:
        return "low"
    elif 25000 <= value < 40000:
        return "average"
    elif 40000 <= value < 50000:
        return "better"
    elif value >= 50000:
        return "very good"

# map() calls salary_stats once for each value in the salary column.
df['salary_stats'] = df['salary'].map(salary_stats)
print(df.head())
`
**Output:**
name salary salary_stats
0 John 10000 low
1 Ted 20000 low
2 Dev 50000 very good
3 Brad 45500 better
4 Rex 19800 low
In this example, we created a new column, **salary_stats**, from the salary column using the **map()** function. The function sorts salary values into the groups “very low”, “low”, “average”, “better” and “very good”.
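When the mapping is a set of numeric ranges like this one, pandas.cut() is a possible alternative to a hand-written function. The sketch below is not the method used above; it assumes the same thresholds as salary_stats and uses right=False so that each range includes its lower bound, matching the comparisons in that function. The resulting column has a categorical dtype rather than plain strings.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["John", "Ted", "Dev", "Brad", "Rex", "Smith", "Samuel", "David"],
    "salary": [10000, 20000, 50000, 45500, 19800, 95000, 5000, 50000],
})

# Bin edges mirror the thresholds in salary_stats; right=False makes each
# interval closed on the left, e.g. [10000, 25000).
bins = [-np.inf, 10000, 25000, 40000, 50000, np.inf]
labels = ["very low", "low", "average", "better", "very good"]

df["salary_stats"] = pd.cut(df["salary"], bins=bins, labels=labels, right=False)
print(df.head())
```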