Binary operations on Pandas DataFrame and Series

Last Updated : 2 Jan, 2025

Binary operations involve applying mathematical or logical operations on two objects, typically DataFrames or Series, to produce a new result. Let's learn how binary operations work in Pandas, focusing on their usage with DataFrames and Series.

The most common binary operations include arithmetic (+, -, *, /), comparison (==, !=, >, <), and logical (&, |) operations.

Pandas performs these operations element-wise, pairing values that share the same index label (and, for DataFrames, the same column label), which is particularly useful when working with large datasets.

Binary Operations on Pandas Series

1. **Arithmetic Operations on Series**

Arithmetic operations between two Series are applied element-wise. Pandas aligns the two Series on their index labels before computing, so values are paired by label rather than by position. If a label appears in only one of the Series, the result for that label is NaN.

**Example: Adding Two Series**

```python
import pandas as pd

s1 = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
s2 = pd.Series([1, 2, 3], index=['a', 'b', 'c'])

# Adding the two Series
result = s1 + s2
print(result)
```

Output

    a    11
    b    22
    c    33
    dtype: int64
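When the two indexes only partly overlap, the label alignment described above leaves gaps. The following is a minimal sketch, using made-up Series `s3` and `s4` that are not part of the original example, of how those gaps show up as NaN and how the `fill_value` parameter of `Series.add` can supply a default for a label that is missing on one side.

```python
import pandas as pd

# Hypothetical Series whose indexes only partly overlap
s3 = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
s4 = pd.Series([1, 2, 3], index=['b', 'c', 'd'])

# The + operator aligns on labels; 'a' and 'd' exist on one side only, so they become NaN
plain = s3 + s4
print(plain)

# Series.add with fill_value treats a label missing from one side as 0
filled = s3.add(s4, fill_value=0)
print(filled)
```

Note that the result is indexed by the union of both indexes, and its dtype becomes float64 because NaN is a floating-point value.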

2. **Comparison Operations on Series**

Comparison operations return a Series of boolean values, indicating whether the comparison is True or False for each corresponding element.

**Example: Checking Equality**

```python
import pandas as pd

s1 = pd.Series([10, 20, 30])
s2 = pd.Series([10, 25, 30])

# Comparing the two Series
result = s1 == s2
print(result)
```

Output

    0     True
    1    False
    2     True
    dtype: bool
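Equality is only one of the available comparisons. As a brief sketch reusing the same two Series, the other operators and their method equivalents (`eq`, `ne`, `lt`, `le`, `gt`, `ge`) behave the same way, and the boolean result can serve as a mask for selecting elements. The variable names `less_or_equal` and `changed` are illustrative only.

```python
import pandas as pd

s1 = pd.Series([10, 20, 30])
s2 = pd.Series([10, 25, 30])

# Other comparison operators work the same way, element by element
print(s1 != s2)
print(s1 < s2)

# Method equivalents of the operators: eq, ne, lt, le, gt, ge
less_or_equal = s1.le(s2)
print(less_or_equal)

# A boolean result can be used as a mask to select elements
changed = s1[s1 != s2]
print(changed)
```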

Binary Operations on Pandas DataFrame

1. **Arithmetic Operations on DataFrames**

Similar to Series, DataFrame arithmetic operations apply element-wise between two DataFrames.

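**Note:** Pandas aligns the two DataFrames on both row index and column labels. The operation still runs when the labels differ, but any row or column label present in only one DataFrame yields NaN in the result.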

**Example: Subtracting DataFrames**

```python
import pandas as pd

df1 = pd.DataFrame({'A': [10, 20, 30], 'B': [40, 50, 60]})
df2 = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# Subtracting the DataFrames
result = df1 - df2
print(result)
```

Output

        A   B
    0   9  36
    1  18  45
    2  27  54
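To see what happens when the column labels do not fully match, here is a small sketch with two made-up frames, `left` and `right`, that share only column `B`. It also shows `DataFrame.sub` with `fill_value`, which is one way (not the only one) to avoid the NaN columns.

```python
import pandas as pd

# Hypothetical frames: both have column 'B', but 'A' and 'C' appear in only one of them
left = pd.DataFrame({'A': [10, 20], 'B': [40, 50]})
right = pd.DataFrame({'B': [4, 5], 'C': [7, 8]})

# Columns are aligned by label; 'A' and 'C' have no partner, so they are all NaN
diff = left - right
print(diff)

# DataFrame.sub with fill_value uses 0 for the value missing on one side
diff_filled = left.sub(right, fill_value=0)
print(diff_filled)
```

Whether 0 is a sensible stand-in depends on the data; for subtraction it means a column missing from `left` comes out negated.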

2. **Comparison Operations on DataFrames**

Like Series, comparison operations on DataFrames return a DataFrame of boolean values. These boolean values indicate whether the corresponding elements are equal or satisfy other comparison conditions.

**Example: Checking Greater Than**

```python
import pandas as pd

df1 = pd.DataFrame({'A': [10, 20, 30], 'B': [40, 50, 60]})
df2 = pd.DataFrame({'A': [5, 15, 35], 'B': [30, 60, 55]})

# Checking if elements of df1 are greater than df2
result = df1 > df2
print(result)
```

Output

           A      B
    0   True   True
    1   True  False
    2  False   True
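A boolean DataFrame is often an intermediate step rather than the final answer. The sketch below reuses the same two frames and shows a few common follow-ups: counting matches, testing whether a whole column satisfies the condition, and masking values. The names `mask` and `kept` are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [10, 20, 30], 'B': [40, 50, 60]})
df2 = pd.DataFrame({'A': [5, 15, 35], 'B': [30, 60, 55]})

# gt is the method form of the > operator
mask = df1.gt(df2)

# True counts as 1, so sum() gives the number of True values per column
print(mask.sum())

# all() reports whether every comparison in a column is True
print(mask.all())

# where() keeps values where the mask is True and puts NaN elsewhere
kept = df1.where(mask)
print(kept)
```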

Logical Operations on DataFrame and Series

Pandas also supports element-wise logical operations on boolean DataFrames and Series through the operators & (AND), | (OR), and ~ (NOT). These are commonly used for filtering and applying conditions.

**Example: Logical AND on Series**

```python
import pandas as pd

s1 = pd.Series([True, False, True])
s2 = pd.Series([False, False, True])

# Applying logical AND
result = s1 & s2
print(result)
```

Output

    0    False
    1    False
    2     True
    dtype: bool
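Since the text mentions filtering, here is a minimal sketch of combining conditions to select rows. The DataFrame `df` and the thresholds are made up for illustration.

```python
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({'A': [10, 20, 30], 'B': [40, 50, 60]})

# Each comparison needs its own parentheses because & and | bind more tightly than > and <
both = df[(df['A'] > 10) & (df['B'] < 60)]
print(both)

either = df[(df['A'] > 20) | (df['B'] < 50)]
print(either)

# ~ negates a boolean Series (logical NOT)
not_large = df[~(df['A'] > 10)]
print(not_large)
```

The Python keywords `and`, `or`, and `not` do not work element-wise here; applying them to a Series raises a ValueError because the truth value of a Series is ambiguous.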

Handling Missing Data in Binary Operations

When performing binary operations on DataFrames or Series, missing data (NaN) can affect the results. With the arithmetic operators, NaN propagates: if either operand is NaN at a given position, the result at that position is NaN.

**Example: Arithmetic with NaN**

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, None], 'B': [4, None, 6]})
df2 = pd.DataFrame({'A': [1, None, 3], 'B': [None, 5, 6]})

# Adding the DataFrames
result = df1 + df2
print(result)
```

Output

         A     B
    0  2.0   NaN
    1  NaN   NaN
    2  NaN  12.0

As seen above, where there is missing data (None or NaN), the result becomes NaN.
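If NaN propagation is not what you want, one option is to supply a default for the missing values. The sketch below reuses the same two frames and shows two possible approaches, `DataFrame.add` with `fill_value` and `fillna` before the addition. Treating a missing value as 0 is an assumption that may not suit every dataset.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, None], 'B': [4, None, 6]})
df2 = pd.DataFrame({'A': [1, None, 3], 'B': [None, 5, 6]})

# fill_value=0 stands in for a NaN when only one of the two operands is missing
filled_sum = df1.add(df2, fill_value=0)
print(filled_sum)

# Alternatively, replace NaN up front; this also covers positions where both sides are missing
prefilled_sum = df1.fillna(0) + df2.fillna(0)
print(prefilled_sum)
```

In this data no position is missing in both frames, so the two results match. With `fill_value`, a position that is NaN in both operands stays NaN, whereas the `fillna` approach turns it into 0.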

By leveraging these operations, you can perform complex calculations, comparisons, and transformations on your data, making Pandas a powerful tool for data analysis.