Pandas DataFrame.loc[] Method

Last Updated : 01 Dec, 2023

A Pandas DataFrame is a two-dimensional, size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns). Arithmetic operations align on both row and column labels. It can be thought of as a dict-like container for Series objects, and it is the primary data structure of Pandas.

Pandas DataFrame loc[] Syntax

Pandas DataFrame.loc attribute accesses a group of rows and columns by label(s) or a boolean array in the given Pandas DataFrame.

**Syntax:** DataFrame.loc

**Parameter:** None

**Returns:** Scalar, Series, DataFrame
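
Which of the three return types listed above you get depends on the indexers passed to loc[]. The sketch below illustrates this on a small frame whose labels and values are made up for this example; it is only meant to show the general pattern, not every case.

```python
import pandas as pd

# Small illustrative frame; the labels and values are assumptions for this sketch
df = pd.DataFrame(
    {"Name": ["Sam", "Andrea", "Alex"], "Age": [14, 25, 55]},
    index=["Row_1", "Row_2", "Row_3"],
)

# One row label and one column label -> a scalar
cell = df.loc["Row_2", "Name"]

# One row label only -> a Series indexed by the column names
row = df.loc["Row_2"]

# A list of row labels -> a DataFrame, even if the list has one element
frame = df.loc[["Row_2"]]

print(type(cell).__name__)
print(type(row).__name__)
print(type(frame).__name__)
```

Wrapping a single label in a list is a common way to keep the result two-dimensional when later code expects a DataFrame.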

Pandas DataFrame loc Property

The following examples show how to use Pandas DataFrame loc[]:

**Example 1: Select a Single Row and Column By Label using loc[]**

Use the DataFrame.loc attribute to access a particular cell in the given Pandas DataFrame by passing its row (index) label and its column label.

```python
# Import pandas as pd
import pandas as pd

# Create the DataFrame
df = pd.DataFrame({'Weight': [45, 88, 56, 15, 71],
                   'Name': ['Sam', 'Andrea', 'Alex', 'Robin', 'Kia'],
                   'Age': [14, 25, 55, 8, 21]})

# Create the index
index_ = ['Row_1', 'Row_2', 'Row_3', 'Row_4', 'Row_5']

# Set the index
df.index = index_

# Print the DataFrame
print("Original DataFrame:")
print(df)

# Select a specific cell by row label and column label using loc
result = df.loc['Row_2', 'Name']

# Print the result
print("\nSelected Value at Row_2, Column 'Name':")
print(result)
```

**Output**

Original DataFrame:
       Weight    Name  Age
Row_1      45     Sam   14
Row_2      88  Andrea   25
Row_3      56    Alex   55
Row_4      15   Robin    8
Row_5      71     Kia   21

Selected Value at Row_2, Column 'Name':
Andrea

**Example 2: Select Multiple Rows and Columns**

Use the DataFrame.loc attribute to return two of the columns, 'A' and 'D', from the given DataFrame. A bare colon (:) as the row indexer keeps every row, and the list of column labels picks the columns to return.

```python
# Import pandas as pd
import pandas as pd

# Create the DataFrame
df = pd.DataFrame({"A": [12, 4, 5, None, 1],
                   "B": [7, 2, 54, 3, None],
                   "C": [20, 16, 11, 3, 8],
                   "D": [14, 3, None, 2, 6]})

# Create the index
index_ = ['Row_1', 'Row_2', 'Row_3', 'Row_4', 'Row_5']

# Set the index
df.index = index_

# Print the DataFrame
print("Original DataFrame:")
print(df)

# Select all rows and the columns 'A' and 'D'
result = df.loc[:, ['A', 'D']]

# Print the result
print("\nSelected Columns 'A' and 'D':")
print(result)
```

**Output**

Original DataFrame:
          A     B   C     D
Row_1  12.0   7.0  20  14.0
Row_2   4.0   2.0  16   3.0
Row_3   5.0  54.0  11   NaN
Row_4   NaN   3.0   3   2.0
Row_5   1.0   NaN   8   6.0

Selected Columns 'A' and 'D':
          A     D
Row_1  12.0  14.0
Row_2   4.0   3.0
Row_3   5.0   NaN
Row_4   NaN   2.0
Row_5   1.0   6.0
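
To restrict the rows as well as the columns, a list of row labels can be passed in place of the colon. The following is a minimal sketch that rebuilds the same DataFrame; the variable name subset is chosen only for illustration.

```python
import pandas as pd

# Same data as in the example above, with the index set at construction time
df = pd.DataFrame(
    {"A": [12, 4, 5, None, 1],
     "B": [7, 2, 54, 3, None],
     "C": [20, 16, 11, 3, 8],
     "D": [14, 3, None, 2, 6]},
    index=["Row_1", "Row_2", "Row_3", "Row_4", "Row_5"],
)

# A list of row labels and a list of column labels select a sub-table
subset = df.loc[["Row_1", "Row_3"], ["A", "D"]]
print(subset)
```

The result keeps the rows and columns in the order given in the two lists.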

**Example 3: Select Between Two Rows or Columns**

In this example, we create a pandas DataFrame named 'df', set custom row labels, and then use the loc accessor to select the rows from 'Row_2' through 'Row_4' and the columns 'B' through 'D'. Label slices with loc include both endpoints, unlike ordinary positional slices in Python.

```python
# Import pandas as pd
import pandas as pd

# Create the DataFrame
df = pd.DataFrame({"A": [12, 4, 5, None, 1],
                   "B": [7, 2, 54, 3, None],
                   "C": [20, 16, 11, 3, 8],
                   "D": [14, 3, None, 2, 6]})

# Create the index
index_ = ['Row_1', 'Row_2', 'Row_3', 'Row_4', 'Row_5']

# Set the index
df.index = index_

# Print the original DataFrame
print("Original DataFrame:")
print(df)

# Select rows between 'Row_2' and 'Row_4' (both inclusive)
selected_rows = df.loc['Row_2':'Row_4']
print("\nSelected Rows:")
print(selected_rows)

# Select columns 'B' through 'D' (both inclusive)
selected_columns = df.loc[:, 'B':'D']
print("\nSelected Columns:")
print(selected_columns)
```

**Output**

Original DataFrame:
          A     B   C     D
Row_1  12.0   7.0  20  14.0
Row_2   4.0   2.0  16   3.0
Row_3   5.0  54.0  11   NaN
Row_4   NaN   3.0   3   2.0
Row_5   1.0   NaN   8   6.0

Selected Rows:
         A     B   C    D
Row_2  4.0   2.0  16  3.0
Row_3  5.0  54.0  11  NaN
Row_4  NaN   3.0   3  2.0

Selected Columns:
          B   C     D
Row_1   7.0  20  14.0
Row_2   2.0  16   3.0
Row_3  54.0  11   NaN
Row_4   3.0   3   2.0
Row_5   NaN   8   6.0
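
The row slice and the column slice can also be combined in a single loc[] call. The sketch below does that on the same data and compares the result with the equivalent positional selection through iloc, to make the inclusive endpoints visible; the variable names block and same_block are illustrative only.

```python
import pandas as pd

# Same data as in the example above
df = pd.DataFrame(
    {"A": [12, 4, 5, None, 1],
     "B": [7, 2, 54, 3, None],
     "C": [20, 16, 11, 3, 8],
     "D": [14, 3, None, 2, 6]},
    index=["Row_1", "Row_2", "Row_3", "Row_4", "Row_5"],
)

# Label slices on both axes: 'Row_4' and 'D' are included in the result
block = df.loc["Row_2":"Row_4", "B":"D"]

# Positional slices exclude the stop position, so 1:4 covers positions 1, 2, 3
same_block = df.iloc[1:4, 1:4]

print(block)
print(block.equals(same_block))
```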

**Example 4: Select Alternate Rows or Columns**

In this example, we create a pandas DataFrame named 'df', set custom row labels, and then use the iloc accessor to select alternate rows (every second row) and alternate columns (every second column). Note that iloc indexes by integer position rather than by label, so this example shows the positional counterpart of loc.

```python
# Import pandas as pd
import pandas as pd

# Create the DataFrame
df = pd.DataFrame({"A": [12, 4, 5, None, 1],
                   "B": [7, 2, 54, 3, None],
                   "C": [20, 16, 11, 3, 8],
                   "D": [14, 3, None, 2, 6]})

# Create the index
index_ = ['Row_1', 'Row_2', 'Row_3', 'Row_4', 'Row_5']

# Set the index
df.index = index_

# Print the original DataFrame
print("Original DataFrame:")
print(df)

# Select alternate rows (every second row)
alternate_rows = df.iloc[::2]
print("\nAlternate Rows:")
print(alternate_rows)

# Select alternate columns (every second column)
alternate_columns = df.iloc[:, ::2]
print("\nAlternate Columns:")
print(alternate_columns)
```

**Output**

Original DataFrame:
          A     B   C     D
Row_1  12.0   7.0  20  14.0
Row_2   4.0   2.0  16   3.0
Row_3   5.0  54.0  11   NaN
Row_4   NaN   3.0   3   2.0
Row_5   1.0   NaN   8   6.0

Alternate Rows:
          A     B   C     D
Row_1  12.0   7.0  20  14.0
Row_3   5.0  54.0  11   NaN
Row_5   1.0   NaN   8   6.0

Alternate Columns:
          A   C
Row_1  12.0  20
Row_2   4.0  16
Row_3   5.0  11
Row_4   NaN   3
Row_5   1.0   8
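
Since this article is about loc, it may help to see that a step slice is also accepted by loc. The sketch below assumes the same data and checks that the loc results match the iloc results; the names rows_loc and cols_loc are illustrative only.

```python
import pandas as pd

# Same data as in the example above
df = pd.DataFrame(
    {"A": [12, 4, 5, None, 1],
     "B": [7, 2, 54, 3, None],
     "C": [20, 16, 11, 3, 8],
     "D": [14, 3, None, 2, 6]},
    index=["Row_1", "Row_2", "Row_3", "Row_4", "Row_5"],
)

# Every second row, using a step slice with loc
rows_loc = df.loc[::2]

# Every second column, using a step slice with loc
cols_loc = df.loc[:, ::2]

# Both selections match their iloc counterparts
print(rows_loc.equals(df.iloc[::2]))
print(cols_loc.equals(df.iloc[:, ::2]))
```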

**Example 5: Using Conditions with Pandas loc**

In this example, we create a pandas DataFrame named 'df', set custom row labels, and use the loc accessor to select rows based on conditions. A condition such as df['A'] > 5 produces a boolean Series, and loc keeps the rows where it is True. The example selects the rows where column 'A' is greater than 5 and the rows where column 'B' is not null.

```python
# Import pandas as pd
import pandas as pd

# Create the DataFrame
df = pd.DataFrame({"A": [12, 4, 5, None, 1],
                   "B": [7, 2, 54, 3, None],
                   "C": [20, 16, 11, 3, 8],
                   "D": [14, 3, None, 2, 6]})

# Create the index
index_ = ['Row_1', 'Row_2', 'Row_3', 'Row_4', 'Row_5']

# Set the index
df.index = index_

# Print the original DataFrame
print("Original DataFrame:")
print(df)

# Using conditions with loc
# Example: select rows where column 'A' is greater than 5
selected_rows = df.loc[df['A'] > 5]
print("\nRows where column 'A' is greater than 5:")
print(selected_rows)

# Example: select rows where column 'B' is not null
non_null_rows = df.loc[df['B'].notnull()]
print("\nRows where column 'B' is not null:")
print(non_null_rows)
```

**Output**

Original DataFrame:
          A     B   C     D
Row_1  12.0   7.0  20  14.0
Row_2   4.0   2.0  16   3.0
Row_3   5.0  54.0  11   NaN
Row_4   NaN   3.0   3   2.0
Row_5   1.0   NaN   8   6.0

Rows where column 'A' is greater than 5:
          A    B   C     D
Row_1  12.0  7.0  20  14.0

Rows where column 'B' is not null:
          A     B   C     D
Row_1  12.0   7.0  20  14.0
Row_2   4.0   2.0  16   3.0
Row_3   5.0  54.0  11   NaN
Row_4   NaN   3.0   3   2.0
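
Conditions can also be combined, and a column selection can be added in the same loc[] call. The following is a minimal sketch on the same data; the threshold 3 and the names mask and result are chosen only for illustration. Each condition is wrapped in parentheses where needed and joined with & (element-wise AND), and a comparison against NaN evaluates to False, so rows with a missing 'A' drop out.

```python
import pandas as pd

# Same data as in the example above
df = pd.DataFrame(
    {"A": [12, 4, 5, None, 1],
     "B": [7, 2, 54, 3, None],
     "C": [20, 16, 11, 3, 8],
     "D": [14, 3, None, 2, 6]},
    index=["Row_1", "Row_2", "Row_3", "Row_4", "Row_5"],
)

# Boolean mask: 'A' greater than 3 AND 'B' not null
mask = (df["A"] > 3) & df["B"].notna()

# Keep the matching rows and only the columns 'A' and 'B'
result = df.loc[mask, ["A", "B"]]
print(result)
```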