How to Select Rows & Columns by Name or Index in Pandas DataFrame Using loc and iloc

Last Updated : 28 Nov, 2024

Selecting specific rows and columns is a core task when working with a Pandas DataFrame, whether you refer to data by label or by position. In this article, we'll focus on two pandas indexers, **loc** and **iloc**, that let you select rows and columns either by their labels (names) or by their integer positions.

Let's see a basic example to understand both indexers:

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['New York', 'Los Angeles', 'Chicago']}
df = pd.DataFrame(data)

# Using loc (label-based)
result_loc = df.loc[0, 'Name']  # Select value at row label 0 and column 'Name'

# Using iloc (position-based)
result_iloc = df.iloc[1, 2]  # Select value at row position 1 and column position 2

print("Using loc:", result_loc)
print("Using iloc:", result_iloc)
```

**Output:**

Using loc: Alice
Using iloc: Los Angeles

**Selecting Rows and Columns Using .loc[] (Label-Based Indexing)**

The .loc[] indexer selects data by label (the names of rows or columns). It supports selecting a single row or column, multiple rows or columns, label slices, and rows that match a condition.
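Label-based selection is easiest to see when the index is not the default 0, 1, 2. The following is a minimal sketch, assuming a hypothetical frame `people` that uses the names as row labels (this frame is not part of the article's running example):

```python
import pandas as pd

# Hypothetical frame indexed by name instead of the default integer index
people = pd.DataFrame(
    {"Age": [25, 30, 35], "City": ["New York", "Los Angeles", "Chicago"]},
    index=["Alice", "Bob", "Charlie"],
)

bob_age = people.loc["Bob", "Age"]                # single value by row and column label
cities = people.loc[["Alice", "Charlie"], "City"]  # several row labels, one column

print(bob_age)
print(cities)
```

Here `people.loc["Bob", "Age"]` looks up the row and column by name, so the result does not depend on where "Bob" sits in the frame.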

**Key Features of .loc[]:**

**Select a Single Row by Label:**

```python
row = df.loc[0]  # Select the row with index label 0 (the first row here)
print(row)
```

**Output:**

Name       Alice
Age           25
City    New York
Name: 0, dtype: object

**Select Multiple Rows by Labels:**

```python
rows = df.loc[[0, 2]]  # Select rows with index labels 0 and 2
print(rows)
```

**Output:**

      Name  Age      City
0    Alice   25  New York
2  Charlie   35   Chicago

**Select Specific Rows and Columns:**

```python
# Select rows labeled 0 through 1 (end label included) and two columns by name
subset = df.loc[0:1, ['Name', 'City']]
print(subset)
```

**Output:**

    Name         City
0  Alice     New York
1    Bob  Los Angeles

**Filter Rows Based on Conditions:**

```python
filtered = df.loc[df['Age'] > 25]  # Select rows where Age > 25
print(filtered)
```

**Output:**

      Name  Age         City
1      Bob   30  Los Angeles
2  Charlie   35      Chicago
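A condition can also be combined with a column selection, or with a second condition. The sketch below rebuilds the same `df` so it runs on its own; the variable names `older` and `older_not_chicago` are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "Los Angeles", "Chicago"],
})

# Rows where Age > 25, keeping only the Name and City columns
older = df.loc[df["Age"] > 25, ["Name", "City"]]

# Two conditions: wrap each in parentheses and join them with & (and) or | (or)
older_not_chicago = df.loc[(df["Age"] > 25) & (df["City"] != "Chicago")]

print(older)
print(older_not_chicago)
```

The parentheses around each condition are required because `&` and `|` bind more tightly than comparison operators in Python.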

**Selecting Rows and Columns Using .iloc[] (Integer-Position Based Indexing)**

The **.iloc[]** indexer selects data by integer position, counting from 0. It is particularly useful when you don't know the labels but know the positions.
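Because positions behave like ordinary Python sequence indexes, negative positions and slices also work. A minimal sketch, rebuilding the same `df` so it runs on its own (the names `last_row` and `top_left` are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "Los Angeles", "Chicago"],
})

last_row = df.iloc[-1]        # negative positions count from the end
top_left = df.iloc[0:2, 0:2]  # first two rows and first two columns

print(last_row)
print(top_left)
```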

**Key Features of .iloc[]:**

**Select a Single Row by Position:**

```python
row = df.iloc[1]  # Select the second row (index position = 1)
print(row)
```

**Output:**

Name            Bob
Age              30
City    Los Angeles
Name: 1, dtype: object

**Select Multiple Rows by Positions:**

```python
rows = df.iloc[[0, 2]]  # Select first and third rows by position
print(rows)
```

**Output:**

      Name  Age      City
0    Alice   25  New York
2  Charlie   35   Chicago

**Select Specific Rows and Columns:**

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['New York', 'Los Angeles', 'Chicago']}
df = pd.DataFrame(data)

# Rows at positions 0 and 2, column at position 1 ('Age')
subset = df.iloc[[0, 2], [1]]
print(subset)
```

**Output:**

   Age
0   25
2   35
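The result above is a one-column DataFrame because the column selector `[1]` is a list. As a rough illustration, passing a plain integer instead of a list drops that dimension (the names `as_frame`, `as_series`, and `as_scalar` are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "Los Angeles", "Chicago"],
})

as_frame = df.iloc[[0, 2], [1]]  # list of column positions -> DataFrame
as_series = df.iloc[[0, 2], 1]   # single column position   -> Series
as_scalar = df.iloc[0, 1]        # single row and column    -> one value

print(type(as_frame).__name__, as_frame.shape)
print(type(as_series).__name__, as_series.shape)
print(as_scalar)
```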

**Key Differences Between loc and iloc**

Although both **loc** and **iloc** allow row and column selection, they differ in how they handle indexes:

| **Feature** | **loc** | **iloc** |
|---|---|---|
| **Indexing Basis** | Label-based (uses row/column labels) | Position-based (uses integer positions) |
| **Inclusiveness** | Inclusive of both start and end points when slicing | Exclusive of the end point when slicing |
| **Row Selection** | Works with index labels (could be strings) | Works with integer positions (0, 1, 2, ...) |
| **Column Selection** | Works with column labels (can be strings) | Works with integer positions (0, 1, 2, ...) |
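The inclusiveness row is the difference most likely to cause off-by-one surprises. A small sketch on the same `df`, rebuilt here so it runs on its own, shows that the same slice `0:1` returns a different number of rows with each indexer:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "Los Angeles", "Chicago"],
})

by_label = df.loc[0:1]      # labels 0 through 1, end label included
by_position = df.iloc[0:1]  # position 0 only, end position excluded

print(len(by_label))     # 2
print(len(by_position))  # 1
```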

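Use loc when you know the row or column labels and want to select by name, or when you want to filter rows with a condition.

Use iloc when you know only the integer positions, or when the position itself is what matters (for example, the first or last row).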
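With the default index 0, 1, 2, labels and positions coincide, which can hide the difference between the two indexers. A minimal sketch, assuming a hypothetical frame `scores` whose index labels are 10, 20, and 30:

```python
import pandas as pd

# Hypothetical frame whose integer labels do not match the positions
scores = pd.DataFrame({"Score": [88, 92, 79]}, index=[10, 20, 30])

first_by_label = scores.loc[10, "Score"]  # row with label 10
first_by_position = scores.iloc[0, 0]     # row at position 0, column at position 0

# There is no row labeled 0, so a label lookup for 0 fails
try:
    scores.loc[0]
    label_zero_exists = True
except KeyError:
    label_zero_exists = False

print(first_by_label, first_by_position, label_zero_exists)
```

Both lookups reach the same first row, but only `iloc` can address it as 0; `loc[0]` raises a `KeyError` because 0 is not one of the labels.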

Both loc and iloc are incredibly useful tools for selecting specific data in a Pandas DataFrame. The key difference is whether you're selecting by label (loc) or index position (iloc). Understanding how and when to use these methods is essential for efficient data manipulation.