How to Select Rows & Columns by Name or Index in Pandas DataFrame Using loc and iloc

Last Updated : 28 Nov, 2024

Selecting specific rows and columns is a core task when working with a Pandas DataFrame, whether you refer to the data by label or by position. This article focuses on two pandas indexers, **loc** and **iloc**, which select rows and columns by their labels (names) or by their integer positions, respectively.

Let's start with a basic example of both indexers:

Python `

import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35], 'City': ['New York', 'Los Angeles', 'Chicago']}
df = pd.DataFrame(data)

# Using loc (label-based)
result_loc = df.loc[0, 'Name']  # Select value at row label 0 and column 'Name'

# Using iloc (position-based)
result_iloc = df.iloc[1, 2]  # Select value at row position 1 and column position 2

print("Using loc:", result_loc)
print("Using iloc:", result_iloc)

**Output:**

Using loc: Alice
Using iloc: Los Angeles

**Selecting Rows and Columns Using .loc[] (Label-Based Indexing)**

The .loc[] indexer selects data based on labels (names of rows or columns). It can select a single row or column, multiple rows or columns, or a specific subset of both.

**Key Features of .loc[]:**

Label-based indexing.
Can select both rows and columns simultaneously.
Supports label slicing (the end label is included) and boolean filtering.
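The features above are easier to see when the index labels are not integers. The following is a minimal sketch, assuming a hypothetical DataFrame `df_ids` whose rows are labeled with the strings "a", "b" and "c" (these labels are not part of the original example):

```python
import pandas as pd

# Hypothetical data: rows are labeled with short string IDs instead of 0, 1, 2.
df_ids = pd.DataFrame(
    {"Name": ["Alice", "Bob", "Charlie"], "Age": [25, 30, 35]},
    index=["a", "b", "c"],
)

# Single row label: returns that row as a Series.
row_b = df_ids.loc["b"]

# Label slice: both endpoints are included, so rows "a" and "b" are returned.
first_two = df_ids.loc["a":"b"]

# Row label plus column label: returns a single value.
age_c = df_ids.loc["c", "Age"]

print(row_b)
print(first_two)
print(age_c)
```

Because .loc[] looks up labels, `df_ids.loc[0]` would raise a `KeyError` here: no row is labeled 0.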

**Select a Single Row by Label:**

Python `

row = df.loc[0]  # Select the row with index label 0
print(row)

**Output:**

Name       Alice
Age           25
City    New York
Name: 0, dtype: object

**Select Multiple Rows by Labels:**

Python `

rows = df.loc[[0, 2]]  # Select rows with index labels 0 and 2
print(rows)

**Output:**

      Name  Age      City
0    Alice   25  New York
2  Charlie   35   Chicago

**Select Specific Rows and Columns:**

Python `

subset = df.loc[0:1, ['Name', 'City']]  # Rows labeled 0 through 1 (inclusive), columns 'Name' and 'City'
print(subset)

**Output:**

    Name         City
0  Alice     New York
1    Bob  Los Angeles

**Filter Rows Based on Conditions:**

Python `

filtered = df.loc[df['Age'] > 25]  # Select rows where Age > 25
print(filtered)

**Output:**

      Name  Age         City
1      Bob   30  Los Angeles
2  Charlie   35      Chicago
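Filtering can also be combined with column selection in a single .loc[] call. The sketch below rebuilds the article's DataFrame and uses an illustrative second condition on `City`; the variable names `mask` and `names_and_cities` are chosen here for the example only:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "Los Angeles", "Chicago"],
})

# Combine conditions with & (and) or | (or).
# Each condition needs its own parentheses.
mask = (df["Age"] > 25) & (df["City"] != "Chicago")

# Boolean mask selects the rows, the list of labels selects the columns.
names_and_cities = df.loc[mask, ["Name", "City"]]
print(names_and_cities)
```

Only Bob satisfies both conditions, so the result has a single row that keeps its original index label 1.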

**Selecting Rows and Columns Using .iloc[] (Integer-Position Based Indexing)**

The **.iloc[] indexer** selects data based on zero-based integer positions, regardless of the index labels. It is particularly useful when you know where the data sits but not what its labels are.

**Key Features of .iloc[]:**

Uses integer positions (0, 1, 2, ...) to index rows and columns.
Like .loc[], it accepts a single value, a list, or a slice.
Supports slicing with the same semantics as Python lists.
Unlike .loc[], slices exclude the end position.
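The difference in slice endpoints is easy to miss when the index is the default 0, 1, 2, because the same slice expression is valid for both indexers. A minimal sketch using the article's DataFrame (the names `by_position` and `by_label` are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "Los Angeles", "Chicago"],
})

# iloc: positions 0 and 1 only; the end position 2 is excluded.
by_position = df.iloc[0:2]

# loc: labels 0, 1 and 2; the end label 2 is included.
by_label = df.loc[0:2]

print(len(by_position))  # 2
print(len(by_label))     # 3
```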

**Select a Single Row by Position:**

Python `

row = df.iloc[1]  # Select the second row (position 1)
print(row)

**Output:**

Name            Bob
Age              30
City    Los Angeles
Name: 1, dtype: object

**Select Multiple Rows by Positions:**

Python `

rows = df.iloc[[0, 2]]  # Select the first and third rows by position
print(rows)

**Output:**

      Name  Age      City
0    Alice   25  New York
2  Charlie   35   Chicago

**Select Specific Rows and Columns:**

Python `

import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35], 'City': ['New York', 'Los Angeles', 'Chicago']}
df = pd.DataFrame(data)

subset = df.iloc[[0, 2], [1]]  # Rows at positions 0 and 2, column at position 1 ('Age')
print(subset)

**Output:**

   Age
0   25
2   35
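.iloc[] also accepts slices on both axes and negative positions, which count from the end as they do for Python lists. The following sketch rebuilds the same DataFrame; the variable names are illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "Los Angeles", "Chicago"],
})

# Rows at positions 0-1 and columns at positions 0-1 ('Name' and 'Age').
top_left = df.iloc[0:2, 0:2]

# Negative positions count from the end: -1 is the last row.
last_row = df.iloc[-1]

# A bare colon keeps every row; -1 picks the last column ('City').
last_column = df.iloc[:, -1]

print(top_left)
print(last_row)
print(last_column)
```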

Key Differences Between `loc` and `iloc`

Although both loc and iloc allow row and column selection, they differ in how they handle indexes:

| **Feature** | **loc** | **iloc** |
| --- | --- | --- |
| **Indexing Basis** | Label-based (uses row/column labels) | Position-based (uses integer positions) |
| **Inclusiveness** | Slices include both the start and end labels | Slices exclude the end position |
| **Row Selection** | Works with index labels (could be strings) | Works with integer positions (0, 1, 2, ...) |
| **Column Selection** | Works with column labels (can be strings) | Works with integer positions (0, 1, 2, ...) |
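The distinction matters most when integer labels no longer match positions, for example after filtering. In the sketch below (variable names are illustrative), the same integer 1 refers to different rows depending on the indexer:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "Los Angeles", "Chicago"],
})

# After filtering, the remaining rows keep their original labels: 1 and 2.
older = df.loc[df["Age"] > 25]

# loc treats 1 as a label: the row labeled 1 is Bob.
by_label = older.loc[1, "Name"]

# iloc treats 1 as a position: the second remaining row is Charlie.
by_position = older.iloc[1, 0]

print(by_label)     # Bob
print(by_position)  # Charlie
```

Calling `older.reset_index(drop=True)` would renumber the labels from 0 so that labels and positions line up again.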

Use `loc` when:

You have specific labels for rows or columns and want to work with them directly.
Your dataset has non-numeric indexes (e.g., strings, datetime).
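As an illustration of the datetime case, here is a minimal sketch with hypothetical daily readings; the `readings` DataFrame, its `Temp` column and its values are invented for this example:

```python
import pandas as pd

# Hypothetical data: four daily temperature readings indexed by date.
dates = pd.date_range("2024-01-01", periods=4, freq="D")
readings = pd.DataFrame({"Temp": [20.5, 21.0, 19.8, 22.1]}, index=dates)

# Select a single value by its timestamp label and column label.
jan_2 = readings.loc[pd.Timestamp("2024-01-02"), "Temp"]

# Label slice over dates; as with any loc slice, the end date is included.
jan_2_to_3 = readings.loc["2024-01-02":"2024-01-03"]

print(jan_2)
print(jan_2_to_3)
```

Selecting by date this way keeps the code independent of how many rows precede the dates of interest.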

Use `iloc` when:

You are working with numeric positions and don’t need to reference row/column names directly.
You want to select data based on positions within the structure.

Both loc and iloc are the standard tools for selecting specific data in a Pandas DataFrame. The key difference is whether you select by label (loc) or by integer position (iloc). Knowing which one applies helps you avoid off-by-one slices and lookups that silently return the wrong row.
