How to Select Rows & Columns by Name or Index in Pandas Dataframe Using loc and iloc
Last Updated : 28 Nov, 2024
Selecting specific rows and columns is a routine task when working with a Pandas DataFrame, whether you are referencing labeled data or specific positions. In this article, we’ll focus on two pandas tools, **loc** and **iloc**, which let you select rows and columns either by their labels (names) or by their integer positions (indexes).
Let's look at a basic example to understand both:
Python `
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['New York', 'Los Angeles', 'Chicago']}
df = pd.DataFrame(data)

# Using loc (label-based)
result_loc = df.loc[0, 'Name']  # Select value at row label 0 and column 'Name'

# Using iloc (position-based)
result_iloc = df.iloc[1, 2]  # Select value at row position 1 and column position 2
print("Using loc:", result_loc)
print("Using iloc:", result_iloc)
`
**Output:**
Using loc: Alice
Using iloc: Los Angeles
**Selecting Rows and Columns Using .loc[] (Label-Based Indexing)**
The .loc[] method selects data based on labels (names of rows or columns). It supports selecting single rows or columns, multiple rows or columns, and specific subsets.
**Key Features of .loc[]:**
- Label-based indexing.
- Can select both rows and columns simultaneously.
- Supports slicing and filtering.
**Select a Single Row by Label:**
Python `
row = df.loc[0]  # Select the row with index label 0 (the first row here)
print(row)
`
**Output:**
Name Alice
Age 25
City New York
Name: 0, dtype: object
**Select Multiple Rows by Labels:**
Python `
rows = df.loc[[0, 2]]  # Select rows with index labels 0 and 2
print(rows)
`
**Output:**
      Name  Age      City
0    Alice   25  New York
2  Charlie   35   Chicago
**Select Specific Rows and Columns:**
Python `
# Select rows labeled 0 through 1 (end label included) and specific columns
subset = df.loc[0:1, ['Name', 'City']]
print(subset)
`
**Output:**
    Name         City
0  Alice     New York
1    Bob  Los Angeles
**Filter Rows Based on Conditions:**
Python `
filtered = df.loc[df['Age'] > 25]  # Select rows where Age > 25
print(filtered)
`
**Output:**
      Name  Age         City
1      Bob   30  Los Angeles
2  Charlie   35      Chicago
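The examples above all use the default integer index, where labels happen to match positions. As a minimal sketch of label-based selection with non-numeric labels, the snippet below assigns the hypothetical row labels 'a', 'b', and 'c' (not part of the original data) and shows that label slices include both endpoints and that a condition can be combined with a column selection.

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['New York', 'Los Angeles', 'Chicago']}

# Assumption for illustration: string labels 'a', 'b', 'c' replace the default 0, 1, 2
df_labeled = pd.DataFrame(data, index=['a', 'b', 'c'])

# Label slices include both endpoints, for rows and for columns
first_two = df_labeled.loc['a':'b', 'Name':'Age']

# A boolean condition can be combined with a column label
older_names = df_labeled.loc[df_labeled['Age'] > 25, 'Name']

print(first_two)
print(older_names.tolist())
```

Here `first_two` holds rows 'a' and 'b' with the columns Name and Age, and `older_names` holds the names of the rows where Age is greater than 25.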
**Selecting Rows and Columns Using .iloc[] (Integer-Position Based Indexing)**
The **.iloc[]** method selects data based on integer positions (index numbers). It is particularly useful when you don’t know the labels but know the positions.
**Key Features of .iloc[]:**
- Uses integer positions (0, 1, 2, ...) to index rows and columns.
- Just like .loc[], you can pass a range or a list of indices.
- Supports slicing, similar to Python lists.
- Unlike .loc[], slicing is end-exclusive: the end position is not included in the result.
**Select a Single Row by Position:**
Python `
row = df.iloc[1]  # Select the second row (index position = 1)
print(row)
`
**Output:**
Name Bob
Age 30
City Los Angeles
Name: 1, dtype: object
**Select Multiple Rows by Positions:**
Python `
rows = df.iloc[[0, 2]]  # Select first and third rows by position
print(rows)
`
**Output:**
      Name  Age      City
0    Alice   25  New York
2  Charlie   35   Chicago
**Select Specific Rows and Columns:**
Python `
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35], 'City': ['New York', 'Los Angeles', 'Chicago']}
df = pd.DataFrame(data)

subset = df.iloc[[0, 2], [1]]  # First and third rows, second column (Age)
print(subset)
`
**Output:**
Age
0 25
2 35
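The feature list mentions slicing, which the examples above do not show. A short sketch, reusing the same sample data, of how position slices and negative positions behave with .iloc[]:

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['New York', 'Los Angeles', 'Chicago']}
df = pd.DataFrame(data)

# Position slices exclude the end: rows 0 and 1, columns 0 and 1
first_two = df.iloc[0:2, 0:2]

# Negative positions count from the end, as with Python lists
last_row = df.iloc[-1]

print(first_two)
print(last_row['Name'])
```

The slice `0:2` returns two rows rather than three, and `df.iloc[-1]` returns the row for Charlie.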
Key Differences Between loc and iloc
Although both **loc** and **iloc** allow row and column selection, they differ in how they handle indexes:
| **Feature** | **loc** | **iloc** |
|---|---|---|
| **Indexing Basis** | Label-based (uses row/column labels) | Position-based (uses integer positions) |
| **Inclusiveness** | Slices include both the start and end labels | Slices exclude the end position |
| **Row Selection** | Works with index labels (can be strings) | Works with integer positions (0, 1, 2, ...) |
| **Column Selection** | Works with column labels (can be strings) | Works with integer positions (0, 1, 2, ...) |
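The differences in the table become visible once labels and positions stop lining up, for example after filtering. The following is a minimal sketch using the same sample data; the variable names are illustrative only.

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['New York', 'Los Angeles', 'Chicago']}
df = pd.DataFrame(data)

# Slicing: loc includes the end label, iloc excludes the end position
loc_rows = df.loc[0:1]    # labels 0 and 1 -> two rows
iloc_rows = df.iloc[0:1]  # position 0 only -> one row

# Filtering keeps the original labels, so the remaining rows are labeled 1 and 2
filtered = df[df['Age'] > 25]

by_label = filtered.loc[1, 'Name']  # row labeled 1 -> 'Bob'
by_position = filtered.iloc[1, 0]   # second remaining row, first column -> 'Charlie'

# Label 0 was filtered out, so loc raises KeyError for it
try:
    filtered.loc[0]
    label_zero_found = True
except KeyError:
    label_zero_found = False

print(len(loc_rows), len(iloc_rows))
print(by_label, by_position, label_zero_found)
```

After filtering, the same number `1` refers to different rows: loc treats it as a label and iloc treats it as a position.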
Use loc when:
- You have specific labels for rows or columns and want to work with them directly.
- Your dataset has non-numeric indexes (e.g., strings, datetime).
Use iloc when:
- You are working with numeric positions and don’t need to reference row/column names directly.
- You want to select data based on positions within the structure.
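As a rough illustration of the guidance above, the sketch below builds a small DataFrame with a datetime index. The `sales` table, its dates, and its `Units` column are made up for this example. loc is used to look up a row by its date label, and iloc to take the last two rows regardless of their labels.

```python
import pandas as pd

# Hypothetical data: three daily records indexed by date
dates = pd.to_datetime(['2024-01-01', '2024-01-02', '2024-01-03'])
sales = pd.DataFrame({'Units': [5, 8, 3]}, index=dates)

# loc: look up a value by its datetime label
units_jan2 = sales.loc[pd.Timestamp('2024-01-02'), 'Units']

# iloc: take the last two rows by position, whatever their labels are
last_two = sales.iloc[-2:]

print(units_jan2)
print(last_two['Units'].tolist())
```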
Both loc and iloc are useful tools for selecting specific data in a Pandas DataFrame. The key difference is whether you select by label (loc) or by integer position (iloc). Knowing which one to use, and when, helps you avoid subtle selection errors and manipulate data efficiently.