Get all rows in a Pandas DataFrame containing given substring (original) (raw)
Last Updated : 24 Dec, 2018
Let’s see how to get all rows in a Pandas DataFrame containing given substring with the help of different examples.
Code #1: Check the values PG in column Position
import
pandas as pd
df
=
pd.DataFrame({
'Name'
: [
'Geeks'
,
'Peter'
,
'James'
,
'Jack'
,
'Lisa'
],
`` 'Team'
: [
'Boston'
,
'Boston'
,
'Boston'
,
'Chele'
,
'Barse'
],
`` 'Position'
: [
'PG'
,
'PG'
,
'UG'
,
'PG'
,
'UG'
],
`` 'Number'
: [
3
,
4
,
7
,
11
,
5
],
`` 'Age'
: [
33
,
25
,
34
,
35
,
28
],
`` 'Height'
: [
'6-2'
,
'6-4'
,
'5-9'
,
'6-1'
,
'5-8'
],
`` 'Weight'
: [
89
,
79
,
113
,
78
,
84
],
`` 'College'
: [
'MIT'
,
'MIT'
,
'MIT'
,
'Stanford'
,
'Stanford'
],
`` 'Salary'
: [
99999
,
99994
,
89999
,
78889
,
87779
]},
`` index
=
[
'ind1'
,
'ind2'
,
'ind3'
,
'ind4'
,
'ind5'
])
print
(df,
"\n"
)
print
(
"Check PG values in Position column:\n"
)
df1
=
df[
'Position'
].
str
.contains(
"PG"
)
print
(df1)
Output:
But this result doesn’t seem very helpful, as it returns the bool values with the index. Let’s see if we can do something better.
Code #2: Getting the rows satisfying condition
import
pandas as pd
df
=
pd.DataFrame({
'Name'
: [
'Geeks'
,
'Peter'
,
'James'
,
'Jack'
,
'Lisa'
],
`` 'Team'
: [
'Boston'
,
'Boston'
,
'Boston'
,
'Chele'
,
'Barse'
],
`` 'Position'
: [
'PG'
,
'PG'
,
'UG'
,
'PG'
,
'UG'
],
`` 'Number'
: [
3
,
4
,
7
,
11
,
5
],
`` 'Age'
: [
33
,
25
,
34
,
35
,
28
],
`` 'Height'
: [
'6-2'
,
'6-4'
,
'5-9'
,
'6-1'
,
'5-8'
],
`` 'Weight'
: [
89
,
79
,
113
,
78
,
84
],
`` 'College'
: [
'MIT'
,
'MIT'
,
'MIT'
,
'Stanford'
,
'Stanford'
],
`` 'Salary'
: [
99999
,
99994
,
89999
,
78889
,
87779
]},
`` index
=
[
'ind1'
,
'ind2'
,
'ind3'
,
'ind4'
,
'ind5'
])
df1
=
df[df[
'Position'
].
str
.contains(
"PG"
)]
print
(df1)
Output:
Code #3: Filter all rows where either Team contains ‘Boston’ or College contains ‘MIT’.
import
pandas as pd
df
=
pd.DataFrame({
'Name'
: [
'Geeks'
,
'Peter'
,
'James'
,
'Jack'
,
'Lisa'
],
`` 'Team'
: [
'Boston'
,
'Boston'
,
'Boston'
,
'Chele'
,
'Barse'
],
`` 'Position'
: [
'PG'
,
'PG'
,
'UG'
,
'PG'
,
'UG'
],
`` 'Number'
: [
3
,
4
,
7
,
11
,
5
],
`` 'Age'
: [
33
,
25
,
34
,
35
,
28
],
`` 'Height'
: [
'6-2'
,
'6-4'
,
'5-9'
,
'6-1'
,
'5-8'
],
`` 'Weight'
: [
89
,
79
,
113
,
78
,
84
],
`` 'College'
: [
'MIT'
,
'MIT'
,
'MIT'
,
'Stanford'
,
'Stanford'
],
`` 'Salary'
: [
99999
,
99994
,
89999
,
78889
,
87779
]},
`` index
=
[
'ind1'
,
'ind2'
,
'ind3'
,
'ind4'
,
'ind5'
])
df1
=
df[df[
'Team'
].
str
.contains(
"Boston"
) | df[
'College'
].
str
.contains(
'MIT'
)]
print
(df1)
Output:
Code #4: Filter rows checking Team name contains ‘Boston and Position must be PG.
Output:
Code #5: Filter rows checking Position contains PG and College must contains like UC.
Output: