The Most Pythonic Way to Check if a Python String Contains Another String? (Tutorial + Video) – Be on the Right Side of Change

How to check if a Python string s1 contains another string s2? There are two easy ways to check whether string s1 contains string s2:

  1. Use the expression s2 in s1, or
  2. Use the find method s1.find(s2).


Puzzle: Can you modify method 2, s1.find(s2), so that it returns a Boolean value like method 1 does?

A third, more powerful method is to use Python regular expressions. A fourth method is to implement an efficient string-search algorithm such as Rabin-Karp yourself.

However, implementing your own search algorithm only makes sense if you want to improve your code understanding skills rather than quickly solve the string containment problem in your own project. 😉

Use the Python Keyword “in”

This method may come most naturally to you.

s1 = "Ronaldo is better than Messi"

print("Ronaldo" in s1)

True

print("Football" in s1)

False

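You can use the in keyword to check membership in other Python containers as well. Note the difference, though: for strings it checks for a substring, for lists and sets it checks whether an element is present, and for dictionaries it checks the keys only.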
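Here is a small sketch of how in behaves on different container types. The variable names (players, clubs, goals) are made up for illustration:

```python
players = ["Ronaldo", "Messi"]          # list: element membership
clubs = {"Juventus", "Barcelona"}       # set: element membership
goals = {"Ronaldo": 2, "Messi": 3}      # dict: membership is checked against keys

print("Messi" in players)      # True
print("Mess" in players)       # False: no list element equals "Mess"
print("Mess" in "Messi")       # True: substring check on a string
print("Barcelona" in clubs)    # True
print("Ronaldo" in goals)      # True: "Ronaldo" is a key
print(3 in goals)              # False: values are not searched
```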

Use the Python String find() Method

This method is slightly more powerful because it returns the index of the first occurrence of the searched substring rather than just a Boolean value.

s1 = "Ronaldo is better than Messi"

print(s1.find("Ronaldo"))

0

print(s1.find("Football"))

-1

print(s1.find("Messi"))

23

If the substring does not exist, the find() method returns -1. You have to handle this return value explicitly, because -1 is also a valid (negative) index in Python and counts as true in a Boolean context.
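One possible way to handle the return value, which also answers the puzzle from the beginning of this article, is to compare it against -1. The helper name contains is hypothetical and only used here for illustration:

```python
def contains(s1, s2):
    """Return True if s2 occurs in s1, using find() instead of `in`."""
    # find() returns -1 when s2 is absent, so compare against -1 explicitly.
    return s1.find(s2) != -1


s1 = "Ronaldo is better than Messi"

print(contains(s1, "Ronaldo"))   # True
print(contains(s1, "Football"))  # False

# Pitfall: using the raw index as a truth value gives the wrong answer.
# A match at index 0 is falsy, and the "not found" value -1 is truthy.
print(bool(s1.find("Ronaldo")))   # False, although "Ronaldo" is in s1
print(bool(s1.find("Football")))  # True, although "Football" is not in s1
```

If you only need a yes/no answer, s2 in s1 remains the simpler choice.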

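You can also pass optional start and end indices to limit the search to a slice of the string. The end index is exclusive, and an end index beyond the length of the string (such as 100 below) is simply clamped to the end: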

s1 = "Ronaldo is better than Messi"

print(s1.find("Ronaldo", 5))

-1

print(s1.find("Football"))

-1

print(s1.find("Messi", 5, 100))

23

To summarize, you can use the Python built-in functionality s2 in s1 and s1.find(s2) to check whether string s1 contains string s2.

Most Powerful Way with Regular Expressions

Regular expressions are a powerful way to search for patterns in strings. This is exactly what we want to do here. Our pattern is as simple as it can get: a plain string.

Here’s an example:

import re

s1 = "Ronaldo is better than Messi. Ronaldo really is better."

print(re.findall("Ronaldo", s1))

['Ronaldo', 'Ronaldo']

print(re.findall("Football", s1))

[]

print(re.findall("Messi", s1))

['Messi']

We use Python’s regular expression library re. As you can see, the findall() function finds all occurrences of the string (not only the first one). If you are interested in the concrete indices, you can use the search() function, which returns a match object with the start and end indices of the match:

import re

s1 = "Ronaldo is better than Messi. Ronaldo really is better."

print(re.search("Ronaldo", s1))

<re.Match object; span=(0, 7), match='Ronaldo'>

As you see, it only returns the first match of the 'Ronaldo' substring.
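If you need the indices of all matches, one option is re.finditer(). The sketch below also shows two details that matter when you use regular expressions for plain containment checks: search() returns None when nothing matches, and characters such as "." have a special meaning unless you escape them with re.escape(). The test string "Messi!" is just an illustrative example:

```python
import re

s1 = "Ronaldo is better than Messi. Ronaldo really is better."

# finditer() yields one match object per occurrence, each with its span.
spans = [m.span() for m in re.finditer("Ronaldo", s1)]
print(spans)  # [(0, 7), (30, 37)]

# search() returns None when there is no match, so bool() gives a containment test.
print(bool(re.search("Football", s1)))  # False

# In a pattern, "." matches any character. re.escape() makes it literal.
print(bool(re.search("Messi.", "Messi!")))             # True
print(bool(re.search(re.escape("Messi."), "Messi!")))  # False
```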

Related article: Python Regex Superpower – The Ultimate Guide

More Advanced Algorithms

There’s a large body of literature concerning the efficient search of strings. If you want to get an algorithmic overview, check out this excellent article about string-searching algorithms.

The naive string search algorithm simply iterates over all indices of the string s1. It then tries to match all characters of string s2. If it fails, it proceeds with the next index. However, this algorithm has O(len(s1) * len(s2)) worst-case time complexity.
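The naive approach described above could be sketched as follows. The function name naive_search is an assumption for illustration and does not come from the referenced article:

```python
def naive_search(s1, s2):
    """Return the first index at which s2 occurs in s1, or -1 if it does not."""
    n, m = len(s1), len(s2)
    for i in range(n - m + 1):          # every possible start index in s1
        j = 0
        while j < m and s1[i + j] == s2[j]:
            j += 1                      # characters match so far
        if j == m:                      # all m characters matched
            return i
    return -1


s1 = "Ronaldo is better than Messi"
print(naive_search(s1, "Ronaldo"))   # 0
print(naive_search(s1, "Football"))  # -1
print(naive_search(s1, "Messi"))     # 23
```

The two nested loops are where the O(len(s1) * len(s2)) worst case comes from.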

The Rabin-Karp algorithm is more efficient on average. Instead of comparing the substring starting at index i with the pattern character by character, it compares their hash values and only falls back to a full string comparison when the hashes are equal. This extra check is necessary because two different strings may result in the same hash value. The full algorithm gets its speed from a rolling hash that updates the hash of the next substring in constant time. The simplified version below, which I created based on the pseudo-code given in the article, uses slicing and Python's built-in hash() function instead, so it recomputes each hash from scratch. It shows the idea but does not improve the worst-case complexity:

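def RabinKarp(string, pattern):
    n, m = len(string), len(pattern)
    hpattern = hash(pattern)
    for i in range(n-m+1):
        # Hash the substring of length m that starts at index i.
        hs = hash(string[i:i+m])
        if hs == hpattern:
            # Equal hashes may be a collision, so compare the strings too.
            if string[i:i+m] == pattern:
                return i
    return -1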

s1 = "Ronaldo is better than Messi"

print(RabinKarp(s1, "Ronaldo"))

0

print(RabinKarp(s1, "Football"))

-1

print(RabinKarp(s1, "Messi"))

23
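The function above hashes every slice from scratch. For comparison, here is a minimal sketch of the same idea with a rolling polynomial hash, which updates the hash in constant time when the window moves one character to the right. The function name rabin_karp_rolling and the parameters base and mod are my own illustrative choices, not values taken from the article:

```python
def rabin_karp_rolling(string, pattern, base=256, mod=1_000_000_007):
    """Return the first index of pattern in string, or -1, using a rolling hash."""
    n, m = len(string), len(pattern)
    if m > n:
        return -1
    if m == 0:
        return 0

    # Weight of the leading character in a window of length m.
    high = pow(base, m - 1, mod)

    # Hash the pattern and the first window of the string.
    hp = hs = 0
    for k in range(m):
        hp = (hp * base + ord(pattern[k])) % mod
        hs = (hs * base + ord(string[k])) % mod

    for i in range(n - m + 1):
        # Equal hashes may be a collision, so confirm with a real comparison.
        if hs == hp and string[i:i + m] == pattern:
            return i
        if i < n - m:
            # Slide the window: drop string[i], append string[i + m].
            hs = ((hs - ord(string[i]) * high) * base + ord(string[i + m])) % mod
    return -1


s1 = "Ronaldo is better than Messi"
print(rabin_karp_rolling(s1, "Ronaldo"))   # 0
print(rabin_karp_rolling(s1, "Football"))  # -1
print(rabin_karp_rolling(s1, "Messi"))     # 23
```

In practice, Python's built-in in and find() are implemented in C and will usually be faster than either version, so treat this as a learning exercise.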


Take your time to study this article carefully. It will boost your general code understanding.