Python Extract K length substrings
Last Updated : 15 Jan, 2025
The task is to extract every contiguous substring of a specific length k from a given string. The substrings may overlap, so a string of length n yields n - k + 1 of them when 1 <= k <= n. Let's explore various methods to do this in Python.
Using List Comprehension
A list comprehension is a concise and typically efficient way to extract substrings. It iterates over the valid start indices and collects all substrings of length k in a single expression.
Python `
s = "geeksforgeeks"
k = 4

# Extracting k-length substrings using list comprehension
sub = [s[i:i+k] for i in range(len(s) - k + 1)]
print(sub)
`
Output
['geek', 'eeks', 'eksf', 'ksfo', 'sfor', 'forg', 'orge', 'rgee', 'geek', 'eeks']
Explanation:
- The list comprehension iterates over each start index i and takes the slice s[i:i+k], which is a substring of length k.
- The range stops at len(s) - k, the last start index where a full k-length slice still fits inside the string.
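The range bound matters most at the edges. As a minimal sketch, the comprehension can be wrapped in a helper to show what happens when k equals or exceeds the string length. The name k_substrings and the choice to raise an error for non-positive k are assumptions made for illustration, not part of the original method.

```python
def k_substrings(s, k):
    """Return every contiguous substring of s with length k (hypothetical helper)."""
    if k <= 0:
        # Assumption: non-positive lengths are treated as a caller error.
        raise ValueError("k must be a positive integer")
    # range() is empty when k > len(s), so the result is simply [].
    return [s[i:i + k] for i in range(len(s) - k + 1)]


print(k_substrings("geeks", 3))  # ['gee', 'eek', 'eks']
print(k_substrings("geeks", 5))  # ['geeks']
print(k_substrings("geeks", 9))  # []
```

Returning an empty list for an oversized k keeps callers free of special cases; raising instead would be an equally reasonable design.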
Let's explore some more methods and see how we can extract K length substrings from a given string.
Using a for loop
By using a simple for loop, we can extract the substrings by iterating through the string and slicing it at each step. This method is very intuitive.
Python `
s = "geeksforgeeks"
k = 4

# Initialize an empty list to store substrings
sub = []

# Loop through the string and extract k-length substrings
for i in range(len(s) - k + 1):
    sub.append(s[i:i+k])

print(sub)
`
Output
['geek', 'eeks', 'eksf', 'ksfo', 'sfor', 'forg', 'orge', 'rgee', 'geek', 'eeks']
Explanation:
- A for loop iterates through the string, slicing substrings of length k.
- Each substring is appended to the sub list.
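When the substrings are consumed one at a time, the same loop can be written as a generator so that no full list is built. This is a sketch only; the name iter_k_substrings is hypothetical and not part of the original.

```python
def iter_k_substrings(s, k):
    """Yield k-length substrings of s one at a time (hypothetical helper)."""
    for i in range(len(s) - k + 1):
        yield s[i:i + k]


# Stop early at the first substring that starts with "f".
for piece in iter_k_substrings("geeksforgeeks", 4):
    if piece.startswith("f"):
        print(piece)  # forg
        break
```

Because the generator is lazy, the slices after "forg" are never created in this example.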
Using zip()
The zip() function can also be used for this task. Passing it k copies of the string, each shifted one character further, makes it yield the characters of each k-length window together.
Python `
s = "geeksforgeeks"
k = 4

# Use zip over k shifted slices to extract k-length substrings
substrings = [''.join(chars) for chars in zip(*(s[i:] for i in range(k)))]
print(substrings)
`
Output
['geek', 'eeks', 'eksf', 'ksfo', 'sfor', 'forg', 'orge', 'rgee', 'geek', 'eeks']
Explanation:
- The slices s[0:], s[1:], ..., s[k-1:] are passed to zip(), which yields one tuple of k consecutive characters per start position and stops when the shortest slice is exhausted.
- A list comprehension joins each tuple into a substring.
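To see how zip() can produce the windows, it helps to look at the shifted copies of the string it is given. The sketch below uses a shorter string, and the variable names shifted and windows are illustrative only.

```python
s = "geeks"
k = 3

# k copies of the string, each starting one character later.
shifted = [s[i:] for i in range(k)]
print(shifted)  # ['geeks', 'eeks', 'eks']

# zip() reads one character from each copy per step and stops
# when the shortest copy runs out.
windows = list(zip(*shifted))
print(windows)  # [('g', 'e', 'e'), ('e', 'e', 'k'), ('e', 'k', 's')]

# Joining each tuple gives the k-length substrings.
joined = [''.join(w) for w in windows]
print(joined)  # ['gee', 'eek', 'eks']
```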
Using re.findall()
The re.findall() function can be used to extract substrings based on regular expressions. Wrapping the pattern in a lookahead lets the matches overlap.
Python `
import re

s = "geeksforgeeks"
k = 4

# Use re.findall with a lookahead to extract overlapping k-length substrings
subs = re.findall(rf'(?=(\w{{{k}}}))', s)
print(subs)
`
Output
['geek', 'eeks', 'eksf', 'ksfo', 'sfor', 'forg', 'orge', 'rgee', 'geek', 'eeks']
Explanation:
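- The regular expression (?=(\w{4})) matches overlapping substrings of length 4. The lookahead (?=...) consumes no characters, so the next match can start one position later.
- \w matches only letters, digits, and underscores, so windows containing any other character are skipped.
- This method uses regular expressions, which may be overkill for simple tasks.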
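The \w character class is easy to overlook when the input contains spaces or punctuation. The comparison below is a sketch: replacing \w with '.' under re.DOTALL is an assumption about what "any character" should mean here, not something the original method specifies.

```python
import re

s = "ab cd"
k = 2

# \w skips every window that contains the space.
word_only = re.findall(rf'(?=(\w{{{k}}}))', s)
print(word_only)  # ['ab', 'cd']

# Assumption: '.' with re.DOTALL accepts any character, including newlines.
any_char = re.findall(rf'(?=(.{{{k}}}))', s, flags=re.DOTALL)
print(any_char)  # ['ab', 'b ', ' c', 'cd']

# Plain slicing gives the same windows without a regex.
sliced = [s[i:i + k] for i in range(len(s) - k + 1)]
print(sliced)  # ['ab', 'b ', ' c', 'cd']
```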