Python Extract K length substrings

Last Updated : 15 Jan, 2025

The task is to extract all possible substrings of a specific length, k, from a given string. A string of length n contains n - k + 1 such substrings (one for each valid start position), and they can overlap. Let's explore various methods to extract substrings of length k from a given string in Python.

Using List Comprehension

A list comprehension is a concise way to extract the substrings. It iterates over every valid start index and collects the slice of length k that begins there, all in a single expression.

```python
s = "geeksforgeeks"
k = 4

# Extracting k-length substrings using list comprehension
sub = [s[i:i+k] for i in range(len(s) - k + 1)]
print(sub)
```

Output

['geek', 'eeks', 'eksf', 'ksfo', 'sfor', 'forg', 'orge', 'rgee', 'geek', 'eeks']

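Explanation: range(len(s) - k + 1) produces every start index i at which a full k-character slice still fits, and s[i:i+k] extracts that slice. Because the windows overlap, repeated substrings such as 'geek' and 'eeks' appear more than once in the result.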
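For reuse, the same comprehension can be wrapped in a small function. This is only a sketch: the name k_substrings and the decision to raise ValueError for a non-positive k are assumptions made for illustration, not part of the example above.

```python
def k_substrings(s, k):
    """Return every contiguous substring of s that has length k."""
    if k <= 0:
        # Assumption: a non-positive length is treated as a caller error.
        raise ValueError("k must be a positive integer")
    # range() is empty when k > len(s), so the result is then an empty list.
    return [s[i:i + k] for i in range(len(s) - k + 1)]


print(k_substrings("geeksforgeeks", 4))
print(k_substrings("abc", 5))  # k is longer than the string, prints []
```

When k is larger than the string, no window fits, so the function returns an empty list rather than failing.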

Let's explore some more methods and see how we can extract K length substrings from a given string.


Using a for loop

By using a simple for loop, we can extract the substrings by iterating over the start indices and slicing the string at each step. This spells out the same logic as the list comprehension one statement at a time, which makes it easy to follow.

```python
s = "geeksforgeeks"
k = 4

# Initialize an empty list to store substrings
sub = []

# Loop through the string and extract k-length substrings
for i in range(len(s) - k + 1):
    sub.append(s[i:i+k])

print(sub)
```

Output

['geek', 'eeks', 'eksf', 'ksfo', 'sfor', 'forg', 'orge', 'rgee', 'geek', 'eeks']

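Explanation: The loop visits each start index from 0 to len(s) - k and appends the slice s[i:i+k] to sub. The list is printed once, after the loop has finished.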
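If the substrings are only needed one at a time, the loop can also be written as a generator so that the full list is never built. The sketch below assumes k is a positive integer; the function name iter_k_substrings is hypothetical.

```python
def iter_k_substrings(s, k):
    """Yield each k-length substring of s, one at a time (assumes k >= 1)."""
    for i in range(len(s) - k + 1):
        yield s[i:i + k]


# Consume the substrings lazily instead of storing them in a list.
for part in iter_k_substrings("geeks", 3):
    print(part)
```

Wrapping the call in list() gives the same result as the loop that appends to a list.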

Using zip()

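The zip() function can also be used for this task. Zipping k shifted copies of the string (s[0:], s[1:], ..., s[k-1:]) groups the characters into k-length tuples, which are then joined back into substrings.

```python
s = "geeksforgeeks"
k = 4

# Use zip to extract k-length substrings:
# s[0:], s[1:], ..., s[k-1:] are shifted copies of s, and zip()
# lines up one character from each copy to form every window
substrings = [''.join(chars) for chars in zip(*(s[i:] for i in range(k)))]
print(substrings)
```

Output

['geek', 'eeks', 'eksf', 'ksfo', 'sfor', 'forg', 'orge', 'rgee', 'geek', 'eeks']

Explanation: zip() stops at the shortest shifted copy, s[k-1:], so it yields exactly len(s) - k + 1 tuples, one per valid window. ''.join() turns each tuple of characters back into a string.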
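To make the intermediate steps of a zip-based approach visible, the sketch below prints each stage for a short string. The input "abcde", k = 3, and the variable names shifted, windows, and result are illustrative choices.

```python
s = "abcde"
k = 3

# One shifted copy of the string per offset 0..k-1.
shifted = [s[i:] for i in range(k)]
print(shifted)  # ['abcde', 'bcde', 'cde']

# zip() pairs characters position by position and stops at the shortest copy.
windows = list(zip(*shifted))
print(windows)  # [('a', 'b', 'c'), ('b', 'c', 'd'), ('c', 'd', 'e')]

# Joining each tuple gives the k-length substrings.
result = ["".join(w) for w in windows]
print(result)  # ['abc', 'bcd', 'cde']
```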

Using re.findall()

The re.findall() function can extract the substrings with a regular expression. The pattern is wrapped in a lookahead so that overlapping matches are returned.

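```python
import re

s = "geeksforgeeks"
k = 4

# Use re.findall to extract k-length substrings.
# The pattern is built from k instead of hard-coding the length;
# doubled braces in the f-string produce the literal {4} quantifier.
subs = re.findall(rf'(?=(\w{{{k}}}))', s)
print(subs)
```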

Output

['geek', 'eeks', 'eksf', 'ksfo', 'sfor', 'forg', 'orge', 'rgee', 'geek', 'eeks']

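Explanation: The lookahead (?=...) matches without consuming any characters, so the search moves forward one position at a time and the capturing group inside it returns the substring starting at each position. Note that \w matches only letters, digits, and underscores, so any window containing another character (such as a space) is skipped.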
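The effect of \w is easiest to see on a string that contains a space. The sketch below compares it with a pattern that uses . together with re.DOTALL, which accepts any character. The input "ab cd" and the variable names word_only and any_char are illustrative.

```python
import re

s = "ab cd"
k = 2

# \w matches only letters, digits, and underscore,
# so windows that contain the space are skipped.
word_only = re.findall(rf"(?=(\w{{{k}}}))", s)
print(word_only)  # ['ab', 'cd']

# '.' with re.DOTALL matches any character, including spaces and newlines,
# so every window is returned, as with the slicing-based methods.
any_char = re.findall(rf"(?=(.{{{k}}}))", s, flags=re.DOTALL)
print(any_char)  # ['ab', 'b ', ' c', 'cd']
```

Which pattern is appropriate depends on whether substrings that span non-word characters should count.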