Remove duplicate words from Sentence using Regular Expression

Last Updated : 12 Jul, 2025

Given a string **str** that represents a sentence, the task is to remove consecutive duplicate words from the sentence using a regular expression, in programming languages such as C++, Java, C#, Python, and JavaScript.

**Examples of Removing Duplicate Words from Sentences**

**Input:** str = "Good bye bye world world"
**Output:** Good bye world
**Explanation:** We remove the second occurrence of _bye_ and _world_ from "Good bye bye world world".

**Input:** str = "Ram went went to to to his home"
**Output:** Ram went to his home
**Explanation:** We remove the second occurrence of _went_ and the second and third occurrences of _to_ from "Ram went went to to to his home".

**Input:** str = "Hello hello world world"
**Output:** Hello world
**Explanation:** Matching is case-insensitive, so we remove _hello_ (a repeat of _Hello_) and the second occurrence of _world_ from "Hello hello world world".

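**Approach**

1. Get the sentence.
2. Form a regular expression that matches a word followed by one or more repetitions of itself:

   regex = "\b(\w+)(?:\W+\1\b)+";

   The details of the above regular expression can be understood as:
   - \b is a word boundary, so a match can only begin at the start of a word.
   - (\w+) captures one or more word characters (a word) as group 1.
   - (?:\W+\1\b)+ is a non-capturing group that matches one or more non-word characters (such as spaces), then the same word again through the backreference \1, ending at a word boundary. The trailing + lets the word repeat any number of times.
   - The pattern is compiled with the case-insensitive flag, so "Hello hello" counts as a repeat.
3. Match the sentence against the regex and replace each match with the captured word (group 1). In Java, the matcher is obtained using Pattern.matcher().
4. Return the modified sentence.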
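To see what the regular expression in step 2 actually matches, here is a minimal Python sketch. It assumes the same pattern compiled case-insensitively; the helper name `find_repeats` is hypothetical and used only for illustration. It lists each matched run of words together with the word captured in group 1, which is the text that replaces the run.

```python
import re

# Same pattern as in step 2; the raw string keeps the backslashes intact.
PATTERN = re.compile(r"\b(\w+)(?:\W+\1\b)+", re.IGNORECASE)


def find_repeats(sentence):
    """Return (matched_text, kept_word) for every run of repeated words.

    Hypothetical helper for illustration only.
    """
    return [(m.group(0), m.group(1)) for m in PATTERN.finditer(sentence)]


# Print each run of repeated words and the single word that replaces it.
for text, word in find_repeats("Ram went went to to to his home"):
    print(f"{text!r} -> {word!r}")
```

Each tuple pairs the whole match (group 0) with the first word of the run (group 1), so substituting group 1 for group 0 yields the deduplicated sentence.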

**Below is the implementation of the above approach:**

C++

```cpp
// C++ program to remove duplicate words
// using a regular expression (regex).
#include <iostream>
#include <regex>
#include <string>
using namespace std;

// Function to remove consecutive duplicate words
string removeDuplicateWords(const string& s)
{
    // Regex matching repeated words, ignoring case.
    // A raw string literal keeps the backslashes intact.
    const regex pattern(R"(\b(\w+)(?:\W+\1\b)+)",
                        regex_constants::icase);

    // Replace each run of repeated words with its first word.
    return regex_replace(s, pattern, "$1");
}

// Driver Code
int main()
{
    // Test Case: 1
    string str1 = "Good bye bye world world";
    cout << removeDuplicateWords(str1) << endl;

    // Test Case: 2
    string str2 = "Ram went went to to his home";
    cout << removeDuplicateWords(str2) << endl;

    // Test Case: 3
    string str3 = "Hello hello world world";
    cout << removeDuplicateWords(str3) << endl;

    return 0;
}

// This code is contributed by yuvraj_chandra
```

Java

```java
// Java program to remove duplicate words
// using a regular expression (regex).
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Driver Class
class GFG {
    // Function to remove consecutive duplicate words
    public static String removeDuplicateWords(String input)
    {
        // Regex matching repeated words. Backslashes are
        // doubled because this is a Java string literal.
        String regex = "\\b(\\w+)(?:\\W+\\1\\b)+";
        Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);

        // Pattern class contains matcher() method
        // to find matches between the given sentence
        // and the regular expression.
        Matcher m = p.matcher(input);

        // Replace each run of repeated words with
        // the captured first word (group 1).
        return m.replaceAll("$1");
    }

    // Driver code
    public static void main(String args[])
    {
        // Test Case: 1
        String str1 = "Good bye bye world world";
        System.out.println(removeDuplicateWords(str1));

        // Test Case: 2
        String str2 = "Ram went went to to his home";
        System.out.println(removeDuplicateWords(str2));

        // Test Case: 3
        String str3 = "Hello hello world world";
        System.out.println(removeDuplicateWords(str3));
    }
}
```

Python3

```python
# Python program to remove duplicate words
# using a regular expression (regex).
import re


# Function to remove consecutive duplicate words
def removeDuplicateWords(sentence):

    # Regex matching repeated words
    regex = r'\b(\w+)(?:\W+\1\b)+'

    # Replace each run of repeated words with its first word
    return re.sub(regex, r'\1', sentence, flags=re.IGNORECASE)


# Driver Code

# Test Case: 1
str1 = "Good bye bye world world"
print(removeDuplicateWords(str1))

# Test Case: 2
str2 = "Ram went went to to his home"
print(removeDuplicateWords(str2))

# Test Case: 3
str3 = "Hello hello world world"
print(removeDuplicateWords(str3))

# This code is contributed by yuvraj_chandra
```

C#

```csharp
// C# program to remove duplicate words
// using a regular expression (regex).
using System;
using System.Text.RegularExpressions;

class Program {
    // Function to remove consecutive duplicate words
    static string RemoveDuplicateWords(string s)
    {
        // Regex matching repeated words, ignoring case.
        Regex pattern = new Regex(@"\b(\w+)(?:\W+\1\b)+",
                                  RegexOptions.IgnoreCase);

        // Replace each run of repeated words with its first word.
        return pattern.Replace(s, "$1");
    }

    // Driver Code
    static void Main()
    {
        // Test Case: 1
        string str1 = "Good bye bye world world";
        Console.WriteLine(RemoveDuplicateWords(str1));

        // Test Case: 2
        string str2 = "Ram went went to to his home";
        Console.WriteLine(RemoveDuplicateWords(str2));

        // Test Case: 3
        string str3 = "Hello hello world world";
        Console.WriteLine(RemoveDuplicateWords(str3));
    }
}
```

JavaScript

```javascript
// Function to remove duplicate words using a regular expression
function removeDuplicateWords(input) {
    // Regular expression to match repeated words
    // (g: replace every match, i: ignore case)
    let regex = /\b(\w+)(?:\W+\1\b)+/gi;

    // Replace each run of repeated words with its first word
    return input.replace(regex, '$1');
}

// Test cases
// Test Case: 1
let str1 = "Good bye bye world world";
console.log(removeDuplicateWords(str1));

// Test Case: 2
let str2 = "Ram went went to to his home";
console.log(removeDuplicateWords(str2));

// Test Case: 3
let str3 = "Hello hello world world";
console.log(removeDuplicateWords(str3));
```

Output

Good bye world
Ram went to his home
Hello world

Complexity of the above Programs

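**Time Complexity:** O(n) on typical sentences, where n is the length of the string. Backtracking regex engines do not guarantee linear time for patterns that use backreferences.
**Auxiliary Space:** O(n), since the result is built as a new string.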
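
The pattern only collapses words that repeat back to back, and it replaces the whole matched run, including any separators inside it. The following Python sketch checks a few such inputs. It assumes the same pattern and case-insensitive flag as the programs above; the helper name `remove_adjacent_repeats` is hypothetical.

```python
import re

# Same pattern and flag as in the programs above.
PATTERN = re.compile(r"\b(\w+)(?:\W+\1\b)+", re.IGNORECASE)


def remove_adjacent_repeats(sentence):
    """Collapse each run of adjacent repeated words to its first word.

    Hypothetical helper for illustration only.
    """
    return PATTERN.sub(r"\1", sentence)


# A repeat separated by another word is not adjacent, so it is kept.
print(remove_adjacent_repeats("bye world bye"))

# Punctuation between the repeats is part of the match and is dropped.
print(remove_adjacent_repeats("very, very good"))

# The word boundary \b stops "the" from matching the start of "then".
print(remove_adjacent_repeats("the then"))
```

If a sentence needs every later occurrence of a word removed, not just adjacent ones, this regular expression alone is not enough and a different technique is required.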