How to split a string in C/C++, Python and Java?

Splitting a string by some delimiter is a very common task. For example, we have a comma-separated list of items from a file and we want the individual items in an array.
Almost all programming languages provide a function to split a string by some delimiter.
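As a first taste of the comma-separated case mentioned above, here is a minimal sketch in Python, one of the languages this article covers. The sample line is a made-up value standing in for a line read from a file.

```python
# Hypothetical line, as it might be read from a comma-separated file.
line = "apple,banana,cherry"

# Split on the comma to get the individual items as a list.
items = line.split(",")

print(items)
```

Each language below offers its own way of doing the same thing.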

**In C:**

// Splits str[] according to the given delimiters
// and returns the next token. It needs to be called
// in a loop to get all tokens. It returns NULL
// when there are no more tokens.
char *strtok(char str[], const char *delims);

C `

// A C/C++ program for splitting a string
// using strtok()
#include <stdio.h>
#include <string.h>

int main()
{
    char str[] = "Geeks-for-Geeks";

    // Returns the first token
    char *token = strtok(str, "-");

    // Keep printing tokens until strtok() returns NULL,
    // i.e. until no tokens remain in str[].
    while (token != NULL)
    {
        printf("%s\n", token);
        token = strtok(NULL, "-");
    }

    return 0;
}

`

**Output:**

Geeks
for
Geeks

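**Time complexity:** O(n)

**Auxiliary Space:** O(1), since strtok() tokenizes the string in place by overwriting delimiters with '\0'

Where n is the length of the input string.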

**In C++:**

Note: The main disadvantage of strtok() is that it only works on C-style strings,
so a C++ string has to be explicitly converted into a char array first.
Many programmers are unaware that C++ has two additional APIs which are more elegant
and work directly with C++ strings.

**Method 1: Using stringstream API of C++**

**Prerequisite:** stringstream API

A stringstream object can be initialized from a string object. Reading from it with operator >> then **tokenizes the string on whitespace**, so, just like the "cin" stream, a stringstream lets you read a string as a stream of words. Alternatively, we can use the getline function to tokenize a string on **any single-character delimiter**.

Some of the most commonly used functions of stringstream:
clear(): resets the stream's error state flags (it does not erase the contents).
str(): gets or sets the contents of the stream as a C++ string object.
operator <<: pushes a string object into the stream.
operator >>: extracts a word from the stream.

The code below demonstrates it.

C++ `

#include <iostream>
#include <sstream>
#include <string>
using namespace std;

// A quick way to split strings separated via spaces.
void simple_tokenizer(string s)
{
    stringstream ss(s);
    string word;
    while (ss >> word) {
        cout << word << endl;
    }
}

// A quick way to split strings separated via any character
// delimiter.
void adv_tokenizer(string s, char del)
{
    stringstream ss(s);
    string word;
    // Loop on getline() itself rather than on eof(), so that
    // nothing is printed after a failed read.
    while (getline(ss, word, del)) {
        cout << word << endl;
    }
}

int main(int argc, char const* argv[])
{
    string a = "How do you do!";
    string b = "How$do$you$do!";
    // Takes only space separated C++ strings.
    simple_tokenizer(a);
    cout << endl;
    adv_tokenizer(b, '$');
    cout << endl;
    return 0;
}

`

**Output:**

How
do
you
do!

How
do
you
do!

**Time Complexity:** O(n)

**Auxiliary Space:** O(n)

Where n is the length of the input string.

**Method 2: Using C++ find() and substr() APIs.**

**Prerequisite:** find() function and substr().

This method is **more robust** and can parse a string with any delimiter, including delimiters longer than one character, not just spaces (though the default in the code below is to split on a single space). The logic is simple to follow from the code below.

C++ `

#include <iostream>
#include <string>
using namespace std;

void tokenize(const string& s, const string& del = " ")
{
    size_t start = 0;
    size_t end = s.find(del);
    // find() returns string::npos when no further delimiter exists.
    while (end != string::npos) {
        cout << s.substr(start, end - start) << endl;
        start = end + del.size();
        end = s.find(del, start);
    }
    // Print whatever follows the last delimiter.
    cout << s.substr(start) << endl;
}

int main(int argc, char const* argv[])
{
    // Takes C++ string with any separator
    string a = "How$%do$%you$%do$%!";
    tokenize(a, "$%");
    cout << endl;

    return 0;
}

`

**Output:**

How
do
you
do
!

**Time Complexity:** O(n)

**Auxiliary Space:** O(1)

Where n is the length of the input string.

**Method 3: Using temporary string**

If the delimiter is known to be a single character, you can simply build each token in a temporary string while scanning the input once. This avoids the overhead of the find() and substr() calls used in Method 2.

C++ `

#include <iostream>
#include <string>
using namespace std;

void split(string str, char del)
{
    // Temp string that stores the current "word" up to del
    string temp = "";

    for (int i = 0; i < (int)str.size(); i++) {
        // If the current char is not del, append it to the current
        // "word"; otherwise the word is complete, so print it and
        // start a new word.
        if (str[i] != del) {
            temp += str[i];
        }
        else {
            cout << temp << " ";
            temp = "";
        }
    }

    // Print the last word
    cout << temp;
}

int main()
{
    string str = "geeks_for_geeks"; // string to be split
    char del = '_'; // delimiter around which string is to be split

    split(str, del);

    return 0;
}

`

**Time complexity:** O(n)

**Auxiliary Space:** O(n)

**In Java:**
In Java, split() is a method of the String class.

// regexp is the delimiting regular expression;
// limit (when positive) is the maximum number of returned strings
public String[] split(String regexp, int limit);

// We can call split() without limit also
public String[] split(String regexp);

Java `

// A Java program for splitting a string
// using split()
public class Test {
    public static void main(String args[])
    {
        String str = "Geeks-for-Geeks";

        // Split above string in at most two strings
        for (String val : str.split("-", 2))
            System.out.println(val);

        System.out.println("");

        // Splits str into all possible tokens
        for (String val : str.split("-"))
            System.out.println(val);
    }
}

`

**Output:**

Geeks
for-Geeks

Geeks
for
Geeks

**Time complexity:** O(n)
**Auxiliary Space:** O(n), for the returned array of substrings

**In Python:**
The split() method in Python returns a list of strings after breaking the given string by the specified separator.

// sep is the delimiter string (a plain string, not a regular expression);
// if it is omitted or None, the string is split on runs of whitespace.
// maxsplit is the maximum number of splits to be made (-1 means no limit).
str.split(sep=None, maxsplit=-1)

Python `

line = "Geek1 \nGeek2 \nGeek3"
print(line.split())
print(line.split(' ', 1))

`

**Output:**

['Geek1', 'Geek2', 'Geek3']
['Geek1', '\nGeek2 \nGeek3']
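The two calls above behave differently for a reason that is easy to miss, so here is a small sketch of it. With no separator, split() treats any run of whitespace as one delimiter. With an explicit separator, every occurrence counts, so consecutive separators yield empty strings. The second argument (named maxsplit in Python's documentation) counts splits rather than pieces, unlike the limit of 2 used in the Java example earlier. The sample strings are made up for illustration.

```python
text = "a  b c"  # note the two spaces between "a" and "b"

# No separator: runs of whitespace act as a single delimiter.
print(text.split())        # ['a', 'b', 'c']

# Explicit separator: each space is a delimiter, so the two
# consecutive spaces produce an empty string between them.
print(text.split(" "))     # ['a', '', 'b', 'c']

# One split gives at most two pieces.
print("Geeks-for-Geeks".split("-", 1))   # ['Geeks', 'for-Geeks']
```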

**Time Complexity:** O(N), since split() makes a single pass over the string looking for separators.

**Auxiliary Space:** O(N), since the returned list holds copies of the tokens.
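Unlike Java's split(), Python's str.split() treats its separator as a plain string. When a regular expression is needed, for example to split on any of several delimiter characters, the standard library's re.split() can be used instead. The following is a minimal sketch, and the input string is a made-up example.

```python
import re

# Hypothetical input that mixes three different delimiters.
line = "Geeks-for,Geeks;portal"

# The character class [-,;] matches any one of the three delimiters.
tokens = re.split(r"[-,;]", line)
print(tokens)   # ['Geeks', 'for', 'Geeks', 'portal']

# maxsplit works as in str.split(): it limits the number of splits.
first, rest = re.split(r"[-,;]", line, maxsplit=1)
print(first)    # Geeks
print(rest)     # for,Geeks;portal
```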

This article is contributed by **Aarti_Rathi** and **Aditya Chatterjee**.