String hashing using Polynomial rolling hash function


Given a string **str** of length **n**, your task is to find its hash value using a polynomial rolling hash function.

**Note:** If two strings are equal, their hash values must also be equal. The converse need not hold: two different strings can share the same hash value.

**Examples:**

**Input:** str = "geeksforgeeks"
**Output:** 609871790

**Input:** str = "polynomial"
**Output:** 948934983

What is a Hash Function?

A hash function is a function that maps data of arbitrary size to fixed-size values. The values returned by the function are called hash values or digests.

There are many popular hash functions, such as DJBX33A, MD5, and SHA-256. This article discusses the key features, implementation, advantages, and drawbacks of the polynomial rolling hash function.

The Polynomial Rolling Hash Function

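A polynomial rolling hash function is a hash function that uses only multiplications and additions. It is defined as:
\text{hash}(s) = \left(s[0] + s[1]\cdot p + s[2]\cdot p^2 + \dots + s[n - 1]\cdot p^{n - 1}\right) \bmod m
or simply,
\text{hash}(s) = \left(\displaystyle\sum_{i = 0}^{n - 1} s[i]\cdot p^i\right) \bmod m

Where s is the input string of length n and s[i] is the integer value assigned to its i-th character (the implementations below map 'a' to 1, 'b' to 2, and so on). p and m are positive integers; the implementations below use the prime p = 31 and the large prime m = 10^9 + 7 . The hash value is an integer in the range [0, m) .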
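As a small worked example of the formula, take the string "abc" and assume the character mapping used by the implementations in this article ('a' = 1, 'b' = 2, 'c' = 3) together with p = 31 and m = 10^9 + 7:

```latex
\text{hash}(\text{``abc''}) = \left(1\cdot 31^0 + 2\cdot 31^1 + 3\cdot 31^2\right) \bmod (10^9 + 7) = (1 + 62 + 2883) \bmod (10^9 + 7) = 2946
```

The sum is far below m here, so the modulo has no effect. It only starts to matter for longer strings, where the powers of p grow past m.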

Below is the **implementation** of the Polynomial Rolling Hash Function:

C++ `

#include <bits/stdc++.h>
using namespace std;
#define int long long

// Function to find hash of a string
int findHash(string &s) {
int n = s.length();

// p is a prime number
// m is a large prime number
int p = 31, m = 1e9 + 7;

// to store hash value
int hashVal = 0;

// to store p^i
int pPow = 1;

// Calculating hash value
for (int i = 0; i < n; ++i) {
    hashVal = (hashVal + (s[i] - 'a' + 1) * pPow) % m;
    pPow = (pPow * p) % m;
}
return hashVal;

}

signed main() {
    string s = "geeksforgeeks";
    cout << findHash(s);
    return 0;
}

Java

class GfG {

// Function to find hash of a string
static long findHash(String s) {
    int n = s.length();

    // p is a prime number
    // m is a large prime number
    long p = 31, m = (long) 1e9 + 7;

    // to store hash value
    long hashVal = 0;

    // to store p^i
    long pPow = 1;

    // Calculating hash value
    for (int i = 0; i < n; ++i) {
        hashVal = (hashVal + (s.charAt(i) - 'a' + 1) * pPow) % m;
        pPow = (pPow * p) % m;
    }
    return hashVal;
}

public static void main(String[] args) {
    String s = "geeksforgeeks";
    System.out.println(findHash(s));
}

}

Python

# Function to find hash of a string
def findHash(s):
    n = len(s)

    # p is a prime number
    # m is a large prime number
    p = 31
    m = int(1e9 + 7)

    # to store hash value
    hashVal = 0

    # to store p^i
    pPow = 1

    # Calculating hash value
    for i in range(n):
        hashVal = (hashVal + (ord(s[i]) - ord('a') + 1) * pPow) % m
        pPow = (pPow * p) % m
    return hashVal


s = "geeksforgeeks"
print(findHash(s))

C#

using System;

class GfG {

// Function to find hash of a string
static long findHash(string s) {
    int n = s.Length;

    // p is a prime number
    // m is a large prime number
    long p = 31, m = (long)1e9 + 7;

    // to store hash value
    long hashVal = 0;

    // to store p^i
    long pPow = 1;

    // Calculating hash value
    for (int i = 0; i < n; ++i) {
        hashVal = (hashVal + (s[i] - 'a' + 1) * pPow) % m;
        pPow = (pPow * p) % m;
    }
    return hashVal;
}

public static void Main() {
    string s = "geeksforgeeks";
    Console.WriteLine(findHash(s));
}

}

JavaScript

// Function to find hash of a string
function findHash(s) {
let n = s.length;

// p is a prime number
// m is a large prime number
let p = 31, m = 1e9 + 7;

// to store hash value
let hashVal = 0;

// to store p^i
let pPow = 1;

// Calculating hash value
for (let i = 0; i < n; ++i) {
    hashVal = (hashVal + (s.charCodeAt(i) - 'a'.charCodeAt(0) + 1) * pPow) % m;
    pPow = (pPow * p) % m;
}
return hashVal;

}

let s = "geeksforgeeks";
console.log(findHash(s));

`

**Time Complexity:** O(n)
**Auxiliary Space:** O(1)

Collisions in Polynomial Rolling Hash

Since the output of the hash function is an integer in the range [0, m) while the number of possible strings is unbounded, two different strings can produce the same hash value. Such a pair is called a collision.

For instance, the strings \text{``countermand''} and \text{``furnace''} produce the same hash value for p = 31 and m = 10^9 + 7 .

Also, the strings \text{``answers''} and \text{``stead''} produce the same hash value for p = 37 and m = 10^9 + 9 .

A collision is guaranteed to exist even within a very small domain. Consider a set of strings, S , consisting of only lower-case letters, such that the length of any string in S doesn't exceed 7 .

We have |S| = (26 + 26^2 + 26^3 + 26^4 + 26^5 + 26^6 + 26^7) = 8353082582\gt 10^9 + 7 . Since the range of the hash function is [0, m) , a one-to-one mapping is impossible. Hence, by the pigeonhole principle, at least two different strings of length at most 7 must share the same hash value.
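The counting argument above can be checked with a few lines of Python. This is only a sanity check of the arithmetic; the constant names are illustrative and not part of the original article.

```python
# Count all lowercase strings of length 1..7 and compare with the modulus.
ALPHABET_SIZE = 26
MAX_LEN = 7
M = 10**9 + 7

total_strings = sum(ALPHABET_SIZE**k for k in range(1, MAX_LEN + 1))

print(total_strings)      # 8353082582
print(total_strings > M)  # True: more strings than possible hash values
```

Because there are more strings than hash values, some hash value must be shared by at least two of them. The check says nothing about which strings collide.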

Collision Resolution

The value of m affects the chances of collision: for a well-chosen p , the probability that two different strings collide is roughly \cfrac{1}{m} . Increasing m reduces this probability, but it also slows the algorithm down. In languages with fixed-width integer types (C, C++, Java), the product of two values below m must fit in the integer type, so m cannot be made arbitrarily large without resorting to slower multi-word arithmetic.

Then how can we minimise the chances of a collision?

Note that the hash of a string depends on two parameters: p and m .

We have seen that the strings \text{``countermand''} and \text{``furnace''} produce the same hash value for p = 31 and m = 10^9 + 7 . But for p = 37 and m = 10^9 + 9 , they produce different hashes.
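The following sketch is one way to check the collision claims above empirically. It assumes the same character mapping ('a' = 1) as the implementations in this article; the helper names `poly_hash` and `double_hash` are illustrative. No expected output is stated here, since the point is to let the reader verify the examples themselves.

```python
def poly_hash(s, p, m):
    """Polynomial rolling hash of a lowercase string s with base p and modulus m."""
    h, p_pow = 0, 1
    for ch in s:
        h = (h + (ord(ch) - ord('a') + 1) * p_pow) % m
        p_pow = (p_pow * p) % m
    return h


def double_hash(s):
    """Pair of hashes under the two parameter sets used in this article."""
    return (poly_hash(s, 31, 10**9 + 7), poly_hash(s, 37, 10**9 + 9))


for a, b in [("countermand", "furnace"), ("answers", "stead")]:
    ha, hb = double_hash(a), double_hash(b)
    print(a, b,
          "| equal under (31, 1e9+7):", ha[0] == hb[0],
          "| equal under (37, 1e9+9):", ha[1] == hb[1])
```

Two strings are treated as equal only if both components of the pair match, which is the idea developed in the rest of this section.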

Observation

If two different strings produce the same hash value for a pair (p1, m1) , they will very likely produce different hash values for a different pair, (p2, m2) .

Strategy

We cannot eliminate collisions entirely, because there are infinitely many strings and only finitely many hash values. We can, however, reduce the probability of two strings colliding.

We can reduce the probability of collision by generating a pair of hashes for a given string. The first hash is generated using p = 31 and m = 10^9 + 7 , while the second hash is generated using p = 37 and m = 10^9 + 9 .

Why will this work?

We are generating two hashes using two different modulo values, m1 and m2 . Assuming the two hashes behave independently, the probability that both collide at the same time is about \cfrac{1}{m1} \times \cfrac{1}{m2} . Since both m1 and m2 are greater than 10^9 , this probability is less than \displaystyle10^{-18} , far smaller than the original collision probability of about 10^{-9} .

Below is the **implementation**:

C++ `

#include <bits/stdc++.h>
using namespace std;
#define int long long

// Function to find the first hash of a string
int findHash1(string &s) {
int n = s.length();

// p is a prime number
// m is a large prime number
int p = 31, m = 1e9 + 7;

// to store hash value
int hashVal = 0;

// to store p^i
int pPow = 1;

// Calculating hash value
for (int i = 0; i < n; ++i) {
    hashVal = (hashVal + (s[i] - 'a' + 1) * pPow) % m;
    pPow = (pPow * p) % m;
}
return hashVal;

}

// Function to find the second hash of a string
int findHash2(string &s) {
int n = s.length();

// p is a prime number
// m is a large prime number
int p = 37, m = 1e9 + 9;

// to store hash value
int hashVal = 0;

// to store p^i
int pPow = 1;

// Calculating hash value
for (int i = 0; i < n; ++i) {
    hashVal = (hashVal + (s[i] - 'a' + 1) * pPow) % m;
    pPow = (pPow * p) % m;
}
return hashVal;

}

signed main() {
    string s = "geeksforgeeks";
    cout << findHash1(s) << " " << findHash2(s);
    return 0;
}

Java

class GfG {

// Function to find hash of a string
static long findHash1(String s) {
    int n = s.length();

    // p is a prime number
    // m is a large prime number
    long p = 31, m = (long) 1e9 + 7;

    // to store hash value
    long hashVal = 0;

    // to store p^i
    long pPow = 1;

    // Calculating hash value
    for (int i = 0; i < n; ++i) {
        hashVal = (hashVal + (s.charAt(i) - 'a' + 1) * pPow) % m;
        pPow = (pPow * p) % m;
    }
    return hashVal;
}

// Function to find hash of a string
static long findHash2(String s) {
    int n = s.length();

    // p is a prime number
    // m is a large prime number
    long p = 37, m = (long) 1e9 + 9;

    // to store hash value
    long hashVal = 0;

    // to store p^i
    long pPow = 1;

    // Calculating hash value
    for (int i = 0; i < n; ++i) {
        hashVal = (hashVal + (s.charAt(i) - 'a' + 1) * pPow) % m;
        pPow = (pPow * p) % m;
    }
    return hashVal;
}

public static void main(String[] args) {
    String s = "geeksforgeeks";
    System.out.println(findHash1(s) + " " + findHash2(s));
}

}

Python

# Function to find the first hash of a string
def findHash1(s):
    n = len(s)

    # p is a prime number
    # m is a large prime number
    p = 31
    m = int(1e9 + 7)

    # to store hash value
    hashVal = 0

    # to store p^i
    pPow = 1

    # Calculating hash value
    for i in range(n):
        hashVal = (hashVal + (ord(s[i]) - ord('a') + 1) * pPow) % m
        pPow = (pPow * p) % m
    return hashVal


# Function to find the second hash of a string
def findHash2(s):
    n = len(s)

    # p is a prime number
    # m is a large prime number
    p = 37
    m = int(1e9 + 9)

    # to store hash value
    hashVal = 0

    # to store p^i
    pPow = 1

    # Calculating hash value
    for i in range(n):
        hashVal = (hashVal + (ord(s[i]) - ord('a') + 1) * pPow) % m
        pPow = (pPow * p) % m
    return hashVal


s = "geeksforgeeks"
print(findHash1(s), findHash2(s))

C#

using System;

class GfG {

// Function to find hash of a string
static long findHash1(string s) {
    int n = s.Length;

    // p is a prime number
    // m is a large prime number
    long p = 31, m = (long)1e9 + 7;

    // to store hash value
    long hashVal = 0;

    // to store p^i
    long pPow = 1;

    // Calculating hash value
    for (int i = 0; i < n; ++i) {
        hashVal = (hashVal + (s[i] - 'a' + 1) * pPow) % m;
        pPow = (pPow * p) % m;
    }
    return hashVal;
}

// Function to find hash of a string
static long findHash2(string s) {
    int n = s.Length;

    // p is a prime number
    // m is a large prime number
    long p = 37, m = (long)1e9 + 9;

    // to store hash value
    long hashVal = 0;

    // to store p^i
    long pPow = 1;

    // Calculating hash value
    for (int i = 0; i < n; ++i) {
        hashVal = (hashVal + (s[i] - 'a' + 1) * pPow) % m;
        pPow = (pPow * p) % m;
    }
    return hashVal;
}

public static void Main() {
    string s = "geeksforgeeks";
    Console.WriteLine(findHash1(s) + " " + findHash2(s));
}

}

JavaScript

// Function to find the first hash of a string
function findHash1(s) {
let n = s.length;

// p is a prime number
// m is a large prime number
let p = 31, m = 1e9 + 7;

// to store hash value
let hashVal = 0;

// to store p^i
let pPow = 1;

// Calculating hash value
for (let i = 0; i < n; ++i) {
    hashVal = (hashVal + (s.charCodeAt(i) - 'a'.charCodeAt(0) + 1) * pPow) % m;
    pPow = (pPow * p) % m;
}
return hashVal;

}

// Function to find the second hash of a string
function findHash2(s) {
let n = s.length;

// p is a prime number
// m is a large prime number
let p = 37, m = 1e9 + 9;

// to store hash value
let hashVal = 0;

// to store p^i
let pPow = 1;

// Calculating hash value
for (let i = 0; i < n; ++i) {
    hashVal = (hashVal + (s.charCodeAt(i) - 'a'.charCodeAt(0) + 1) * pPow) % m;
    pPow = (pPow * p) % m;
}
return hashVal;

}

let s = "geeksforgeeks";
console.log(findHash1(s) + " " + findHash2(s));

`

Output

609871790 642799661

**Time Complexity:** O(n)
**Auxiliary Space:** O(1)

Features of Polynomial rolling hash function

Note that computing the hash of the string S also computes the hashes of all of its prefixes. We just have to store the hash values of the prefixes while computing. Let \text{hash[0...i]} denote the hash of the prefix \text{S[0...i]} and \text{hash[i...j]} the hash of the substring \text{S[i...j]}. Then, for i \ge 1 , we have

\text{hash[i...j]}\cdot p^i \equiv \text{hash[0...j]} - \text{hash[0...(i - 1)]} \pmod m

Multiplying both sides by the modular inverse of p^i gives the hash of the substring \text{S[i...j]} in O(1), provided the inverse powers of p are precomputed.
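The substring identity can be sketched in Python as follows. This is a minimal single-hash version under stated assumptions: lowercase input, the article's parameters p = 31 and m = 10^9 + 7, and modular inverses obtained through Fermat's little theorem (valid because m is prime). The names `build` and `substring_hash` are illustrative.

```python
P, M = 31, 10**9 + 7  # same parameters as the article's first hash


def char_value(ch):
    return ord(ch) - ord('a') + 1


def build(s):
    """Return the prefix hashes of s and the inverse powers of P modulo M."""
    prefix, h, p_pow = [], 0, 1
    for ch in s:
        h = (h + char_value(ch) * p_pow) % M
        prefix.append(h)
        p_pow = (p_pow * P) % M
    inv_p = pow(P, M - 2, M)  # modular inverse of P, since M is prime
    inv_pow = [1] * len(s)
    for i in range(1, len(s)):
        inv_pow[i] = (inv_pow[i - 1] * inv_p) % M
    return prefix, inv_pow


def substring_hash(prefix, inv_pow, i, j):
    """Hash of s[i..j] (inclusive) in O(1) after preprocessing."""
    if i == 0:
        return prefix[j]
    return (prefix[j] - prefix[i - 1]) * inv_pow[i] % M


s = "geeksforgeeks"
prefix, inv_pow = build(s)
# Both occurrences of "geeks" (indices 0..4 and 8..12) get the same hash.
print(substring_hash(prefix, inv_pow, 0, 4) == substring_hash(prefix, inv_pow, 8, 12))  # True
```

Dividing by p^i normalizes every substring hash as if the substring started at index 0, which is what makes hashes of substrings at different positions directly comparable.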

Recall that the hash of a string s is given by

\text{hash}(s) = \left(\displaystyle\sum_{i = 0}^{n - 1} s[i]\cdot p^i\right) \bmod m

Say, we change a character ch1 at some index i to some other character ch2 . How will the hash change?

If \text{hash\_old} denotes the hash value before the change and \text{hash\_new} the hash value after it, then the relation between them is given by

\text{hash\_new} = \left(\text{hash\_old} - p^i\cdot ch1 + p^i\cdot ch2\right) \bmod m

where ch1 and ch2 stand for the integer values of the two characters. Therefore, such an update can be applied in O(1) instead of recalculating the hash from the beginning, provided we have the powers of p ready.
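The update rule can be sketched as below, again assuming lowercase input and the parameters p = 31, m = 10^9 + 7. The function name `update_hash` is illustrative, and `pow(P, i, M)` stands in for a lookup in a precomputed table of powers.

```python
P, M = 31, 10**9 + 7


def char_value(ch):
    return ord(ch) - ord('a') + 1


def poly_hash(s):
    """Hash of s computed from scratch in O(n)."""
    h, p_pow = 0, 1
    for ch in s:
        h = (h + char_value(ch) * p_pow) % M
        p_pow = (p_pow * P) % M
    return h


def update_hash(hash_old, i, ch1, ch2):
    """Hash after replacing character ch1 with ch2 at index i."""
    p_i = pow(P, i, M)  # would normally come from a precomputed table
    return (hash_old + (char_value(ch2) - char_value(ch1)) * p_i) % M


old, new = "geeksforgeeks", "geeksforpeeks"  # index 8 changes from 'g' to 'p'
updated = update_hash(poly_hash(old), 8, 'g', 'p')
print(updated == poly_hash(new))  # True
```

Python's `%` always returns a non-negative result, so no extra correction is needed when ch2 is smaller than ch1. In C++ or Java the difference would have to be brought back into the range [0, m) explicitly.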

Below is the **implementation**:

C++ `

#include <bits/stdc++.h>
using namespace std;
#define int long long

// Function to calculate (x^y) % p by binary exponentiation
int power(int x, int y, int p) {
    int result = 1;
    for(; y; y >>= 1, x = x * x % p) {
        if(y & 1) {
            result = result * x % p;
        }
    }
    return result;
}

// Function to calculate the modular inverse (p must be prime)
int inverse(int x, int p) {
    return power(x, p - 2, p);
}

// Function to precompute inverse powers
void preCompute(int len, int p, int mod, vector<int>& invPow) {

int invSize = 1;
while(invSize < len) {
    invSize <<= 1;
}

invPow.resize(invSize, -1);
invPow[invSize - 1] = 
inverse(power(p, invSize - 1, mod), mod);

for(int i = invSize - 2; i >= 0 
    && invPow[i] == -1; i--) {
    invPow[i] = (1LL * invPow[i + 1] * p) % mod;
}

}

// Function to compute hash values of a string
pair<vector<int>, vector<int>> computeHashes(string& s, int p1, int p2,
    int mod1, int mod2, vector<int>& invPow1, vector<int>& invPow2) {
int len = s.size();
vector<int> hash1(len), hash2(len);

int h1 = 0, h2 = 0;
int pPow1 = 1, pPow2 = 1;

for(int i = 0; i < len; i++) {
    h1 = (h1 + (s[i] - 'a' + 1) * pPow1) % mod1;
    h2 = (h2 + (s[i] - 'a' + 1) * pPow2) % mod2;
    pPow1 = (pPow1 * p1) % mod1;
    pPow2 = (pPow2 * p2) % mod2;
    hash1[i] = h1;
    hash2[i] = h2;
}

preCompute(len, p1, mod1, invPow1);
preCompute(len, p2, mod2, invPow2);

return {hash1, hash2};

}

// Function to compute hash of a substring
pair<int, int> getSubstringHash(vector<int>& hash1, vector<int>& hash2,
    int l, int r, vector<int>& invPow1, vector<int>& invPow2, int mod1, int mod2) {

if(l == 0) {
    return {hash1[r], hash2[r]};
}

int temp1 = hash1[r] - hash1[l - 1];
int temp2 = hash2[r] - hash2[l - 1];

temp1 += (temp1 < 0 ? mod1 : 0);
temp2 += (temp2 < 0 ? mod2 : 0);
temp1 = (temp1 * 1LL * invPow1[l]) % mod1;
temp2 = (temp2 * 1LL * invPow2[l]) % mod2;

return {temp1, temp2};

}

// Function to process the string and compute hashes
pair<int, int> findHash(string &str) {
int n = str.length();

int mod1 = 1e9 + 7, mod2 = 1e9 + 9;
int p1 = 31, p2 = 37;

vector<int> invPow1, invPow2;
pair<vector<int>, vector<int>> hashes = 
computeHashes(str, p1, p2, mod1, mod2, invPow1, invPow2);
auto hashPair = getSubstringHash(hashes.first, 
hashes.second, 0, n - 1, invPow1, invPow2, mod1, mod2);

return hashPair;

}

signed main() {
    string str = "geeksforgeeks";
    pair<int, int> hashPair = findHash(str);
    cout << hashPair.first << " " << hashPair.second;
    return 0;
}

Java

import java.util.*;

class GfG {

// Custom Pair class
static class Pair<F, S> {
    F first;
    S second;

    Pair(F first, S second) {
        this.first = first;
        this.second = second;
    }
}

// Function to calculate power
static long power(long x, long y, long p) {
    long result = 1;
    while(y > 0) {
        if((y & 1) == 1) {
            result = result * x % p;
        }
        x = x * x % p;
        y >>= 1;
    }
    return result;
}

// Function to calculate inverse
static long inverse(long x, long p) {
    return power(x, p - 2, p);
}

// Function to precompute inverse powers
static void preCompute(int len, int p, int mod, List<Long> invPow) {
    int invSize = 1;
    while(invSize < len) {
        invSize <<= 1;
    }

    while(invPow.size() < invSize) {
        invPow.add(-1L);
    }

    invPow.set(invSize - 1, inverse(power(p, invSize - 1, mod), mod));

    for(int i = invSize - 2; i >= 0 && invPow.get(i) == -1; i--) {
        invPow.set(i, (invPow.get(i + 1) * p) % mod);
    }
}

// Function to compute hash values of a string
static Pair<List<Long>, List<Long>> 
computeHashes(String s, int p1, int p2, 
int mod1, int mod2, List<Long> invPow1, List<Long> invPow2) {
    int len = s.length();
    List<Long> hash1 = new ArrayList<>(Collections.nCopies(len, 0L));
    List<Long> hash2 = new ArrayList<>(Collections.nCopies(len, 0L));

    long h1 = 0, h2 = 0;
    long pPow1 = 1, pPow2 = 1;

    for(int i = 0; i < len; i++) {
        h1 = (h1 + (s.charAt(i) - 'a' + 1) * pPow1) % mod1;
        h2 = (h2 + (s.charAt(i) - 'a' + 1) * pPow2) % mod2;
        pPow1 = (pPow1 * p1) % mod1;
        pPow2 = (pPow2 * p2) % mod2;
        hash1.set(i, h1);
        hash2.set(i, h2);
    }

    preCompute(len, p1, mod1, invPow1);
    preCompute(len, p2, mod2, invPow2);

    return new Pair<>(hash1, hash2);
}

// Function to compute hash of a substring
static Pair<Long, Long> getSubstringHash
(List<Long> hash1, List<Long> hash2, int l, int r, 
List<Long> invPow1, List<Long> invPow2, int mod1, int mod2) {

    if(l == 0) {
        return new Pair<>(hash1.get(r), hash2.get(r));
    }

    long temp1 = (hash1.get(r) - hash1.get(l - 1) + mod1) % mod1;
    long temp2 = (hash2.get(r) - hash2.get(l - 1) + mod2) % mod2;
    temp1 = (temp1 * invPow1.get(l)) % mod1;
    temp2 = (temp2 * invPow2.get(l)) % mod2;

    return new Pair<>(temp1, temp2);
}

// Function to process the string and compute hashes
static Pair<Long, Long> findHash(String str) {
    int n = str.length();

    int mod1 = (int)1e9 + 7, mod2 = (int)1e9 + 9;
    int p1 = 31, p2 = 37;

    List<Long> invPow1 = new ArrayList<>();
    List<Long> invPow2 = new ArrayList<>();
    Pair<List<Long>, List<Long>> hashes = 
    computeHashes(str, p1, p2, mod1, mod2, invPow1, invPow2);
    return getSubstringHash(hashes.first, hashes.second, 0, n - 1, invPow1, invPow2, mod1, mod2);
}

public static void main(String[] args) {
    String str = "geeksforgeeks";
    Pair<Long, Long> hashPair = findHash(str);
    System.out.println(hashPair.first + " " + hashPair.second);
}

}

Python

# Function to calculate power
def power(x, y, p):
    result = 1
    while y:
        if y & 1:
            result = result * x % p
        x = x * x % p
        y >>= 1
    return result


# Function to calculate inverse
def inverse(x, p):
    return power(x, p - 2, p)


# Function to precompute inverse powers
def preCompute(length, p, mod):
    invSize = 1
    while invSize < length:
        invSize <<= 1

    invPow = [-1] * invSize
    invPow[invSize - 1] = inverse(power(p, invSize - 1, mod), mod)

    for i in range(invSize - 2, -1, -1):
        if invPow[i] == -1:
            invPow[i] = (invPow[i + 1] * p) % mod

    return invPow


# Function to compute hash values of a string
def computeHashes(s, p1, p2, mod1, mod2):
    length = len(s)
    hash1 = [0] * length
    hash2 = [0] * length

    h1, h2 = 0, 0
    pPow1, pPow2 = 1, 1

    for i in range(length):
        h1 = (h1 + (ord(s[i]) - ord('a') + 1) * pPow1) % mod1
        h2 = (h2 + (ord(s[i]) - ord('a') + 1) * pPow2) % mod2
        pPow1 = (pPow1 * p1) % mod1
        pPow2 = (pPow2 * p2) % mod2
        hash1[i] = h1
        hash2[i] = h2

    invPow1 = preCompute(length, p1, mod1)
    invPow2 = preCompute(length, p2, mod2)

    return hash1, hash2, invPow1, invPow2


# Function to compute hash of a substring
def getSubstringHash(hash1, hash2, l, r, invPow1, invPow2, mod1, mod2):
    if l == 0:
        return hash1[r], hash2[r]

    temp1 = (hash1[r] - hash1[l - 1]) % mod1
    temp2 = (hash2[r] - hash2[l - 1]) % mod2
    temp1 = (temp1 * invPow1[l]) % mod1
    temp2 = (temp2 * invPow2[l]) % mod2

    return temp1, temp2


# Function to process the string and compute hashes
def findHash(str):
    n = len(str)
    mod1, mod2 = int(1e9 + 7), int(1e9 + 9)
    p1, p2 = 31, 37

    hash1, hash2, invPow1, invPow2 = computeHashes(str, p1, p2, mod1, mod2)
    return getSubstringHash(hash1, hash2, 0, n - 1, invPow1, invPow2, mod1, mod2)


# Main function
str = "geeksforgeeks"
hashPair = findHash(str)
print(hashPair[0], hashPair[1])

C#

using System;
using System.Collections.Generic;

class GfG {

// Function to calculate power
static long Power(long x, long y, long p) {
    long result = 1;
    for(; y > 0; y >>= 1, x = x * x % p) {
        if((y & 1) == 1) {
            result = result * x % p;
        }
    }
    return result;
}

// Function to calculate inverse
static long Inverse(long x, long p) {
    return Power(x, p - 2, p);
}

// Function to precompute inverse powers
static void PreCompute(int len, int p, int mod, List<long> invPow) {
    int invSize = 1;
    while(invSize < len) {
        invSize <<= 1;
    }

    while(invPow.Count < invSize) {
        invPow.Add(-1);
    }

    invPow[invSize - 1] = Inverse(Power(p, invSize - 1, mod), mod);

    for(int i = invSize - 2; i >= 0 && invPow[i] == -1; i--) {
        invPow[i] = (invPow[i + 1] * p) % mod;
    }
}

// Function to compute hash values of a string
static Tuple<List<long>, List<long>> ComputeHashes(string s, int p1, int p2, int mod1, int mod2, 
                                                   List<long> invPow1, List<long> invPow2) {
    int len = s.Length;
    List<long> hash1 = new List<long>(new long[len]);
    List<long> hash2 = new List<long>(new long[len]);

    long h1 = 0, h2 = 0;
    long pPow1 = 1, pPow2 = 1;

    for(int i = 0; i < len; i++) {
        h1 = (h1 + (s[i] - 'a' + 1) * pPow1) % mod1;
        h2 = (h2 + (s[i] - 'a' + 1) * pPow2) % mod2;
        pPow1 = (pPow1 * p1) % mod1;
        pPow2 = (pPow2 * p2) % mod2;
        hash1[i] = h1;
        hash2[i] = h2;
    }

    PreCompute(len, p1, mod1, invPow1);
    PreCompute(len, p2, mod2, invPow2);

    return new Tuple<List<long>, List<long>>(hash1, hash2);
}

// Function to compute hash of a substring
static Tuple<long, long> GetSubstringHash(List<long> hash1, List<long> hash2, int l, int r, 
                                          List<long> invPow1, List<long> invPow2, int mod1, int mod2) {

    if(l == 0) {
        return new Tuple<long, long>(hash1[r], hash2[r]);
    }

    long temp1 = (hash1[r] - hash1[l - 1] + mod1) % mod1;
    long temp2 = (hash2[r] - hash2[l - 1] + mod2) % mod2;
    temp1 = (temp1 * invPow1[l]) % mod1;
    temp2 = (temp2 * invPow2[l]) % mod2;

    return new Tuple<long, long>(temp1, temp2);
}

// Function to process the string and compute hashes
static Tuple<long, long> FindHash(string str) {
    int n = str.Length;

    int mod1 = (int)1e9 + 7, mod2 = (int)1e9 + 9;
    int p1 = 31, p2 = 37;

    List<long> invPow1 = new List<long>();
    List<long> invPow2 = new List<long>();
    var hashes = ComputeHashes(str, p1, p2, mod1, mod2, invPow1, invPow2);
    return GetSubstringHash(hashes.Item1, hashes.Item2, 0, n - 1, invPow1, invPow2, mod1, mod2);
}

public static void Main(string[] args) {
    string str = "geeksforgeeks";
    var hashPair = FindHash(str);
    Console.WriteLine(hashPair.Item1 + " " + hashPair.Item2);
}

}

JavaScript

// Function to calculate power
// BigInt is used because the product of two residues can exceed 2^53
function power(x, y, p) {
    let result = 1n;
    let base = BigInt(x) % BigInt(p);
    const mod = BigInt(p);
    while(y > 0) {
        if(y & 1) {
            result = result * base % mod;
        }
        base = base * base % mod;
        y >>= 1;
    }
    return Number(result);
}

// Function to calculate inverse
function inverse(x, p) {
    return power(x, p - 2, p);
}

// Function to precompute inverse powers
function preCompute(length, p, mod) {
let invSize = 1;
while(invSize < length) {
    invSize <<= 1;
}

let invPow = Array(invSize).fill(-1);
invPow[invSize - 1] = inverse(power(p, invSize - 1, mod), mod);

for(let i = invSize - 2; i >= 0; i--) {
    if(invPow[i] === -1) {
        invPow[i] = (invPow[i + 1] * p) % mod;
    }
}

return invPow;

}

// Function to compute hash values of a string
function computeHashes(s, p1, p2, mod1, mod2) {
let length = s.length;
let hash1 = new Array(length).fill(0);
let hash2 = new Array(length).fill(0);

let h1 = 0, h2 = 0;
let pPow1 = 1, pPow2 = 1;

for(let i = 0; i < length; i++) {
    h1 = (h1 + (s.charCodeAt(i) - 'a'.charCodeAt(0) + 1) * pPow1) % mod1;
    h2 = (h2 + (s.charCodeAt(i) - 'a'.charCodeAt(0) + 1) * pPow2) % mod2;
    pPow1 = (pPow1 * p1) % mod1;
    pPow2 = (pPow2 * p2) % mod2;
    hash1[i] = h1;
    hash2[i] = h2;
}

let invPow1 = preCompute(length, p1, mod1);
let invPow2 = preCompute(length, p2, mod2);

return { hash1, hash2, invPow1, invPow2 };

}

// Function to compute hash of a substring
function getSubstringHash(hash1, hash2, l, r, invPow1, invPow2, mod1, mod2) {
if(l === 0) {
    return [hash1[r], hash2[r]];
}

let temp1 = (hash1[r] - hash1[l - 1] + mod1) % mod1;
let temp2 = (hash2[r] - hash2[l - 1] + mod2) % mod2;
// BigInt avoids precision loss: the product can exceed 2^53
temp1 = Number(BigInt(temp1) * BigInt(invPow1[l]) % BigInt(mod1));
temp2 = Number(BigInt(temp2) * BigInt(invPow2[l]) % BigInt(mod2));

return [temp1, temp2];

}

// Function to process the string and compute hashes
function findHash(str) {
let n = str.length;

let mod1 = 1e9 + 7, mod2 = 1e9 + 9;
let p1 = 31, p2 = 37;

let { hash1, hash2, invPow1, invPow2 } = computeHashes(str, p1, p2, mod1, mod2);
return getSubstringHash(hash1, hash2, 0, n - 1, invPow1, invPow2, mod1, mod2);

}

// Main function
let str = "geeksforgeeks";
let hashPair = findHash(str);
console.log(hashPair[0], hashPair[1]);

`

Output

609871790 642799661

Applications

Given a sequence S of N strings and Q queries. In each query, you are given two indices, i and j, your task is to find the length of the longest common prefix of the strings S[i] and S[j].

Before getting into the approach to solve this problem, note that the constraints are:

1\le N \le 10^5\\ 1\le Q \le 10^5\\ 1\le |S| \le 10^5\\ \text{The Sum of |S| over all test cases doesn't exceed } 10^6

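Using hashing, the problem can be solved in O(L + Q\cdot\log|S|_{max}) time, where L is the total length of all the strings. The approach is to compute the prefix hashes of all the strings in time linear in their total length. Then, for each query, we binary search the length of the longest common prefix, comparing two prefix hashes in O(1) at each step.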
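The approach described above can be sketched compactly in Python. This version stores a pair of prefix hashes per position, using the article's two parameter sets, and binary searches on the prefix length. The function names are illustrative, and the result is probabilistic in the sense that a hash collision could in principle overstate the answer.

```python
P1, M1 = 31, 10**9 + 7
P2, M2 = 37, 10**9 + 9


def prefix_hashes(s):
    """List of (hash1, hash2) pairs; entry k is the hash of s[0..k]."""
    out, h1, h2, pw1, pw2 = [], 0, 0, 1, 1
    for ch in s:
        v = ord(ch) - ord('a') + 1
        h1 = (h1 + v * pw1) % M1
        h2 = (h2 + v * pw2) % M2
        pw1 = (pw1 * P1) % M1
        pw2 = (pw2 * P2) % M2
        out.append((h1, h2))
    return out


def lcp_length(ha, hb):
    """Longest common prefix length, given two prefix-hash lists."""
    lo, hi = 0, min(len(ha), len(hb))  # the answer lies in [lo, hi]
    while lo < hi:
        mid = (lo + hi + 1) // 2       # candidate prefix length
        if ha[mid - 1] == hb[mid - 1]:
            lo = mid                   # prefixes of length mid match
        else:
            hi = mid - 1
    return lo


strs = ["geeksforgeeks", "geeks", "hell", "geeksforpeaks", "hello"]
hashes = [prefix_hashes(s) for s in strs]
queries = [(1, 2), (1, 3), (3, 5), (1, 4)]  # 1-based indices
answers = [lcp_length(hashes[i - 1], hashes[j - 1]) for i, j in queries]
print(answers)  # [5, 0, 4, 8]
```

The binary search is valid because prefix equality is monotone: if the prefixes of length k match, so do all shorter ones. No inverse powers are needed here, since both prefixes start at index 0.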

Below is the **implementation**:

C++ `

#include <bits/stdc++.h>
using namespace std;
#define int long long

// Function to calculate (x^y) % p by binary exponentiation
int power(int x, int y, int p) {
    int result = 1;
    for(; y; y >>= 1, x = x * x % p) {
        if(y & 1) {
            result = result * x % p;
        }
    }
    return result;
}

// Function to calculate the modular inverse (p must be prime)
int inverse(int x, int p) {
    return power(x, p - 2, p);
}

class Hash {
private:
int len;
int mod1 = 1e9 + 7, mod2 = 1e9 + 9;
int p1 = 31, p2 = 37;
vector<int> hash1, hash2;
pair<int, int> hashPair;

public:
vector<int> invPow1, invPow2;
int invSize = 1;

Hash() {}

Hash(string& s) {
    len = s.size();
    hash1.resize(len);
    hash2.resize(len);

    int h1 = 0, h2 = 0;
    int pow1 = 1, pow2 = 1;
    for(int i = 0; i < len; i++) {
        h1 = (h1 + (s[i] - 'a' + 1) * pow1) % mod1;
        h2 = (h2 + (s[i] - 'a' + 1) * pow2) % mod2;
        pow1 = (pow1 * p1) % mod1;
        pow2 = (pow2 * p2) % mod2;
        hash1[i] = h1;
        hash2[i] = h2;
    }
    hashPair = make_pair(h1, h2);

    if(invSize < len) {
        for(; invSize < len; invSize <<= 1);
        
        invPow1.resize(invSize, -1);
        invPow2.resize(invSize, -1);

        invPow1[invSize - 1] = 
        inverse(power(p1, invSize - 1, mod1), mod1);
        invPow2[invSize - 1] = 
        inverse(power(p2, invSize - 1, mod2), mod2);
        
        for(int i = invSize - 2; 
            i >= 0 && invPow1[i] == -1; i--) {
            invPow1[i] = (1LL * invPow1[i + 1] * p1) % mod1;
            invPow2[i] = (1LL * invPow2[i + 1] * p2) % mod2;
        }
    }
}

int size() {
    return len;
}

pair<int, int> prefix(int index) {
    return {hash1[index], hash2[index]};
}

pair<int, int> substr(int l, int r) {
    if(l == 0) {
        return {hash1[r], hash2[r]};
    }
    int temp1 = hash1[r] - hash1[l - 1];
    int temp2 = hash2[r] - hash2[l - 1];
    temp1 += (temp1 < 0 ? mod1 : 0);
    temp2 += (temp2 < 0 ? mod2 : 0);
    temp1 = (temp1 * 1LL * invPow1[l]) % mod1;
    temp2 = (temp2 * 1LL * invPow2[l]) % mod2;
    return {temp1, temp2};
}

bool operator==(Hash& other) {
    return (hashPair == other.hashPair);
}

};

int query(vector<Hash>& hashes, int n, pair<int,int> &query) {
    int i = query.first, j = query.second;
    i--, j--;
    // Binary search over the last index of the common prefix;
    // the upper bound is the last valid index of the shorter string
    int lb = 0, ub = min(hashes[i].size(), hashes[j].size()) - 1;
    int max_length = 0;
    while(lb <= ub) {
        int mid = (lb + ub) >> 1;
        if(hashes[i].prefix(mid) == hashes[j].prefix(mid)) {
            if(mid + 1 > max_length) {
                max_length = mid + 1;
            }
            lb = mid + 1;
        }
        else {
            ub = mid - 1;
        }
    }
    return max_length;
}

signed main() {
    int n = 5, q = 4;
    vector<string> strs = {"geeksforgeeks", "geeks", "hell", "geeksforpeaks", "hello"};
    vector<Hash> hashes;
    for(int i = 0; i < n; i++) {
        hashes.push_back(Hash(strs[i]));
    }
    vector<pair<int,int>> queries = {{1, 2}, {1, 3}, {3, 5}, {1, 4}};
    for(int i = 0; i < q; i++) {
        cout << query(hashes, n, queries[i]) << " ";
    }
    return 0;
}

Java

import java.util.*;
import java.io.*;

public class GfG {

static long power(long x, long y, long p) {
    long result = 1;
    for(; y != 0; y >>= 1, x = x * x % p) {
        if((y & 1) != 0) {
            result = result * x % p;
        }
    }
    return result;
}

static long inverse(long x, long p) {
    return power(x, p - 2, p);
}

static class Pair {
    public long first, second;

    public Pair(long first, long second) {
        this.first = first;
        this.second = second;
    }

    public boolean equals(Object o) {
        if(this == o) return true;
        if(o == null || getClass() != o.getClass()) return false;
        Pair pair = (Pair) o;
        return first == pair.first && second == pair.second;
    }
}

static class Hash {
    private long len;
    private long mod1 = (long)1e9 + 7, mod2 = (long)1e9 + 9;
    private long p1 = 31, p2 = 37;
    private long[] hash1, hash2;
    private Pair hashPair;

    public ArrayList<Long> invPow1, invPow2;
    public long invSize = 1;
    
    public Hash() {}

    public Hash(String s) {
        len = s.length();
        hash1 = new long[(int)len];
        hash2 = new long[(int)len];

        long h1 = 0, h2 = 0;
        long pow1 = 1, pow2 = 1;
        for(int i = 0; i < len; i++) {
            h1 = (h1 + (s.charAt(i) - 'a' + 1) * pow1) % mod1;
            h2 = (h2 + (s.charAt(i) - 'a' + 1) * pow2) % mod2;
            pow1 = (pow1 * p1) % mod1;
            pow2 = (pow2 * p2) % mod2;
            hash1[i] = h1;
            hash2[i] = h2;
        }
        hashPair = new Pair(h1, h2);

        if(invSize < len) {
            while(invSize < len) {
                invSize <<= 1;
            }
            
            invPow1 = new ArrayList<Long>(Collections.nCopies((int)invSize, -1L));
            invPow2 = new ArrayList<Long>(Collections.nCopies((int)invSize, -1L));

            invPow1.set((int)invSize - 1, inverse(power(p1, invSize - 1, mod1), mod1));
            invPow2.set((int)invSize - 1, inverse(power(p2, invSize - 1, mod2), mod2));
            
            for(int i = (int)invSize - 2; i >= 0 && invPow1.get(i) == -1; i--) {
                invPow1.set(i, (invPow1.get(i + 1) * p1) % mod1);
                invPow2.set(i, (invPow2.get(i + 1) * p2) % mod2);
            }
        }
    }

    long size() {
        return len;
    }

    Pair prefix(int index) {
        return new Pair(hash1[index], hash2[index]);
    }

    Pair substr(int l, int r) {
        if(l == 0) {
            return new Pair(hash1[r], hash2[r]);
        }
        long temp1 = hash1[r] - hash1[l - 1];
        long temp2 = hash2[r] - hash2[l - 1];
        temp1 += (temp1 < 0 ? mod1 : 0);
        temp2 += (temp2 < 0 ? mod2 : 0);
        temp1 = (temp1 * invPow1.get(l)) % mod1;
        temp2 = (temp2 * invPow2.get(l)) % mod2;
        return new Pair(temp1, temp2);
    }

    public boolean equals(Hash other) {
        return (hashPair.equals(other.hashPair));
    }
}

static long query(ArrayList<Hash> hashes, long n, Pair query) {
    int i = (int)query.first, j = (int)query.second;
    i--; j--;
    int lb = 0;
    int ub = (int)Math.min(hashes.get(i).size(), hashes.get(j).size()) - 1;
    int maxLength = 0;
    while(lb <= ub) {
        int mid = (lb + ub) >> 1;
        if(hashes.get(i).prefix(mid).equals(hashes.get(j).prefix(mid))) {
            if(mid + 1 > maxLength) {
                maxLength = mid + 1;
            }
            lb = mid + 1;
        }
        else {
            ub = mid - 1;
        }
    }
    return maxLength;
}

public static void main(String[] args) throws Exception {
    long n = 5, q = 4;
    String[] strs = {"geeksforgeeks", "geeks", "hell", "geeksforpeaks", "hello"};
    ArrayList<Hash> hashes = new ArrayList<>();
    for(int i = 0; i < n; i++) {
        hashes.add(new Hash(strs[i]));
    }
    ArrayList<Pair> queries = new ArrayList<>();
    queries.add(new Pair(1, 2));
    queries.add(new Pair(1, 3));
    queries.add(new Pair(3, 5));
    queries.add(new Pair(1, 4));
    for(int i = 0; i < q; i++) {
        System.out.print(query(hashes, n, queries.get(i)) + " ");
    }
}

}

Python

from __future__ import print_function


def power(x, y, p):
    # Binary exponentiation: computes x^y mod p
    result = 1
    while y:
        if y & 1:
            result = result * x % p
        x = x * x % p
        y >>= 1
    return result


def inverse(x, p):
    # Modular inverse via Fermat's little theorem (p must be prime)
    return power(x, p - 2, p)


class Pair:
    def __init__(self, first, second):
        self.first = first
        self.second = second

    def __eq__(self, other):
        return self.first == other.first and self.second == other.second


class Hash:
    def __init__(self, s=None):
        if s is None:
            return
        self.len = len(s)
        self.mod1 = int(1e9 + 7)
        self.mod2 = int(1e9 + 9)
        self.p1 = 31
        self.p2 = 37
        self.hash1 = [0] * self.len
        self.hash2 = [0] * self.len

        # Prefix hashes: hash1[i], hash2[i] cover s[0..i]
        h1 = 0
        h2 = 0
        pow1 = 1
        pow2 = 1
        for i in range(self.len):
            h1 = (h1 + (ord(s[i]) - ord('a') + 1) * pow1) % self.mod1
            h2 = (h2 + (ord(s[i]) - ord('a') + 1) * pow2) % self.mod2
            pow1 = (pow1 * self.p1) % self.mod1
            pow2 = (pow2 * self.p2) % self.mod2
            self.hash1[i] = h1
            self.hash2[i] = h2
        self.hashPair = Pair(h1, h2)

        # Inverse powers of p1 and p2, filled from the highest index down
        self.invSize = 1
        if self.invSize < self.len:
            while self.invSize < self.len:
                self.invSize <<= 1

            self.invPow1 = [-1] * self.invSize
            self.invPow2 = [-1] * self.invSize

            self.invPow1[self.invSize - 1] = inverse(power(self.p1, self.invSize - 1, self.mod1), self.mod1)
            self.invPow2[self.invSize - 1] = inverse(power(self.p2, self.invSize - 1, self.mod2), self.mod2)

            for i in range(self.invSize - 2, -1, -1):
                if self.invPow1[i] == -1:
                    self.invPow1[i] = (self.invPow1[i + 1] * self.p1) % self.mod1
                    self.invPow2[i] = (self.invPow2[i + 1] * self.p2) % self.mod2

    def size(self):
        return self.len

    def prefix(self, index):
        return Pair(self.hash1[index], self.hash2[index])

    def substr(self, l, r):
        if l == 0:
            return Pair(self.hash1[r], self.hash2[r])
        temp1 = self.hash1[r] - self.hash1[l - 1]
        temp2 = self.hash2[r] - self.hash2[l - 1]
        temp1 += (self.mod1 if temp1 < 0 else 0)
        temp2 += (self.mod2 if temp2 < 0 else 0)
        temp1 = (temp1 * self.invPow1[l]) % self.mod1
        temp2 = (temp2 * self.invPow2[l]) % self.mod2
        return Pair(temp1, temp2)

    def __eq__(self, other):
        return self.hashPair == other.hashPair


def query(hashes, n, query):
    # Binary search on the length of the longest common prefix
    i, j = query.first, query.second
    i -= 1
    j -= 1
    lb = 0
    ub = min(hashes[i].size(), hashes[j].size()) - 1
    max_length = 0
    while lb <= ub:
        mid = (lb + ub) >> 1
        if hashes[i].prefix(mid) == hashes[j].prefix(mid):
            if mid + 1 > max_length:
                max_length = mid + 1
            lb = mid + 1
        else:
            ub = mid - 1
    return max_length


if __name__ == "__main__":
    n = 5
    q = 4
    strs = ["geeksforgeeks", "geeks", "hell", "geeksforpeaks", "hello"]
    hashes = []
    for i in range(n):
        hashes.append(Hash(strs[i]))
    queries = [Pair(1, 2), Pair(1, 3), Pair(3, 5), Pair(1, 4)]
    for i in range(q):
        print(query(hashes, n, queries[i]), end=" ")

C#

using System;
using System.Collections.Generic;

public class Pair {
    public long first, second;
    public Pair(long first, long second) {
        this.first = first;
        this.second = second;
    }
    public override bool Equals(object obj) {
        if(obj == null || GetType() != obj.GetType()) return false;
        Pair other = (Pair)obj;
        return first == other.first && second == other.second;
    }
    public override int GetHashCode() {
        return first.GetHashCode() ^ second.GetHashCode();
    }
}

public class GfG {

static long power(long x, long y, long p) {
    long result = 1;
    for(; y != 0; y >>= 1, x = x * x % p) {
        if((y & 1) != 0) {
            result = result * x % p;
        }
    }
    return result;
}

static long inverse(long x, long p) {
    return power(x, p - 2, p);
}

public class Hash {
    private long len;
    private long mod1 = (long)1e9 + 7, mod2 = (long)1e9 + 9;
    private long p1 = 31, p2 = 37;
    private long[] hash1, hash2;
    private Pair hashPair;

    public List<long> invPow1, invPow2;
    public long invSize = 1;
    
    public Hash() {}

    public Hash(string s) {
        len = s.Length;
        hash1 = new long[len];
        hash2 = new long[len];

        long h1 = 0, h2 = 0;
        long pow1 = 1, pow2 = 1;
        for(int i = 0; i < len; i++) {
            h1 = (h1 + (s[i] - 'a' + 1) * pow1) % mod1;
            h2 = (h2 + (s[i] - 'a' + 1) * pow2) % mod2;
            pow1 = (pow1 * p1) % mod1;
            pow2 = (pow2 * p2) % mod2;
            hash1[i] = h1;
            hash2[i] = h2;
        }
        hashPair = new Pair(h1, h2);

        if(invSize < len) {
            while(invSize < len) {
                invSize <<= 1;
            }
            
            invPow1 = new List<long>(new long[invSize]);
            invPow2 = new List<long>(new long[invSize]);
            for(int i = 0; i < invSize; i++) {
                invPow1[i] = -1;
                invPow2[i] = -1;
            }

            invPow1[(int)invSize - 1] = inverse(power(p1, invSize - 1, mod1), mod1);
            invPow2[(int)invSize - 1] = inverse(power(p2, invSize - 1, mod2), mod2);
            
            for(int i = (int)invSize - 2; i >= 0 && invPow1[i] == -1; i--) {
                invPow1[i] = (invPow1[i + 1] * p1) % mod1;
                invPow2[i] = (invPow2[i + 1] * p2) % mod2;
            }
        }
    }

    public long size() {
        return len;
    }

    public Pair prefix(int index) {
        return new Pair(hash1[index], hash2[index]);
    }

    public Pair substr(int l, int r) {
        if(l == 0) {
            return new Pair(hash1[r], hash2[r]);
        }
        long temp1 = hash1[r] - hash1[l - 1];
        long temp2 = hash2[r] - hash2[l - 1];
        temp1 += (temp1 < 0 ? mod1 : 0);
        temp2 += (temp2 < 0 ? mod2 : 0);
        temp1 = (temp1 * invPow1[l]) % mod1;
        temp2 = (temp2 * invPow2[l]) % mod2;
        return new Pair(temp1, temp2);
    }

    public bool equals(Hash other) {
        return (hashPair.Equals(other.hashPair));
    }
}

static long query(List<Hash> hashes, long n, Pair query) {
    int i = (int)query.first, j = (int)query.second;
    i--; j--;
    int lb = 0;
    int ub = (int)Math.Min(hashes[i].size(), hashes[j].size()) - 1;
    int maxLength = 0;
    while(lb <= ub) {
        int mid = (lb + ub) >> 1;
        if(hashes[i].prefix(mid).Equals(hashes[j].prefix(mid))) {
            if(mid + 1 > maxLength) {
                maxLength = mid + 1;
            }
            lb = mid + 1;
        }
        else {
            ub = mid - 1;
        }
    }
    return maxLength;
}

public static void Main(string[] args) {
    long n = 5, q = 4;
    string[] strs = {"geeksforgeeks", "geeks", "hell", "geeksforpeaks", "hello"};
    List<Hash> hashes = new List<Hash>();
    for(int i = 0; i < n; i++) {
        hashes.Add(new Hash(strs[i]));
    }
    List<Pair> queries = new List<Pair>() { new Pair(1, 2), new Pair(1, 3), new Pair(3, 5), new Pair(1, 4) };
    for(int i = 0; i < q; i++) {
        Console.Write(query(hashes, n, queries[i]) + " ");
    }
}

}

JavaScript

function power(x, y, p) {
    // BigInt keeps the products exact; Number loses precision above 2^53
    let base = BigInt(x) % BigInt(p);
    let exp = BigInt(y);
    const mod = BigInt(p);
    let result = 1n;
    for(; exp !== 0n; exp >>= 1n, base = base * base % mod) {
        if(exp & 1n) {
            result = result * base % mod;
        }
    }
    return Number(result);
}

function inverse(x, p) {
    return power(x, p - 2, p);
}

class Pair {
    constructor(first, second) {
        this.first = first;
        this.second = second;
    }
    equals(other) {
        return this.first === other.first && this.second === other.second;
    }
}

class Hash {
    constructor(s) {
        if (s === undefined) return;
        this.len = s.length;
        this.mod1 = 1e9 + 7;
        this.mod2 = 1e9 + 9;
        this.p1 = 31;
        this.p2 = 37;
        this.hash1 = new Array(this.len).fill(0);
        this.hash2 = new Array(this.len).fill(0);

        let h1 = 0, h2 = 0;
        let pow1 = 1, pow2 = 1;
        for(let i = 0; i < this.len; i++) {
            h1 = (h1 + (s.charCodeAt(i) - 'a'.charCodeAt(0) + 1) * pow1) % this.mod1;
            h2 = (h2 + (s.charCodeAt(i) - 'a'.charCodeAt(0) + 1) * pow2) % this.mod2;
            pow1 = (pow1 * this.p1) % this.mod1;
            pow2 = (pow2 * this.p2) % this.mod2;
            this.hash1[i] = h1;
            this.hash2[i] = h2;
        }
        this.hashPair = new Pair(h1, h2);

        this.invSize = 1;
        if(this.invSize < this.len) {
            while(this.invSize < this.len) {
                this.invSize <<= 1;
            }

            this.invPow1 = new Array(this.invSize).fill(-1);
            this.invPow2 = new Array(this.invSize).fill(-1);

            this.invPow1[this.invSize - 1] = inverse(power(this.p1, this.invSize - 1, this.mod1), this.mod1);
            this.invPow2[this.invSize - 1] = inverse(power(this.p2, this.invSize - 1, this.mod2), this.mod2);

            for(let i = this.invSize - 2; i >= 0 && this.invPow1[i] === -1; i--) {
                this.invPow1[i] = (this.invPow1[i + 1] * this.p1) % this.mod1;
                this.invPow2[i] = (this.invPow2[i + 1] * this.p2) % this.mod2;
            }
        }
    }

    size() {
        return this.len;
    }

    prefix(index) {
        return new Pair(this.hash1[index], this.hash2[index]);
    }

    substr(l, r) {
        if(l === 0) {
            return new Pair(this.hash1[r], this.hash2[r]);
        }
        let temp1 = this.hash1[r] - this.hash1[l - 1];
        let temp2 = this.hash2[r] - this.hash2[l - 1];
        temp1 += (temp1 < 0 ? this.mod1 : 0);
        temp2 += (temp2 < 0 ? this.mod2 : 0);
        // The product can reach about 1e18, so multiply as BigInt
        temp1 = Number(BigInt(temp1) * BigInt(this.invPow1[l]) % BigInt(this.mod1));
        temp2 = Number(BigInt(temp2) * BigInt(this.invPow2[l]) % BigInt(this.mod2));
        return new Pair(temp1, temp2);
    }
}

function query(hashes, n, query) {
    let i = query.first, j = query.second;
    i--; j--;
    let lb = 0;
    // The last valid prefix index is (shorter length - 1)
    let ub = Math.min(hashes[i].size(), hashes[j].size()) - 1;
    let maxLength = 0;
    while(lb <= ub) {
        let mid = (lb + ub) >> 1;
        if(hashes[i].prefix(mid).equals(hashes[j].prefix(mid))) {
            if(mid + 1 > maxLength) {
                maxLength = mid + 1;
            }
            lb = mid + 1;
        }
        else {
            ub = mid - 1;
        }
    }
    return maxLength;
}

function main() {
    let n = 5, q = 4;
    let strs = ["geeksforgeeks", "geeks", "hell", "geeksforpeaks", "hello"];
    let hashes = [];
    for(let i = 0; i < n; i++) {
        hashes.push(new Hash(strs[i]));
    }
    let queries = [new Pair(1, 2), new Pair(1, 3), new Pair(3, 5), new Pair(1, 4)];
    let output = "";
    for(let i = 0; i < q; i++) {
        output += query(hashes, n, queries[i]) + " ";
    }
    console.log(output);
}

main();

`
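The `substr` method in the implementations above is defined but never called by the driver code, which only compares prefixes. As a rough sketch of how a substring hash could be used, the Python snippet below checks whether two substrings of one string are equal. The helper names `build_prefix_hashes` and `substring_hash` are hypothetical, and for brevity the sketch uses a single modulus (the same 31 and 1e9 + 7 as `p1` and `mod1`) instead of the hash pair, and computes the inverse power on demand rather than precomputing a table.

```python
MOD = 10**9 + 7   # same value as mod1 in the code above
P = 31            # same value as p1 in the code above


def build_prefix_hashes(s):
    """Return a list where prefix[i] is the hash of s[0..i] (sum of s[k] * P^k)."""
    prefix = []
    h, pw = 0, 1
    for ch in s:
        h = (h + (ord(ch) - ord('a') + 1) * pw) % MOD
        pw = pw * P % MOD
        prefix.append(h)
    return prefix


def substring_hash(prefix, l, r):
    """Hash of s[l..r] (inclusive), normalized as if the substring started at index 0."""
    if l == 0:
        return prefix[r]
    diff = (prefix[r] - prefix[l - 1]) % MOD
    # Divide by P^l using the modular inverse (Fermat's little theorem, MOD is prime)
    inv = pow(pow(P, l, MOD), MOD - 2, MOD)
    return diff * inv % MOD


if __name__ == "__main__":
    s = "geeksforgeeks"
    pre = build_prefix_hashes(s)
    # s[0..4] and s[8..12] are both "geeks"
    print(substring_hash(pre, 0, 4) == substring_hash(pre, 8, 12))
    # s[3..7] is "ksfor"
    print(substring_hash(pre, 0, 4) == substring_hash(pre, 3, 7))
```

Because each substring hash is divided by `P^l`, hashes taken at different offsets become directly comparable. The first comparison prints `True` and the second prints `False`. With a single modulus, equal hashes only suggest equal substrings, which is why the implementations above keep two independent hashes.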