Subset Sum is NP Complete

Last Updated : 23 Jul, 2025

**Prerequisite:** NP-Completeness, Subset Sum Problem

**Subset Sum Problem:** Given **N** non-negative integers **a_1, ..., a_N** and a target sum **K**, the task is to decide whether there is a subset whose sum is equal to **K**.

**Explanation:** An instance of a problem is an input to that problem. An instance of the Subset Sum problem is a set S = {a_1, ..., a_N} and an integer **K**. Since an NP-Complete problem is one that is both in **NP** and **NP-Hard**, a proof that a problem is NP-Complete consists of two parts:

1. The problem itself is in the class NP.

2. Every problem in NP is polynomial-time reducible to it. (That B is polynomial-time reducible to C is denoted B ≤_P C.)

If only the **second** condition is satisfied, the problem is called **NP-Hard**.

It is not practical to reduce every problem in NP to the target problem one by one. Instead, to show that a problem C is NP-Complete, it suffices to show that C is in NP and that some known NP-Complete problem B is reducible to it: if B is NP-Complete and B ≤_P C for C in NP, then C is NP-Complete, because polynomial-time reductions compose. Thus, we can verify that the **Subset Sum Problem** is NP-Complete using the following two propositions:

**Subset Sum is in NP:**
A problem is in NP if, given an instance of the problem (here a set **S** of integers **a_1, ..., a_N** and an integer **K**) and a certificate (a proposed solution, here a subset **S'** of **S**), we can verify in polynomial time whether the certificate is correct. For Subset Sum this is done by checking that every element of **S'** belongs to **S** and that the integers in **S'** sum to **K**, which takes time polynomial in the size of the input.
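The polynomial-time check described above can be sketched in Python as follows. This is a minimal illustration rather than part of the original proof: the function name `verify_subset_sum` and the choice to encode the certificate as a list of indices into the input list are assumptions made for this example.

```python
def verify_subset_sum(numbers, target, certificate):
    """Check a Subset Sum certificate.

    `numbers` is the instance (a list of non-negative integers),
    `target` is K, and `certificate` is a list of indices into
    `numbers` naming the chosen subset (an assumed encoding).
    """
    # Indices must be distinct, otherwise an element is used twice.
    if len(set(certificate)) != len(certificate):
        return False
    # Indices must point at real elements of the instance.
    if any(not (0 <= i < len(numbers)) for i in certificate):
        return False
    # A single pass over the certificate computes the sum.
    return sum(numbers[i] for i in certificate) == target
```

For instance, `verify_subset_sum([3, 34, 4, 12, 5, 2], 9, [2, 4])` returns `True` because 4 + 5 = 9. Every step is a single pass over the certificate, so the running time is polynomial in the input size.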

**Subset Sum is NP-Hard:**
To prove that Subset Sum is NP-Hard, we reduce a known NP-Hard problem to it.
We reduce the **Vertex Cover Problem** (does a graph have a vertex cover of size k?) to the **Subset Sum** problem. Consider a graph G(V, E) where V = {1, 2, ..., N}. For every vertex i we define an integer **a_i**, and for every edge (i, j) we define an integer **b_ij**.
We represent these integers as the rows of a matrix, where each row is the base-4 representation of the corresponding integer, written with |E|+1 digits.
The matrix has the following properties:

The first (most significant) column contains 1 for every **a_i** and 0 for every **b_ij**.
Each of the remaining |E| columns, counted from the right side of the matrix, corresponds to one edge. The column for edge (i, j) contains 1 in the rows **a_i**, **a_j** and **b_ij**, and 0 in every other row.
We define a constant k' such that,

$$k' = k \cdot 4^{|E|} + \sum_{i=0}^{|E|-1} 2 \cdot 4^{i}$$
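The construction of the integers and of k' can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the function name, the use of dictionaries, and the rule that the c-th edge of the input list (counting from 0) owns digit position |E| - 1 - c are choices made here. The text only requires that each edge owns one of the |E| low-order base-4 digits.

```python
def reduce_vertex_cover_to_subset_sum(num_vertices, edges, k):
    """Map a Vertex Cover instance to Subset Sum integers.

    Vertices are 1..num_vertices and `edges` is a list of pairs (i, j).
    Returns (a, b, target): a[i] per vertex, b[(i, j)] per edge, and k'.
    """
    m = len(edges)
    lead = 4 ** m  # value of the most significant (first) column

    # Every a_i has a 1 in the leading column.
    a = {v: lead for v in range(1, num_vertices + 1)}
    b = {}
    for c, (i, j) in enumerate(edges):
        digit = 4 ** (m - 1 - c)  # assumed column order: first edge leftmost
        a[i] += digit             # 1 in column (i, j) for a_i
        a[j] += digit             # 1 in column (i, j) for a_j
        b[(i, j)] = digit         # 1 in column (i, j) for b_ij, 0 elsewhere

    # k' = k * 4^|E| + sum of 2 * 4^d over the |E| edge digits.
    target = k * lead + sum(2 * 4 ** d for d in range(m))
    return a, b, target
```

As a small usage example, the path 1-2-3 with `edges = [(1, 2), (2, 3)]` and `k = 1` gives a_1 = 20, a_2 = 21, a_3 = 17, b_12 = 4, b_23 = 1 and k' = 26. The cover {2} corresponds to a_2 + b_12 + b_23 = 26.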

Now, the following propositions hold:

Suppose there are subsets V' ⊆ V of vertices and E' ⊆ E of edges such that

$$\sum_{i \in V'} a_{i} + \sum_{(i, j) \in E'} b_{ij} = k'$$

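Each edge column contains exactly three 1s in the whole matrix (in the rows **a_i**, **a_j** and **b_ij**), so the selected digits in any edge column sum to at most 3 and a carry can never occur in base 4. The parameter k' has a 2 in each of the |E| least significant digits, while **b_ij** contributes at most 1 to column (i, j). Hence at least one of **a_i**, **a_j** must be selected, which means that for every edge (i, j), V' contains i or j. Because no carry reaches the leading position, exactly k of the **a_i** are selected. Therefore, V' is a vertex cover of size k.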
Conversely, assume there is a vertex cover V' of size k. Choose every integer **a_i** such that i lies in V', and every **b_ij** such that exactly one of i or j is in V'. Each edge column then sums to exactly 2 (either both endpoints are selected, or one endpoint together with **b_ij**), and the leading column sums to k. Adding the chosen integers in base 4 therefore gives exactly k', so the chosen integers form a subset with sum k' and the Subset Sum instance has a solution.
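Both directions of the argument can be sanity-checked by brute force on very small graphs. The sketch below is not from the original article: the helper names are hypothetical, the edge-to-digit order is an arbitrary choice, and the exhaustive searches are exponential, so this is only meant for tiny inputs.

```python
from itertools import combinations

def build_instance(n, edges, k):
    """Return (numbers, target) for vertices 1..n; a_i first, then b_ij."""
    m = len(edges)
    numbers = [
        4 ** m + sum(4 ** (m - 1 - c) for c, e in enumerate(edges) if v in e)
        for v in range(1, n + 1)
    ]
    numbers += [4 ** (m - 1 - c) for c in range(m)]
    target = k * 4 ** m + sum(2 * 4 ** d for d in range(m))
    return numbers, target

def has_cover_of_size(n, edges, k):
    """Brute force: is there a vertex cover with exactly k vertices?"""
    return any(
        all(i in cover or j in cover for i, j in edges)
        for cover in map(set, combinations(range(1, n + 1), k))
    )

def has_subset_with_sum(numbers, target):
    """Brute force: does any subset of `numbers` sum to `target`?"""
    return any(
        sum(subset) == target
        for r in range(len(numbers) + 1)
        for subset in combinations(numbers, r)
    )

def check_equivalence(n, edges):
    """Compare both problems for every k from 0 to n."""
    return all(
        has_cover_of_size(n, edges, k)
        == has_subset_with_sum(*build_instance(n, edges, k))
        for k in range(n + 1)
    )
```

Calling `check_equivalence(3, [(1, 2), (2, 3), (1, 3)])` compares the two answers on a triangle for every k. Note that the leading digit forces covers of exactly k vertices, which is why `has_cover_of_size` tests for exact size.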

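Let us consider the following example.
Take the graph with vertices {1, 2, 3, 4} and edges (1, 2), (2, 3), (3, 4), (1, 4), which are the edges whose integers b12, b23, b34 and b14 appear in the computation that follows. A vertex cover is V' = {1, 3}, so k = 2.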

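The vertices 1, 2, 3, 4 correspond to the integers a1, a2, a3, a4.

The matrix can be constructed in the following way: each row has |E| + 1 = 5 base-4 digits, the leading digit first and then one digit per edge. Taking the edge columns in the order (1, 2), (2, 3), (3, 4), (1, 4), which is the order consistent with the decimal values used in this example, the rows are:

a1 = 11001, a2 = 11100, a3 = 10110, a4 = 10011, b12 = 01000, b23 = 00100, b34 = 00010, b14 = 00001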

With k = 2 and |E| = 4, the target is

$$k' = k \cdot 4^{4} + \sum_{i=0}^{3} 2 \cdot 4^{i} = 2 \cdot 4^{4} + 2 \cdot 4^{0} + 2 \cdot 4^{1} + 2 \cdot 4^{2} + 2 \cdot 4^{3} = 2(256 + 1 + 4 + 16 + 64) = 682$$

Now, to verify the value of k', choose every **a_i** such that i lies in V', that is a1 and a3, and every **b_ij** such that exactly one of i or j lies in V', that is b12, b14, b23 and b34. Converting the corresponding base-4 rows of the matrix to decimal gives the following values:

a1 = 321, a3 = 276, b12 = 64, b23 = 16, b14 = 1, b34 = 4

These values are computed using the matrix. On summation of these values, we get,

k' = 321 + 276 + 64 + 16 + 1 + 4 = 682.

Hence, the chosen integers sum to exactly k', which verifies the construction on this example.
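The arithmetic in this example can be re-checked mechanically. In the sketch below, the digit strings are the base-4 forms of the decimal values listed above (for instance 321 = 4^4 + 4^3 + 4^0, which is 11001 in base 4). The variable names and the helper `to_base4` are illustrative only.

```python
# Base-4 rows of the chosen integers, leading digit first.
rows = {
    "a1": "11001", "a3": "10110",
    "b12": "01000", "b23": "00100", "b34": "00010", "b14": "00001",
}

# int(text, 4) parses a string of base-4 digits into an integer.
values = {name: int(digits, 4) for name, digits in rows.items()}

k, num_edges = 2, 4
k_prime = k * 4 ** num_edges + sum(2 * 4 ** i for i in range(num_edges))

def to_base4(n):
    """Return the base-4 digit string of a positive integer."""
    digits = ""
    while n > 0:
        digits = str(n % 4) + digits
        n //= 4
    return digits

total = sum(values.values())
print(values)
print(total, k_prime, to_base4(k_prime))  # 682 682 22222
```

The base-4 form 22222 of k' shows the leading digit k = 2 followed by a 2 in each of the four edge columns, matching the column-by-column argument in the proof.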

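The reduction creates N + |E| integers of |E| + 1 base-4 digits each, so it runs in polynomial time. Therefore, the **Subset Sum Problem** is NP-Complete.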