QR Decomposition in Machine Learning

Last Updated : 23 Jul, 2025

**QR decomposition** is a way of expressing a matrix as the product of two matrices: **Q** (an orthogonal matrix) and **R** (an upper triangular matrix). In this article, I will explain decomposition in linear algebra, focusing on QR decomposition among the many decompositions available.

What is QR Decomposition?

Decomposition, or factorization, means breaking a single original entity into several simpler entities that are easier to work with. Matrix decompositions have applications in numerical linear algebra, optimization, solving systems of linear equations, etc. QR decomposition in particular is a versatile tool that finds applications in solving linear systems, least squares problems, eigenvalue computations, etc. Its numerical stability and efficiency make it a valuable technique in a range of applications.

**QR decomposition**, also known as QR factorization, is a fundamental matrix decomposition technique in linear algebra that decomposes a matrix into the product of an orthogonal matrix (Q) and an upper triangular matrix (R). Given a matrix A (m x n), where m is the number of rows and n is the number of columns, the QR decomposition can be expressed as:

A = QR

QR Decomposition in Machine learning

QR decomposition finds widespread use in machine learning for tasks like solving linear regression, eigenvalue problems, Gram-Schmidt orthogonalization, handling overdetermined systems, matrix inversion, Gram matrix factorization, and enhancing numerical stability in various algorithms. More details are given in the applications section. The key concepts behind it are:

  1. **Matrix Factorization:** Matrix factorization involves expressing a matrix as the product of two or more matrices. In QR decomposition, we express a given matrix A as the product of an orthogonal matrix Q and an upper triangular matrix R.
  2. **Orthogonal Matrix:** An orthogonal matrix Q has the property that its transpose is equal to its inverse (Q^T Q = I, where I is the identity matrix).
    • Properties: Orthogonal matrices preserve the length of vectors and the dot product. They play a crucial role in QR decomposition.
  3. **Upper Triangular Matrix:** A matrix is upper triangular if all entries below the main diagonal are zero. In QR decomposition, R is an upper triangular matrix.
  4. **Gram-Schmidt Process (Orthogonalization Process):** The Gram-Schmidt process is used to orthogonalize a set of vectors. In the context of QR decomposition, it is applied to the columns of the original matrix to construct an orthogonal matrix Q.
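The properties listed above can be checked numerically. The following is a minimal sketch, assuming NumPy is available and using `np.linalg.qr` to obtain the factors; the random 4 x 4 test matrix, the seed, and the vectors `x` and `y` are illustrative choices, not part of the original article.

```python
import numpy as np

rng = np.random.default_rng(0)    # fixed seed so the sketch is repeatable
A = rng.standard_normal((4, 4))   # arbitrary square test matrix (assumption)

Q, R = np.linalg.qr(A)

# Orthogonality: Q^T Q should equal the identity up to rounding error.
print(np.allclose(Q.T @ Q, np.eye(4)))

# Length preservation: ||Q x|| equals ||x||.
x = rng.standard_normal(4)
y = rng.standard_normal(4)
print(np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x)))

# Dot-product preservation: (Q x) . (Q y) equals x . y.
print(np.isclose((Q @ x) @ (Q @ y), x @ y))

# Upper triangular: zeroing everything below the diagonal changes nothing.
print(np.allclose(R, np.triu(R)))
```

Each check should print `True`, since the comparisons allow for floating-point rounding.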

Compute QR decomposition:

Gram-Schmidt Orthogonalization

The Gram-Schmidt process is often used to orthogonalize the columns of the matrix A. It produces an orthogonal matrix Q.

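Given an m x n matrix A,
A = \begin{bmatrix} a_{11} & \cdots & a_{1n}\\ \vdots & \ddots & \vdots\\ a_{m1} & \cdots & a_{mn} \end{bmatrix} = [a_1, a_2, \ldots, a_n],
where a_i is the i-th column of A, the Gram-Schmidt process turns the columns a_1, ..., a_n into orthonormal vectors q_1, ..., q_n that form the columns of Q.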

Triangularization

Once Q is obtained, the upper triangular matrix R is obtained by multiplying Q^T with the original matrix A.

R = Q^T A

The orthogonal matrix Q is used to triangularize the original matrix A, resulting in an upper triangular matrix R.
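The relation R = Q^T A can be verified with a few lines of NumPy. This is only a sketch under stated assumptions: Q is taken from `np.linalg.qr` rather than computed by hand, and the 3 x 3 matrix below is an arbitrary example chosen for illustration.

```python
import numpy as np

# Arbitrary invertible example matrix (an assumption for illustration).
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])

Q, R_numpy = np.linalg.qr(A)

# Because Q^T Q = I, multiplying A = QR on the left by Q^T gives R = Q^T A.
R = Q.T @ A

# R matches the factor returned by NumPy ...
print(np.allclose(R, R_numpy))

# ... and everything strictly below its diagonal is zero (up to rounding).
print(np.allclose(np.tril(R, k=-1), 0.0))
```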

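Result:

A = QR,

Here, Q is an orthogonal matrix and R is an upper triangular matrix.

Orthogonal Matrix Property:

Q^T Q = I

Here, Q^T is the transpose of Q and I is the identity matrix.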

Step by step Implementations

Using Gram-Schmidt Process:

First, normalize the first column a_1:

q_1 = \frac{a_1}{\|a_1\|}

Here, \|a_1\| denotes the norm of a_1.

Then, we project a_2 onto q_1:

a_2 = \langle a_2, q_1 \rangle q_1 + q_2' \\ q_2' = a_2 - \langle a_2, q_1 \rangle q_1

Here, \langle a_2, q_1 \rangle q_1 is the projection of a_2 onto q_1, and q_2' is the residual of that projection, which is orthogonal to q_1.

After this projection, we normalize the residual:

q_2 = \frac{q_2'}{\|q_2'\|}

Then, we project a_3 onto q_1 and q_2:

a_3 = \langle a_3, q_1 \rangle q_1 + \langle a_3, q_2 \rangle q_2 + q_3' \\ q_3' = a_3 - \langle a_3, q_1 \rangle q_1 - \langle a_3, q_2 \rangle q_2

Here, q_3' is the residual of a_3 after its components along q_1 and q_2 have been removed.

We repeatedly alternate between projection steps, where a_l is projected onto q_1, q_2, ..., q_{l-1}, and normalization steps, where the projection residual is divided by its norm, until a set of orthonormal vectors q_1, ..., q_L is obtained.

Residuals are expressed in terms of normalized vectors as:

q_l' = \|q_l'\| q_l

for l = 1, ..., L, where we define q_1' = a_1.

Therefore, we can write the projections as:

a_l = \langle a_l, q_1 \rangle q_1 + \cdots + \langle a_l, q_{l-1} \rangle q_{l-1} + \|q_l'\| q_l

Then, we form a matrix using the orthonormal vectors:

Q = [q_1 ... q_L]

To compute the matrix R, we form an upper triangular square matrix from the residual norms and the inner products:

R = \begin{bmatrix} \|q_1'\| & \langle a_2, q_1 \rangle & \langle a_3, q_1 \rangle & \cdots & \langle a_L, q_1 \rangle \\ 0 & \|q_2'\| & \langle a_3, q_2 \rangle & \cdots & \langle a_L, q_2 \rangle \\ 0 & 0 & \|q_3'\| & & \vdots \\ \vdots & \vdots & & \ddots & \langle a_L, q_{L-1} \rangle \\ 0 & 0 & \cdots & 0 & \|q_L'\| \end{bmatrix}

Multiplying Q and R recovers the original matrix A.
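The projection and normalization steps above can be sketched in Python as follows. This is a minimal illustration of classical Gram-Schmidt, assuming the columns of A are linearly independent; the function name `gram_schmidt_qr` and the small test matrix are hypothetical choices made for this example. In practice, library routines such as `np.linalg.qr` are generally preferred because classical Gram-Schmidt can lose orthogonality in floating-point arithmetic.

```python
import numpy as np

def gram_schmidt_qr(A):
    """Classical Gram-Schmidt QR for a matrix with linearly independent columns."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for j in range(n):                       # j plays the role of l in the text
        residual = A[:, j].copy()
        for i in range(j):
            R[i, j] = Q[:, i] @ A[:, j]      # inner product <a_j, q_i>
            residual -= R[i, j] * Q[:, i]    # remove the projection onto q_i
        R[j, j] = np.linalg.norm(residual)   # norm of the residual q_j'
        if np.isclose(R[j, j], 0.0):
            raise ValueError("columns of A are linearly dependent")
        Q[:, j] = residual / R[j, j]         # normalize the residual to get q_j
    return Q, R

# Arbitrary example matrix with independent columns (an assumption).
A = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1]])
Q, R = gram_schmidt_qr(A)

print(np.allclose(Q.T @ Q, np.eye(3)))   # columns of Q are orthonormal
print(np.allclose(R, np.triu(R)))        # R is upper triangular
print(np.allclose(Q @ R, A))             # the product recovers A
```

The diagonal of `R` holds the residual norms and the entries above it hold the inner products, mirroring the matrix R written out above.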

QR Decomposition using Python

```python
import numpy as np

# Create a NumPy array
arr = np.array([[1, 2, 4], [0, 0, 5], [0, 3, 6]])
print(arr)

# Find the QR factors of the array
q, r = np.linalg.qr(arr)
print('\nQ:\n', q)
print('\nR:\n', r)

# Check that the product of Q and R reproduces the original array
print(np.allclose(arr, np.dot(q, r)))
```

**Output:**

[[1 2 4]
[0 0 5]
[0 3 6]]
Q:
[[ 1. 0. 0.]
[ 0. 0. -1.]
[ 0. -1. 0.]]
R:
[[ 1. 2. 4.]
[ 0. -3. -6.]
[ 0. 0. -5.]]
True

Mathematical explanation

Let's understand the QR decomposition process by working through an example.

Suppose we are provided with the matrix A:

A = \begin{bmatrix} 1 & 2 & 4 \\ 0 & 0 & 5 \\ 0 & 3 & 6 \end{bmatrix}

As mentioned in the steps before, we will be using Gram-Schmidt Orthogonalization.

We will be finding orthogonal components q1 , q2 and q3 :

First, perform normalization and we get the first normalized vector:

q_1 = \frac{a_1}{\|a_1\|} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}

The norm of the first column is calculated as:

||a_1|| = \sqrt{1^2 +0^2 +0^2} = 1

The inner product between a_2 and q_1 is \langle a_2, q_1 \rangle = q_1^T a_2 = 2. The projection of the second column onto q_1 is q_1 multiplied by this inner product.

q_2' is the residual of the projection:

q_2' = a_2 - \langle a_2, q_1 \rangle q_1 = \begin{bmatrix} 2 \\ 0 \\ 3 \end{bmatrix} - 2 \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 3 \end{bmatrix}

Now, we will normalize the residual:

q_2 = \frac{q_2'}{\|q_2'\|} = \frac{q_2'}{3} = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}

Now, we will project a_3 onto q_1 and q_2:

q_3' = a_3 - \langle a_3, q_1 \rangle q_1 - \langle a_3, q_2 \rangle q_2 = \begin{bmatrix} 4 \\ 5 \\ 6 \end{bmatrix} - 4 \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} - 6 \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 5 \\ 0 \end{bmatrix}

Now, we will normalize the residual:

q_3 = \frac{q_3'}{\|q_3'\|} = \frac{q_3'}{5} = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}

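We now have the matrix Q, whose columns are q_1, q_2 and q_3:

Q = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}

R is the upper triangular matrix whose diagonal holds the residual norms (1, 3, 5) and whose entries above the diagonal are the inner products \langle a_2, q_1 \rangle = 2, \langle a_3, q_1 \rangle = 4 and \langle a_3, q_2 \rangle = 6:

R = \begin{bmatrix} 1 & 2 & 4 \\ 0 & 3 & 6 \\ 0 & 0 & 5 \end{bmatrix}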
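As a sanity check on the hand calculation, the factors can be rebuilt in NumPy. This is a small sketch, assuming only that q_1, q_2 and q_3 are the vectors derived in the steps above; R is recomputed as Q^T A rather than typed in, so the check does not depend on copying the entries correctly.

```python
import numpy as np

A = np.array([[1, 2, 4],
              [0, 0, 5],
              [0, 3, 6]], dtype=float)

# Columns q1, q2, q3 obtained from the Gram-Schmidt steps.
q1 = [1, 0, 0]
q2 = [0, 0, 1]
q3 = [0, 1, 0]
Q = np.column_stack([q1, q2, q3]).astype(float)

# Since Q is orthogonal, R follows from R = Q^T A.
R = Q.T @ A
print(R)

print(np.allclose(R, np.triu(R)))   # R is upper triangular
print(np.allclose(Q @ R, A))        # Q R reproduces A
```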

The Q and R obtained by hand (Q = [q_1 q_2 q_3], so that A = QR) differ in sign from the values returned by the NumPy package. The reason is described below.

Reason for the difference between the NumPy results and our calculation:

The QR decomposition is unique only up to signs. You can flip the sign of any column of Q as long as you flip the sign of the corresponding row of R. Some implementations enforce positive diagonal entries in R, but this is just a convention. Since NumPy defers to LAPACK for these linear algebra operations, it follows LAPACK's conventions, which do not enforce such a requirement.
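If a positive diagonal in R is wanted, the signs can be fixed after the fact. The following is a hedged sketch rather than a NumPy feature: it assumes A is invertible (so no diagonal entry of R is zero), and the names `signs`, `Q_pos` and `R_pos` are introduced here purely for illustration.

```python
import numpy as np

A = np.array([[1, 2, 4],
              [0, 0, 5],
              [0, 3, 6]], dtype=float)
Q, R = np.linalg.qr(A)

# Sign (+1 or -1) of each diagonal entry of R; A is assumed invertible,
# so none of these entries is zero.
signs = np.sign(np.diag(R))

# Flip column j of Q and row j of R together. With S = diag(signs),
# (Q S)(S R) = Q R because S S = I, so the product is unchanged.
Q_pos = Q * signs             # broadcasting scales the columns of Q
R_pos = signs[:, None] * R    # broadcasting scales the rows of R

print(Q_pos)
print(R_pos)
print(np.all(np.diag(R_pos) > 0))      # diagonal of R is now positive
print(np.allclose(Q_pos @ R_pos, A))   # the factorization is still valid
```

With the signs normalized this way, the factors agree with the Gram-Schmidt calculation, because the QR factorization of an invertible matrix with a positive diagonal in R is unique.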

Applications:

QR decomposition has many applications in linear algebra and machine learning, including the least squares method, linear regression, PCA, eigenvalue problems, and model regularization.
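As one illustration of the least squares use case, a linear regression can be solved through QR. This is a minimal sketch under stated assumptions: the data points are made up so that they lie exactly on y = 1 + 2x, and the names `X`, `y` and `beta` are illustrative. Minimizing ||X b - y|| with X = QR reduces to the triangular system R b = Q^T y.

```python
import numpy as np

# Hypothetical data lying exactly on the line y = 1 + 2x.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])

# Design matrix with an intercept column and a slope column.
X = np.column_stack([np.ones_like(x), x])

# Reduced QR (NumPy's default mode): Q is 4 x 2, R is 2 x 2.
Q, R = np.linalg.qr(X)

# Solve R beta = Q^T y. np.linalg.solve is a general solver; a dedicated
# triangular solver could exploit the structure of R instead.
beta = np.linalg.solve(R, Q.T @ y)
print(beta)   # intercept and slope, approximately 1 and 2

# Cross-check against NumPy's built-in least squares routine.
print(np.allclose(beta, np.linalg.lstsq(X, y, rcond=None)[0]))
```

Working with R instead of the normal equations X^T X b = X^T y avoids forming X^T X, which is the usual reason QR is regarded as the more numerically stable route.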

Advantages

Disadvantage: