Gram-Schmidt Process for ML

Last Updated : 23 Jul, 2025

The Gram-Schmidt process converts a set of linearly independent vectors into an orthonormal basis: a set of mutually orthogonal vectors, each normalized to unit length, that spans the same subspace as the original set.

The process is useful in machine learning because working with an orthonormal basis can improve numerical stability, simplify calculations and make computation more efficient.

Orthogonality and Normalization

There are two basic concepts that must be understood before moving on to the Gram-Schmidt process: orthogonality and normalization.

  1. **Orthogonality:** Two vectors are orthogonal if their dot product equals zero. For nonzero vectors, this means they are at 90 degrees to one another. In machine learning, orthogonal vectors are convenient to work with because they make matrix operations and computation more stable.
  2. **Normalization:** A vector is normalized if its magnitude (length) equals one. This is achieved by dividing every element of the vector by the vector's magnitude. Normalization keeps differences in scale from affecting the data and helps stabilize learning algorithms.

The Gram-Schmidt process combines the two to produce a set of orthonormal vectors, that is, vectors that are both mutually orthogonal and normalized.
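The two definitions above can be checked numerically. The following is a minimal sketch in plain Python; the helper names `dot`, `norm` and `normalize` are chosen here for illustration and do not come from any particular library.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Euclidean length (magnitude) of a vector."""
    return math.sqrt(dot(a, a))

def normalize(a):
    """Scale a vector to unit length (assumes it is not the zero vector)."""
    length = norm(a)
    return [x / length for x in a]

a = [1.0, 2.0]
b = [-2.0, 1.0]
print(dot(a, b))   # 0.0, so a and b are orthogonal

unit = normalize([3.0, 4.0])
print(unit)        # [0.6, 0.8]
print(norm(unit))  # approximately 1.0, up to floating-point rounding
```

A set of vectors is orthonormal when every pair passes the first check and every vector passes the second.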

The Gram-Schmidt Process Step by Step

The Gram-Schmidt process accepts a set of linearly independent vectors and converts them into an orthonormal set.

Assume we have a set of vectors:

\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, ..., \mathbf{v}_n

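We need to convert them into a set of mutually orthogonal vectors:

\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3, ..., \mathbf{u}_n

Normalizing each \mathbf{u}_k then gives the orthonormal set:

\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3, ..., \mathbf{e}_n

The process works as follows: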

**Step 1: Select the First Vector**

The first orthogonal vector is just the first vector of the original set:

\mathbf{u}_1 = \mathbf{v}_1

To normalize it, divide it by its length:

\mathbf{e}_1 = \frac{\mathbf{u}_1}{\|\mathbf{u}_1\|}

This gives the first orthonormal vector.

**Step 2: Orthogonalize the Second Vector**

To find the second orthogonal vector, remove the component of \mathbf{v}_2 that lies in the direction of \mathbf{e}_1:

\mathbf{u}_2 = \mathbf{v}_2 - \text{proj}_{\mathbf{e}_1} (\mathbf{v}_2)

Here, the projection is given by:

\text{proj}_{\mathbf{e}_1} (\mathbf{v}_2) = \frac{\mathbf{v}_2 \cdot \mathbf{e}_1}{\mathbf{e}_1 \cdot \mathbf{e}_1} \mathbf{e}_1
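
Because \mathbf{e}_1 already has unit length, \mathbf{e}_1 \cdot \mathbf{e}_1 = 1, so the denominator can be dropped and the projection reduces to:

\text{proj}_{\mathbf{e}_1} (\mathbf{v}_2) = (\mathbf{v}_2 \cdot \mathbf{e}_1) \, \mathbf{e}_1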

After obtaining \mathbf{u}_2, normalize it:

\mathbf{e}_2 = \frac{\mathbf{u}_2}{\|\mathbf{u}_2\|}
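
As a small worked illustration of Steps 1 and 2, the sketch below applies them to two example vectors in plain Python. The vectors `v1` and `v2` and the helper names are assumptions made up for this example, not values from the article.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def normalize(a):
    """Scale a nonzero vector to unit length."""
    length = math.sqrt(dot(a, a))
    return [x / length for x in a]

# Example input vectors (hypothetical, chosen for easy arithmetic).
v1 = [3.0, 1.0]
v2 = [2.0, 2.0]

# Step 1: u1 = v1, then normalize to get e1.
e1 = normalize(v1)

# Step 2: subtract from v2 its projection onto e1.
# e1 has unit length, so the projection is (v2 . e1) e1.
coeff = dot(v2, e1)
u2 = [x - coeff * y for x, y in zip(v2, e1)]
e2 = normalize(u2)

print(u2)           # close to [-0.4, 1.2]
print(dot(e1, e2))  # close to 0, so e1 and e2 are orthogonal
```

By hand, the projection of `v2` onto `e1` is 0.8 times `v1`, that is [2.4, 0.8], which leaves [-0.4, 1.2] after subtraction.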

**Step 3: Make the Third Vector Orthogonal**

For the third vector, remove the components in the directions of both \mathbf{e}_1 and \mathbf{e}_2:

\mathbf{u}_3 = \mathbf{v}_3 - \text{proj}_{\mathbf{e}_1} (\mathbf{v}_3) - \text{proj}_{\mathbf{e}_2} (\mathbf{v}_3)

After obtaining \mathbf{u}_3, normalize it:

\mathbf{e}_3 = \frac{\mathbf{u}_3}{\|\mathbf{u}_3\|}

**Step 4: Repeat for All Vectors**

This process is repeated for all vectors in the original set. The general formula for any vector \mathbf{u}_k is:

\mathbf{u}_k = \mathbf{v}_k - \sum_{i=1}^{k-1} \text{proj}_{\mathbf{e}_i} (\mathbf{v}_k)

After obtaining \mathbf{u}_k, normalize it:

\mathbf{e}_k = \frac{\mathbf{u}_k}{\|\mathbf{u}_k\|}

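Each new \mathbf{e}_k is therefore orthogonal to all earlier ones and has unit length, so the resulting vectors \mathbf{e}_1, ..., \mathbf{e}_n form an orthonormal set.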
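
The general formula can be sketched as a short loop. This is a minimal illustration in plain Python rather than a production implementation: the function name `gram_schmidt`, the `tol` threshold used to detect linearly dependent input, and the example vectors are all assumptions introduced here.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def gram_schmidt(vectors, tol=1e-12):
    """Return orthonormal vectors e_1..e_n built from `vectors`.

    `tol` is a hypothetical cutoff: if u_k is shorter than this, the
    input is treated as linearly dependent and an error is raised.
    """
    basis = []
    for v in vectors:
        # u_k = v_k - sum over i < k of (v_k . e_i) e_i
        u = list(v)
        for e in basis:
            coeff = dot(v, e)
            u = [x - coeff * y for x, y in zip(u, e)]
        length = math.sqrt(dot(u, u))
        if length < tol:
            raise ValueError("vectors are not linearly independent")
        # e_k = u_k / ||u_k||
        basis.append([x / length for x in u])
    return basis

vs = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
es = gram_schmidt(vs)

# Check orthonormality: e_i . e_j should be 1 when i == j and 0 otherwise.
for i, a in enumerate(es):
    for j, b in enumerate(es):
        expected = 1.0 if i == j else 0.0
        assert abs(dot(a, b) - expected) < 1e-9
```

The guard on `length` reflects the requirement that the input vectors be linearly independent: a dependent vector leaves nothing after its projections are removed, so it cannot be normalized.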

Importance of Gram-Schmidt Process in Machine Learning

Machine learning algorithms tend to handle large datasets as matrices. Among the major reasons the Gram-Schmidt process is necessary are:

Applications of the Gram-Schmidt Process in Machine Learning

Limitations of the Gram-Schmidt Process

Despite its advantages, the Gram-Schmidt process has some limitations: