Restricted Boltzmann Machine

Last Updated : 2 Feb, 2026

A Boltzmann Machine is an unsupervised, generative, energy-based neural network in which every neuron is connected to every other neuron through bidirectional connections. Because of this full connectivity, training is computationally expensive and inefficient in practice.

Figure: Boltzmann Machine vs. Restricted Boltzmann Machine

A Restricted Boltzmann Machine (RBM) is a simplified version of the Boltzmann Machine designed to make training feasible. It consists of a visible layer and a hidden layer with no connections allowed between neurons within the same layer. RBMs learn latent features from unlabeled data and are widely used for representation learning and dimensionality reduction.

How RBM Works

A Restricted Boltzmann Machine (RBM) is a generative stochastic neural network consisting of two layers: a visible layer and a hidden layer. The term "restricted" means that there are no connections within a layer; connections exist only between visible and hidden units.

Figure: Parameter Learning vs. Sample Generation

The image shows the two phases of a Restricted Boltzmann Machine (RBM): parameter learning and sample generation.

RBM Architecture

Visible layer with n units:

v = (v_1, v_2, \dots, v_n)

Hidden layer with m units:

h = (h_1, h_2, \dots, h_m)

Energy Function

The RBM assigns an energy to each configuration of visible and hidden units:

E(v, h) = - \sum_{i=1}^{n} b_i v_i - \sum_{j=1}^{m} c_j h_j - \sum_{i=1}^{n} \sum_{j=1}^{m} v_i W_{ij} h_j

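where b_i is the bias of visible unit v_i, c_j is the bias of hidden unit h_j, and W_{ij} is the weight of the connection between v_i and h_j.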

Lower energy leads to higher probability.
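As a rough illustration, the energy of a single configuration can be evaluated directly with NumPy. The function name `rbm_energy` and the toy parameter values below are assumptions made for this sketch; `b`, `c` and `W` follow the symbols of the energy function.

```python
import numpy as np

def rbm_energy(v, h, W, b, c):
    """Energy E(v, h) = -b.v - c.h - v^T W h of one configuration."""
    return -(b @ v) - (c @ h) - (v @ W @ h)

# Toy RBM with 3 visible and 2 hidden units (illustrative values only).
W = np.array([[1.0, -1.0],
              [0.5,  0.0],
              [0.0,  2.0]])
b = np.array([0.1, 0.2, 0.3])   # visible biases
c = np.array([-0.5, 0.5])       # hidden biases

v = np.array([1.0, 0.0, 1.0])   # one visible configuration
h = np.array([1.0, 1.0])        # one hidden configuration

energy = rbm_energy(v, h, W, b, c)
print(energy)
```

Changing `v` or `h` and re-evaluating the function shows which configurations this toy model favors: the more negative the energy, the more probable the configuration.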

Learning Process of Restricted Boltzmann Machine

The learning process of an RBM aims to reduce the reconstruction error. This is achieved by iteratively updating the weights so that the reconstructed data becomes closer to the original data distribution.

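**1. Reconstruction Error**

The reconstruction error is defined by:

v^{(0)}-v^{(1)}

where v^{(0)} is the original input and v^{(1)} is its reconstruction after one forward and one backward pass.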

The goal of learning is to minimize this error over successive training iterations by adjusting the weights W.

**2. Forward Pass**

In the forward pass, we compute the probability of activating each hidden unit given the visible input v^{(0)}:

P(h_j = 1 \mid v^{(0)}) = \sigma\!\left( c_j + \sum_{i=1}^{n} W_{ij} v_i^{(0)} \right)

**3. Backward Pass**

In the backward pass, the RBM reconstructs the input using the hidden activations:

P(v_i = 1 \mid h) = \sigma\!\left( b_i + \sum_{j=1}^{m} W_{ij} h_j \right)

**4. Joint Probability Distribution (Gibbs Distribution)**

The joint probability of a visible–hidden configuration is:

P(v, h) = \frac{1}{Z} \exp\!\left(-E(v, h)\right)

where the partition function is:

Z = \sum_{v} \sum_{h} \exp\!\left(-E(v, h)\right)
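For a very small RBM the partition function can be evaluated by brute force, which makes the link between energy and probability concrete. The sketch below is only feasible for a handful of units, because the sum runs over 2^(n+m) configurations; the helper names and the toy weights are assumptions chosen for illustration.

```python
import itertools
import numpy as np

def energy(v, h, W, b, c):
    """E(v, h) = -b.v - c.h - v^T W h."""
    return -(b @ v) - (c @ h) - (v @ W @ h)

def joint_distribution(W, b, c):
    """Return ({(v, h): P(v, h)}, Z) over all binary configurations."""
    n, m = W.shape
    configs = [
        (v, h)
        for v in itertools.product([0, 1], repeat=n)
        for h in itertools.product([0, 1], repeat=m)
    ]
    unnormalized = np.array(
        [np.exp(-energy(np.array(v), np.array(h), W, b, c)) for v, h in configs]
    )
    Z = unnormalized.sum()  # partition function
    return dict(zip(configs, unnormalized / Z)), Z

# Toy RBM with 3 visible and 2 hidden units (illustrative values only).
W = np.array([[1.0, -0.5],
              [0.5,  0.0],
              [0.0,  2.0]])
b = np.zeros(3)
c = np.zeros(2)

probs, Z = joint_distribution(W, b, c)
most_probable = max(probs, key=probs.get)
print(most_probable, probs[most_probable])
```

The most probable configuration is the one with the lowest energy. For realistic layer sizes Z cannot be computed this way, which is why training relies on sampling instead.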

**5. Generative Learning Perspective**

An RBM learns to reconstruct its input rather than to predict a label or a target value, so it models the distribution of the data itself.

Hence, an RBM is a generative model, unlike the discriminative models used for classification.

**6. Error Minimization Using KL-Divergence**

The difference between the data distribution and the distribution defined by the model represents the learning error and is measured using the Kullback–Leibler (KL) divergence:

D_{KL}(p \parallel q) = \sum_{x} p(x) \log \frac{p(x)}{q(x)}

where p(x) is the true data distribution and q(x) is the distribution defined by the model.

KL-divergence measures how much information is lost when q(x) approximates p(x).
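A minimal sketch of this formula for two small discrete distributions is shown below. The function name `kl_divergence` and the toy distributions are assumptions for illustration, and the code assumes q(x) > 0 wherever p(x) > 0.

```python
import numpy as np

def kl_divergence(p, q):
    """D_KL(p || q) for two discrete distributions given as arrays."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0  # terms with p(x) = 0 contribute nothing to the sum
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

p = np.array([0.5, 0.5])   # stands in for the data distribution (toy values)
q = np.array([0.9, 0.1])   # stands in for the model distribution (toy values)

kl_pq = kl_divergence(p, q)
kl_qp = kl_divergence(q, p)
kl_pp = kl_divergence(p, p)
print(kl_pq, kl_qp, kl_pp)
```

The divergence is zero only when the two distributions coincide, and it is not symmetric: D_KL(p || q) and D_KL(q || p) generally differ.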

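**7. Weight Update Rule (Contrastive Divergence)**

To reduce the KL-divergence, the RBM updates its weights using:

\Delta W_{ij} = \eta \left( \langle v_i h_j \rangle_{\text{data}} - \langle v_i h_j \rangle_{\text{model}} \right)

where \eta is the learning rate, \langle v_i h_j \rangle_{\text{data}} is the expectation with the visible units clamped to the training data, and \langle v_i h_j \rangle_{\text{model}} is the expectation under the model, which Contrastive Divergence approximates with a few Gibbs sampling steps (one step for CD-1).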

Step By Step Implementation

In this code, we train a Restricted Boltzmann Machine (RBM) on binarized MNIST images to learn feature representations, then visualize the images reconstructed by the RBM and generate new digit samples using Gibbs sampling.

Step 1: Import Required Libraries

```python
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.datasets import mnist
```

Step 2: Load and Preprocess MNIST Dataset

```python
(X_train, _), (_, _) = mnist.load_data()      # labels and test split are unused
X_train = X_train.reshape(-1, 784) / 255.0    # flatten and scale to [0, 1]
X_train = (X_train > 0.5).astype(np.float32)  # binarize at 0.5
```

Step 3: Define the RBM Class Structure

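```python
class RBM:
    def __init__(self, n_visible, n_hidden, lr=0.01):
        self.n_visible = n_visible
        self.n_hidden = n_hidden
        self.lr = lr  # learning rate
        # Small random weights; both bias vectors start at zero.
        self.W = np.random.normal(0, 0.01, (n_visible, n_hidden))
        self.bv = np.zeros(n_visible)  # visible biases (b in the equations)
        self.bh = np.zeros(n_hidden)   # hidden biases (c in the equations)
```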

Step 4: Define Sigmoid Activation Function
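The code for this step is not present in the text, so the following is only a minimal sketch of a sigmoid function, written as a standalone helper. Clipping the input is an added assumption to avoid overflow warnings for large magnitudes.

```python
import numpy as np

def sigmoid(x):
    """Logistic function 1 / (1 + exp(-x)), applied element-wise."""
    x = np.clip(x, -500, 500)  # assumption: keep exp() within float64 range
    return 1.0 / (1.0 + np.exp(-x))

activations = sigmoid(np.array([-2.0, 0.0, 2.0]))
print(activations)
```

The function maps any real-valued pre-activation to the interval (0, 1), so its output can be read as the probability that a binary unit is switched on.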


Step 5: Sampling from Probability Distribution
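The code for this step is missing from the text. One common way to turn activation probabilities into binary states is to compare them with uniform random numbers, as sketched below; the name `sample_bernoulli` and the seeded generator are assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(42)  # assumption: seeded for reproducibility

def sample_bernoulli(p, rng):
    """Draw binary states: each unit is 1 with probability p, otherwise 0."""
    return (rng.random(p.shape) < p).astype(np.float32)

probs = np.array([[0.0, 1.0, 0.5]])
states = sample_bernoulli(probs, rng)
print(states)
```

A unit with probability 0 is never switched on and a unit with probability 1 always is; everything in between is decided at random, which is what makes the RBM a stochastic network.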


Step 6: Forward Pass
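The code for this step is missing from the text. The sketch below shows one way the forward pass could look as a standalone function. The return order (probabilities, sampled states) is inferred from how Steps 10 and 11 unpack `rbm.forward(v)`; the function signature and the random test input are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(v, W, bh, rng):
    """Visible -> hidden: return (P(h = 1 | v), sampled binary h)."""
    h_prob = sigmoid(v @ W + bh)
    h_sample = (rng.random(h_prob.shape) < h_prob).astype(np.float32)
    return h_prob, h_sample

rng = np.random.default_rng(0)
W = rng.normal(0, 0.01, (784, 256))    # same shape as the weights in Step 3
bh = np.zeros(256)                     # hidden biases
v = rng.integers(0, 2, (4, 784)).astype(np.float32)  # 4 random binary "images"

h_prob, h_sample = forward(v, W, bh, rng)
print(h_prob.shape, h_sample.shape)
```

Inside the `RBM` class the same logic would be a method that reads `self.W` and `self.bh` instead of taking them as arguments.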


Step 7: Backward Pass
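The code for this step is missing from the text. The sketch below mirrors the forward pass in the opposite direction, using the transposed weight matrix and the visible biases. The return order (probabilities, sampled states) is inferred from how Steps 10 and 11 unpack `rbm.backward(h)`; the function signature and the random test input are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def backward(h, W, bv, rng):
    """Hidden -> visible: return (P(v = 1 | h), sampled binary v)."""
    v_prob = sigmoid(h @ W.T + bv)
    v_sample = (rng.random(v_prob.shape) < v_prob).astype(np.float32)
    return v_prob, v_sample

rng = np.random.default_rng(0)
W = rng.normal(0, 0.01, (784, 256))    # same shape as the weights in Step 3
bv = np.zeros(784)                     # visible biases
h = rng.integers(0, 2, (4, 256)).astype(np.float32)  # 4 random hidden states

v_prob, v_sample = backward(h, W, bv, rng)
print(v_prob.shape, v_sample.shape)
```

The same weight matrix is used in both directions, which reflects the symmetric connections of the model.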


Step 8: Train RBM using Contrastive Divergence (CD-1)
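The training code for this step is missing from the text. The following is a sketch of CD-1 under stated assumptions, not necessarily the original implementation. To run on its own it repeats the constructor from Step 3 and adds `sigmoid`, `sample`, `forward`, `backward` and `train` methods whose names match the calls in Steps 9 to 11. The `seed` parameter, the use of probabilities rather than samples in the negative phase, and the returned list of per-epoch errors are choices made for this sketch.

```python
import numpy as np

class RBM:
    def __init__(self, n_visible, n_hidden, lr=0.01, seed=0):
        self.n_visible = n_visible
        self.n_hidden = n_hidden
        self.lr = lr
        self.rng = np.random.default_rng(seed)  # assumption: seeded generator
        self.W = self.rng.normal(0, 0.01, (n_visible, n_hidden))
        self.bv = np.zeros(n_visible)
        self.bh = np.zeros(n_hidden)

    @staticmethod
    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def sample(self, p):
        return (self.rng.random(p.shape) < p).astype(np.float32)

    def forward(self, v):
        h_prob = self.sigmoid(v @ self.W + self.bh)
        return h_prob, self.sample(h_prob)

    def backward(self, h):
        v_prob = self.sigmoid(h @ self.W.T + self.bv)
        return v_prob, self.sample(v_prob)

    def train(self, X, epochs=10, batch_size=128):
        n = X.shape[0]
        errors = []
        for epoch in range(epochs):
            order = self.rng.permutation(n)  # reshuffle the data every epoch
            epoch_error = 0.0
            for start in range(0, n, batch_size):
                v0 = X[order[start:start + batch_size]]
                # Positive phase: hidden statistics driven by the data.
                h0_prob, h0 = self.forward(v0)
                # Negative phase: a single Gibbs step (CD-1).
                v1_prob, _ = self.backward(h0)
                h1_prob, _ = self.forward(v1_prob)
                m = v0.shape[0]
                # <v h>_data - <v h>_model, averaged over the mini-batch.
                self.W += self.lr * (v0.T @ h0_prob - v1_prob.T @ h1_prob) / m
                self.bv += self.lr * (v0 - v1_prob).mean(axis=0)
                self.bh += self.lr * (h0_prob - h1_prob).mean(axis=0)
                epoch_error += np.sum((v0 - v1_prob) ** 2)
            errors.append(epoch_error / n)
            print(f"Epoch {epoch + 1}/{epochs}  reconstruction error: {errors[-1]:.4f}")
        return errors
```

The squared reconstruction error printed per epoch is only a monitoring signal; the parameter updates themselves follow the contrastive divergence rule.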


Step 9: Initialize and Train the RBM Model

```python
rbm = RBM(n_visible=784, n_hidden=256, lr=0.1)
rbm.train(X_train, epochs=60, batch_size=128)
```

**Output:**

Figure: RBM training

Step 10: Visualize Input Reconstruction

```python
def plot_reconstruction(rbm, X, n=10):
    v = X[:n]
    _, h = rbm.forward(v)
    v_recon, _ = rbm.backward(h)
    plt.figure(figsize=(10, 4))
    for i in range(n):
        # Top row: original images.
        plt.subplot(2, n, i + 1)
        plt.imshow(v[i].reshape(28, 28), cmap='gray')
        plt.axis('off')
        # Bottom row: reconstructions.
        plt.subplot(2, n, i + n + 1)
        plt.imshow(v_recon[i].reshape(28, 28), cmap='gray')
        plt.axis('off')
    plt.show()

plot_reconstruction(rbm, X_train)
```

**Output:**

Figure: RBM MNIST reconstructions

This output shows the original MNIST images and their reconstructions generated by the RBM. It shows how well the Restricted Boltzmann Machine has learned to capture the underlying patterns of the digits.

Step 11: Generate New Samples using Gibbs Sampling

```python
def generate_samples(rbm, steps=5000, n_samples=10):
    # Start each chain from random binary noise.
    v = np.random.binomial(1, 0.5, (n_samples, rbm.n_visible))
    # Alternate hidden and visible sampling (Gibbs sampling).
    for _ in range(steps):
        _, h = rbm.forward(v)
        _, v = rbm.backward(h)
    plt.figure(figsize=(10, 2))
    for i in range(n_samples):
        plt.subplot(1, n_samples, i + 1)
        plt.imshow(v[i].reshape(28, 28), cmap='gray')
        plt.axis('off')
    plt.show()

generate_samples(rbm)
```

**Output:**

Figure: RBM output samples

This output shows new digit-like images generated entirely by the RBM from random noise. After multiple Gibbs sampling steps, the RBM produces samples that resemble the patterns it learned from the training data.


Types of Restricted Boltzmann Machines

  1. **Binary-Binary RBM:** Standard RBM with binary visible and hidden units, used for feature learning from binary or normalized data.
  2. **Gaussian-Binary RBM:** Continuous (Gaussian) visible units and binary hidden units; suitable for real-valued data such as images or audio.
  3. **Bernoulli-Gaussian RBM:** Binary visible units and Gaussian hidden units; useful when latent features are continuous.
  4. **Softmax RBM:** Handles categorical/multinomial data using softmax activations; common in NLP tasks.
  5. **Conditional RBM (CRBM):** Conditions on extra context; suited to sequential/temporal data such as time series or video.
  6. **Convolutional RBM (ConvRBM):** Uses weight sharing and local receptive fields; captures spatial hierarchies in images.
  7. **Discriminative RBM:** Incorporates labels for classification by modeling input features and targets together.
  8. **Deep Belief Network (DBN):** Stack of RBMs for hierarchical feature learning in deep architectures.

Applications

Advantages

Limitations