RMSProp Optimizer in Deep Learning

Last Updated : 12 May, 2026

RMSProp is an adaptive optimization algorithm that improves training speed and stability by adjusting the learning rate for each parameter based on recent gradients.

Need of RMSProp Optimizer

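RMSProp was developed to overcome limitations of earlier methods such as SGD, which applies a single fixed learning rate to every parameter, and Adagrad, whose accumulated sum of squared gradients keeps growing and steadily shrinks the effective learning rate. RMSProp replaces that sum with an exponentially decaying average, so the step size adapts to recent gradients only.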
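To make the Adagrad limitation concrete, the sketch below compares the effective step size of the two accumulators under a constant gradient. It is a minimal pure-Python illustration under stated assumptions: the function name `effective_step_sizes` and the constant-gradient setup are hypothetical, and Adagrad is written with the same `sqrt(accumulator + eps)` form used for RMSProp so that only the accumulator differs.

```python
import math

def effective_step_sizes(grad=1.0, steps=100, lr=0.01, gamma=0.9, eps=1e-8):
    """Return (adagrad_scale, rmsprop_scale) after `steps` identical gradients.

    Illustrative only: a constant gradient isolates how each accumulator behaves.
    """
    adagrad_sum = 0.0  # Adagrad-style running SUM of squared gradients (never shrinks)
    rms_avg = 0.0      # RMSProp-style exponential moving AVERAGE of squared gradients
    for _ in range(steps):
        adagrad_sum += grad ** 2
        rms_avg = gamma * rms_avg + (1 - gamma) * grad ** 2
    adagrad_scale = lr / math.sqrt(adagrad_sum + eps)
    rmsprop_scale = lr / math.sqrt(rms_avg + eps)
    return adagrad_scale, rmsprop_scale

adagrad_scale, rmsprop_scale = effective_step_sizes()
print(f"Adagrad effective step: {adagrad_scale:.6f}")
print(f"RMSProp effective step: {rmsprop_scale:.6f}")
```

With these settings the Adagrad-style scale has dropped to roughly a tenth of the base learning rate after 100 steps, while the RMSProp-style scale stays close to it, because the moving average levels off instead of growing.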

Working of RMSProp Optimizer

RMSProp works by maintaining a moving average of squared gradients to normalize updates and adapt the learning rate for each parameter.

**Formula:**

1. Compute the gradient g_t at time step t

g_t = \nabla_\theta J(\theta_t)

where J is the loss function being minimized.

2. Update the moving average of squared gradients

E[g^2]_t = \gamma E[g^2]_{t-1} + (1 - \gamma) g_t^2

where \gamma is the decay rate.

3. Update the parameter \theta using the adjusted learning rate

\theta_{t+1} = \theta_t - \frac{\eta}{\sqrt{E[g^2]_t + \epsilon}} g_t

where \eta is the learning rate and \epsilon is a small constant added for numerical stability.
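The three steps can be sketched in plain Python as follows. This is a minimal illustration, not the implementation used by any library: the helper names `rmsprop_step` and `grad_fn`, the toy loss J(x, y) = x^2 + 10 y^2 and the hyperparameter values are all assumptions chosen for the example.

```python
import math

def rmsprop_step(theta, grads, avg_sq, lr=0.01, gamma=0.9, eps=1e-8):
    """Apply one RMSProp update to a list of parameters.

    theta, grads and avg_sq are equal-length lists of floats.
    Returns the updated (theta, avg_sq).
    """
    new_theta, new_avg_sq = [], []
    for p, g, v in zip(theta, grads, avg_sq):
        # Step 2: update the moving average of squared gradients
        v = gamma * v + (1 - gamma) * g * g
        # Step 3: scale the gradient step by 1 / sqrt(E[g^2]_t + eps)
        p = p - lr / math.sqrt(v + eps) * g
        new_theta.append(p)
        new_avg_sq.append(v)
    return new_theta, new_avg_sq

def grad_fn(theta):
    """Gradient of the hypothetical toy loss J(x, y) = x^2 + 10 * y^2."""
    x, y = theta
    return [2 * x, 20 * y]

theta = [5.0, 5.0]
avg_sq = [0.0, 0.0]   # E[g^2]_0 initialised to zero
for t in range(2000):
    grads = grad_fn(theta)  # Step 1: compute g_t
    theta, avg_sq = rmsprop_step(theta, grads, avg_sq)

print(theta)
```

Because each parameter is divided by its own running root-mean-square gradient, the two coordinates take steps of similar size even though the gradient along y is ten times larger. With a constant learning rate the iterates should end up hovering near the minimum at (0, 0) without settling exactly on it.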

Parameters Used in RMSProp

Implementing RMSprop in Python

The RMSProp optimizer can be initialized in Keras with its main hyperparameters as follows, where rho is the decay rate (\gamma in the formula above):

```python
tf.keras.optimizers.RMSprop(learning_rate=0.001, rho=0.9)
```

1. Importing Libraries

We import the libraries needed to use the RMSprop optimizer, load the dataset, build the model and plot the results.

```python
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.utils import to_categorical
import matplotlib.pyplot as plt
```

2. Loading and Preprocessing Dataset

We load the MNIST dataset, normalize pixel values to [0,1] and one-hot encode labels.

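```python
# Load the MNIST train/test split (28x28 grayscale digit images)
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Scale pixel values from [0, 255] to [0, 1]
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

# One-hot encode the integer labels into 10 classes
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)
```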
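As a rough illustration of what the two preprocessing steps do to a single sample, the sketch below reimplements them in plain Python. The helpers `one_hot` and `normalize_pixels` are hypothetical stand-ins written for this example; they are not the Keras implementations.

```python
def one_hot(label, num_classes=10):
    """Return a one-hot list for an integer class label (hypothetical helper)."""
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec

def normalize_pixels(pixels):
    """Scale 8-bit pixel intensities (0-255) into the range [0, 1]."""
    return [p / 255.0 for p in pixels]

print(one_hot(3))                        # a 1.0 at index 3, zeros elsewhere
print(normalize_pixels([0, 128, 255]))   # black, mid-grey and white pixels
```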

3. Building the Model

We define a neural network using Sequential with input flattening and dense layers.

```python
model = Sequential([
    Flatten(input_shape=(28, 28)),    # 28x28 image -> 784-element vector
    Dense(128, activation='relu'),
    Dense(64, activation='relu'),
    Dense(10, activation='softmax')   # one probability per digit class
])
```
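To get a feel for the size of this network, the trainable parameters can be counted by hand. The short sketch below does the arithmetic; the helper `dense_params` is a hypothetical name used only here. The count matters for RMSProp because the optimizer keeps one moving-average value per trainable parameter.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

# Flatten output size, followed by the widths of the three Dense layers
layer_sizes = [28 * 28, 128, 64, 10]
counts = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total_params = sum(counts)

print(counts)        # [100480, 8256, 650]
print(total_params)  # 109386
```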

4. Compiling the Model

We compile the model with the RMSprop optimizer for adaptive learning rates, categorical cross-entropy as the loss for multi-class classification, and accuracy as the tracked metric.

```python
model.compile(
    optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.001, rho=0.9),
    loss='categorical_crossentropy',
    metrics=['accuracy']
)
```
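For intuition about the loss chosen here, the sketch below computes categorical cross-entropy for a single one-hot target by hand. It is a simplified illustration: the function is a hypothetical pure-Python version, and the clipping constant `eps` is an assumption added to avoid taking log(0).

```python
import math

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

y_true = [0.0, 0.0, 1.0]          # the correct class is index 2
confident = [0.05, 0.05, 0.90]    # most probability on the correct class
unsure = [0.40, 0.30, 0.30]       # probability spread across classes

loss_confident = categorical_crossentropy(y_true, confident)
loss_unsure = categorical_crossentropy(y_true, unsure)
print(f"{loss_confident:.4f} {loss_unsure:.4f}")
```

The loss is lower for the confident, correct prediction than for the uncertain one, which is the signal the optimizer follows during training.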

5. Training the Model

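We train the model for 10 epochs with a batch size of 32 and hold out 20% of the training data for validation. With validation_split, Keras evaluates the model on this held-out portion at the end of each epoch, which shows how well it performs on data it was not trained on.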

```python
history = model.fit(x_train, y_train, epochs=10, batch_size=32, validation_split=0.2)
```

**Output:**

[Figure: Training the Model]
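The training settings above determine how many RMSProp updates are performed. The arithmetic can be sketched as follows, assuming the standard MNIST training set of 60,000 images; the variable names are illustrative only.

```python
import math

num_train = 60000        # size of the standard MNIST training set
validation_split = 0.2
batch_size = 32
epochs = 10

num_val = round(num_train * validation_split)   # samples held out for validation
num_fit = num_train - num_val                   # samples actually used for training
steps_per_epoch = math.ceil(num_fit / batch_size)
total_updates = steps_per_epoch * epochs        # one optimizer update per batch

print(num_fit, num_val, steps_per_epoch, total_updates)  # 48000 12000 1500 15000
```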

6. Evaluating and Visualizing Results

We evaluate test accuracy on unseen test data and plot training and validation loss curves to visualize learning progress.

```python
loss, accuracy = model.evaluate(x_test, y_test)
print(f'Test accuracy: {accuracy:.4f}')

plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.title('Cost Function Graph')
plt.legend()
plt.show()
```

**Output:**

[Figure: Evaluating and Visualizing Results]

Advantages

Disadvantages