Bidirectional LSTM in NLP

Last Updated : 18 May, 2026

Bidirectional Long Short-Term Memory (BiLSTM) is an extension of LSTM that processes sequences in both forward and backward directions, allowing the model to capture both past and future context.

**Understanding Bidirectional LSTM (BiLSTM)**

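A Bidirectional LSTM (BiLSTM) consists of two separate LSTM layers:

- **Forward LSTM:** processes the sequence from the first token to the last, capturing past context.
- **Backward LSTM:** processes the sequence from the last token to the first, capturing future context.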

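The outputs of both LSTMs are then combined to form the final output. Mathematically, the final output at time step **t** is computed as:

$$p_t = p_{t_f} + p_{t_b}$$

**Where:**

- $p_t$: final output of the network at time step $t$
- $p_{t_f}$: output of the forward LSTM at time step $t$
- $p_{t_b}$: output of the backward LSTM at time step $t$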
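The combination step can be illustrated with a small pure-Python sketch. It is only an illustration under stated assumptions: a toy scalar `tanh` recurrence stands in for a real LSTM cell, and the function names and weights (`run_rnn`, `bidirectional`, `0.5`, `0.8`, `0.3`, `0.6`) are hypothetical. It shows the sum from the formula above next to concatenation, which is the default `merge_mode` of Keras's `Bidirectional` wrapper.

```python
import math

def run_rnn(inputs, w_x, w_h):
    """Run a toy scalar recurrent cell (not a real LSTM) over `inputs`.

    Returns one hidden state per time step.
    """
    h = 0.0
    states = []
    for x in inputs:
        h = math.tanh(w_x * x + w_h * h)
        states.append(h)
    return states

def bidirectional(inputs, merge="sum"):
    # Forward and backward passes use separate (hypothetical) weights,
    # mirroring the two separate layers of a BiLSTM.
    forward = run_rnn(inputs, w_x=0.5, w_h=0.8)
    # Read the sequence in reverse, then flip the states back so that
    # index t refers to the same time step in both lists.
    backward = run_rnn(inputs[::-1], w_x=0.3, w_h=0.6)[::-1]
    if merge == "sum":
        return [f + b for f, b in zip(forward, backward)]
    if merge == "concat":
        return [(f, b) for f, b in zip(forward, backward)]
    raise ValueError(f"unknown merge mode: {merge}")

sequence = [1.0, -2.0, 0.5, 3.0]
summed = bidirectional(sequence, merge="sum")
concatenated = bidirectional(sequence, merge="concat")
print(summed)
print(concatenated)
```

With `merge="sum"` each time step yields a single value that mixes past and future context; with `merge="concat"` the two directions are kept side by side, which doubles the output width.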

The following diagram represents the BiLSTM layer:

[Figure: Bidirectional LSTM]

**Implementing Sentiment Analysis Using BiLSTM**

**1. Importing Libraries**

We will use the Python libraries NumPy, Matplotlib, TensorFlow and TensorFlow Datasets to build our model.

```python
import tensorflow as tf
import tensorflow_datasets as tfds
import numpy as np
import matplotlib.pyplot as plt
```

**2. Loading and Preparing the IMDB Dataset**

We will load the IMDB reviews dataset from TensorFlow Datasets, which contains 25,000 labeled movie reviews for training and 25,000 for testing. Shuffling the training set ensures that the model does not learn patterns based on the order of reviews.

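```python
# Load the IMDB reviews dataset as (text, label) pairs
dataset = tfds.load('imdb_reviews', as_supervised=True)
train_dataset, test_dataset = dataset['train'], dataset['test']

batch_size = 32

# Shuffle only the training data, then group both splits into batches
train_dataset = train_dataset.shuffle(10000).batch(batch_size)
test_dataset = test_dataset.batch(batch_size)
```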
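To make the shuffle-then-batch step more concrete, here is a rough pure-Python sketch of buffer-based shuffling followed by batching. It is an approximation of the idea, not TensorFlow's implementation, and the helper names `buffered_shuffle` and `batch` are hypothetical. It also hints at why a buffer smaller than the dataset (10,000 versus 25,000 reviews) gives only an approximate shuffle: the first emitted element can only come from the first `buffer_size` items.

```python
import random

def buffered_shuffle(items, buffer_size, seed=0):
    """Yield items in a randomized order using a fixed-size buffer."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            # Fill the buffer before emitting anything.
            buffer.append(item)
            continue
        # Buffer is full: emit a random element and put the new item in its slot.
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    # Input exhausted: emit whatever is left in random order.
    rng.shuffle(buffer)
    yield from buffer

def batch(items, batch_size):
    """Group items into lists of `batch_size`; the last batch may be smaller."""
    current = []
    for item in items:
        current.append(item)
        if len(current) == batch_size:
            yield current
            current = []
    if current:
        yield current

reviews = list(range(10))  # stand-ins for review records
shuffled = list(buffered_shuffle(reviews, buffer_size=4))
batches = list(batch(shuffled, batch_size=3))
print(batches)
```

Every item appears exactly once, only the order changes, and the final batch is kept even though it is smaller than `batch_size`.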

Print a sample review and its label from the training set:

```python
# Take one batch and show its first review with the corresponding label
example, label = next(iter(train_dataset))
print('Text:\n', example.numpy()[0])
print('\nLabel: ', label.numpy()[0])
```

**Output:**

Text: b "Having seen men Behind the Sun ... 1 as a treatment of the subject)."
Label: 0

**3. Performing Text Vectorization**

We first perform text vectorization: the `TextVectorization` layer is adapted on the training text so that each word is mapped to an integer token. With `output_mode='int'` and `output_sequence_length=100`, every review is encoded as a vector of exactly 100 integers, padded or truncated as needed.

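```python
# Map each review to a fixed-length sequence of 100 integer token ids
vectorize_layer = tf.keras.layers.TextVectorization(
    output_mode='int',
    output_sequence_length=100)

# Build the vocabulary from the training text only (labels are dropped)
vectorize_layer.adapt(train_dataset.map(lambda x, y: x))
```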
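The encode and decode round trip can be sketched in plain Python. This is a simplified stand-in rather than the Keras layer itself: the helpers `standardize`, `build_vocabulary`, `encode` and `decode` are hypothetical, and alphabetical tie-breaking between equally frequent words is an assumption made here for determinism. It mirrors the layer's documented conventions: lowercasing and punctuation stripping, index 0 for padding, index 1 for out-of-vocabulary words, and the most frequent words first.

```python
import re
from collections import Counter

PAD, UNK = "", "[UNK]"

def standardize(text):
    # Lowercase and strip punctuation.
    return re.sub(r"[^\w\s]", "", text.lower())

def build_vocabulary(texts):
    counts = Counter(word for t in texts for word in standardize(t).split())
    # Most frequent words first; ties broken alphabetically (an assumption).
    ordered = sorted(counts, key=lambda w: (-counts[w], w))
    return [PAD, UNK] + ordered

def encode(text, vocab, sequence_length):
    index = {word: i for i, word in enumerate(vocab)}
    ids = [index.get(word, 1) for word in standardize(text).split()]
    ids = ids[:sequence_length]                       # truncate long inputs
    return ids + [0] * (sequence_length - len(ids))   # pad short inputs with 0

def decode(ids, vocab):
    # Skip padding so only real tokens are shown.
    return " ".join(vocab[i] for i in ids if i != 0)

corpus = ["The movie was great!", "The plot was dull."]
vocab = build_vocabulary(corpus)
ids = encode("The movie was boring", vocab, sequence_length=6)
print(vocab)
print(ids)
print(decode(ids, vocab))
```

The word "boring" never appears in the toy corpus, so it maps to the out-of-vocabulary index and decodes back as `[UNK]`.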

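**4. Defining Model Architecture (BiLSTM Layers)**

The model uses BiLSTM layers for sentiment analysis by processing text sequences in both forward and backward directions. The first BiLSTM layer returns the full sequence (`return_sequences=True`) so that the second BiLSTM layer can process it, and the final `Dense(1)` layer outputs a raw logit.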

```python
model = tf.keras.Sequential([
    vectorize_layer,
    # Embed each token id as a 64-dimensional vector; id 0 is treated as padding
    tf.keras.layers.Embedding(
        len(vectorize_layer.get_vocabulary()), 64, mask_zero=True),
    # First BiLSTM returns the full sequence so it can feed the next BiLSTM
    tf.keras.layers.Bidirectional(
        tf.keras.layers.LSTM(64, return_sequences=True)),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32)),
    tf.keras.layers.Dense(64, activation='relu'),
    # Single output unit producing a raw logit (no sigmoid here)
    tf.keras.layers.Dense(1)
])

model.build(input_shape=(None,))

model.compile(
    loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
    optimizer=tf.keras.optimizers.Adam(),
    metrics=['accuracy']
)

model.summary()
```

**Output:**

[Figure: model architecture, as printed by `model.summary()`]
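As a cross-check on the layer sizes in this model, the sketch below computes trainable-parameter counts by hand. It assumes the standard LSTM parameter formula (four weight sets, each with an input kernel, a recurrent kernel and a bias) and the default `merge_mode='concat'` of `Bidirectional`, which doubles the output width. The helper names are hypothetical, and the embedding layer is left out because its size depends on the vocabulary learned by `adapt`.

```python
def lstm_params(input_dim, units):
    # Three gates plus the candidate cell state: 4 weight sets, each with
    # an input kernel, a recurrent kernel and a bias vector.
    return 4 * (units * (input_dim + units) + units)

def bilstm_params(input_dim, units):
    # Forward and backward LSTMs have independent weights.
    return 2 * lstm_params(input_dim, units)

def dense_params(input_dim, units):
    return input_dim * units + units

embedding_dim = 64

bilstm1 = bilstm_params(embedding_dim, 64)
bilstm1_out = 2 * 64          # concatenated forward + backward outputs

bilstm2 = bilstm_params(bilstm1_out, 32)
bilstm2_out = 2 * 32

dense1 = dense_params(bilstm2_out, 64)
dense2 = dense_params(64, 1)

print("BiLSTM(64):", bilstm1)
print("BiLSTM(32):", bilstm2)
print("Dense(64): ", dense1)
print("Dense(1):  ", dense2)
```

Comparing these numbers with the `model.summary()` output is a quick way to confirm that each layer receives the input width you expect.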

**5. Training the Model**

Now we will train the model we defined in the previous step for three epochs.

```python
history = model.fit(
    train_dataset,
    epochs=3,
    validation_data=test_dataset,
)
```

**Output:**

[Figure: training output]
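The `history` object returned by `model.fit` stores per-epoch metrics in `history.history`, a dictionary of lists. The sketch below shows one possible way to read it. The helper `summarize_history` is hypothetical, and the numbers are made-up placeholders for illustration only, not results from this article's training run.

```python
def summarize_history(history_dict):
    """Report the epoch with the best validation accuracy."""
    val_acc = history_dict["val_accuracy"]
    best = max(range(len(val_acc)), key=lambda i: val_acc[i])
    return {
        "best_epoch": best + 1,  # epochs are reported 1-based
        "best_val_accuracy": val_acc[best],
        # A large positive gap suggests the model is overfitting.
        "train_val_gap": history_dict["accuracy"][best] - val_acc[best],
    }

# Placeholder values shaped like history.history for a 3-epoch run.
# These are NOT real results; substitute history.history after training.
placeholder_history = {
    "loss": [0.60, 0.35, 0.25],
    "accuracy": [0.70, 0.85, 0.90],
    "val_loss": [0.45, 0.38, 0.40],
    "val_accuracy": [0.78, 0.83, 0.82],
}

summary = summarize_history(placeholder_history)
print(summary)
```

After a real run, calling `summarize_history(history.history)` would give the same kind of summary for the actual metrics.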

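**6. Prediction**

Let's test the model on a sample review to see how it works. Because the final layer outputs a logit, a sigmoid is applied to convert the prediction into a probability.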

```python
review = tf.constant(["This movie was amazing and engaging"])
# The model outputs a logit, so apply a sigmoid and convert to a Python float
prob = float(tf.sigmoid(model.predict(review))[0][0])

sentiment = "Positive" if prob >= 0.5 else "Negative"
print(f"Sentiment: {sentiment}, Probability: {prob:.2f}")
```

**Output:**

[Figure: prediction output]
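The logit-to-label step used in the prediction can be shown without a trained model. This is a minimal sketch with hypothetical helper names (`sigmoid`, `classify`) and arbitrary example logits. It applies the same rule as the prediction code: a probability of at least 0.5, which corresponds to a logit of at least 0, is labeled positive.

```python
import math

def sigmoid(logit):
    """Map a real-valued logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-logit))

def classify(logit, threshold=0.5):
    prob = sigmoid(logit)
    label = "Positive" if prob >= threshold else "Negative"
    return label, prob

# Arbitrary example logits, not model outputs.
for logit in (-2.0, 0.0, 1.5):
    label, prob = classify(logit)
    print(f"logit={logit:+.1f} -> {label}, probability={prob:.2f}")
```

Keeping the sigmoid outside the model matches the `from_logits=True` setting of the loss, which expects raw logits during training.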
