Naive Bayes Classifier in R Programming

Last Updated : 28 Jun, 2025

The Naive Bayes Classifier is a machine learning algorithm used to classify data into categories. It uses Bayes' Theorem to calculate the probability of each class given the input features, and it assumes that the features are independent of each other given the class.

Bayes’ Theorem Formula

The Naive Bayes algorithm is based on Bayes' theorem, which gives the conditional probability of an event A given that another event B has occurred.

P(A \mid B) = \frac{P(B \mid A) \cdot P(A)}{P(B)}

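Where P(A \mid B) is the posterior probability of A given B, P(B \mid A) is the likelihood of B given A, P(A) is the prior probability of A, and P(B) is the probability of the evidence B.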

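For many predictors B_1, B_2, \ldots, B_n, the independence assumption lets us write the posterior probability, up to the normalizing constant P(B_1, \ldots, B_n), as the prior times the product of the individual likelihoods:

P(A \mid B_1, \ldots, B_n) \propto P(A) \cdot P(B_1 \mid A) \cdot P(B_2 \mid A) \cdots P(B_n \mid A)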
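To make the product formula concrete, the sketch below computes a Naive Bayes posterior by hand in base R on the built-in iris data. It assumes that each numeric feature follows a normal distribution within each class; the helper name `posterior_by_hand` is hypothetical and written only for this illustration.

```r
# Hand-rolled Gaussian Naive Bayes posterior for a single observation.
# `posterior_by_hand` is a hypothetical helper, not part of any package.
posterior_by_hand <- function(x, data, label_col = "Species") {
  classes <- levels(data[[label_col]])
  features <- setdiff(names(data), label_col)
  scores <- sapply(classes, function(cl) {
    rows <- data[data[[label_col]] == cl, features]
    prior <- nrow(rows) / nrow(data)            # P(A)
    # P(B_i | A) for each feature, modelled as a normal density
    lik <- mapply(function(v, col) dnorm(v, mean(col), sd(col)),
                  unlist(x[features]), rows)
    prior * prod(lik)                           # P(A) * product of P(B_i | A)
  })
  scores / sum(scores)  # normalise so the posteriors sum to 1
}

data(iris)
post <- posterior_by_hand(iris[1, ], iris)
print(round(post, 4))
print(names(which.max(post)))
```

Dividing by `sum(scores)` plays the role of the normalizing constant, so the three class posteriors sum to 1 and the class with the largest value is the prediction.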

Example Using Bayes’ Theorem

Consider a sample space: {HH, HT, TH, TT}

where, H = Head, T = Tail

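We are asked to find the probability that the second coin is a Head given that the first coin is a Tail.

Let A be the event that the second coin is a Head and B the event that the first coin is a Tail. Then P(A) = 1/2 and P(B) = 1/2. Of the two outcomes with a Head second (HH, TH), one has a Tail first, so P(B \mid A) = 1/2.

Now applying Bayes’ Theorem: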

P(A \mid B) = \frac{(1/2) \cdot (1/2)}{1/2}= \frac{1/4}{1/2}= \frac{1}{2}= 0.5

Therefore, the probability that the second coin is a Head, given that the first coin is a Tail, is 0.5.
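The same result can be checked by enumerating the sample space directly. The following is a minimal base R sketch; the variable names (`space`, `A`, `B`) are chosen here for illustration.

```r
# Enumerate the four equally likely outcomes of two coin tosses.
space <- expand.grid(first = c("H", "T"), second = c("H", "T"),
                     stringsAsFactors = FALSE)

A <- space$second == "H"  # event A: second coin is a Head
B <- space$first == "T"   # event B: first coin is a Tail

p_A <- mean(A)                      # P(A)
p_B <- mean(B)                      # P(B)
p_B_given_A <- sum(A & B) / sum(A)  # P(B | A)

# Bayes' theorem: P(A | B) = P(B | A) * P(A) / P(B)
p_A_given_B <- p_B_given_A * p_A / p_B
print(p_A_given_B)
```

Counting directly, `sum(A & B) / sum(B)` gives the same conditional probability, which is a quick way to sanity-check the formula.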

Implementation of Naive Bayes Classifier

We follow these steps to build and evaluate a Naive Bayes model using the Iris dataset.

1. Installing and Loading Required Packages

We install the necessary packages and load them.

install.packages("e1071")
install.packages("caTools")
install.packages("caret")

library(e1071)
library(caTools)
library(caret)

2. Loading the Dataset

We begin by loading the dataset and checking its structure.

data(iris)
head(iris)

Output: [screenshot of the `head(iris)` result]

3. Splitting the Dataset

We split the data into training and testing sets using a 70:30 ratio.

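set.seed(123)
# sample.split expects the vector of labels, not the whole data frame;
# splitting on Species keeps the class proportions equal in both sets
split <- sample.split(iris$Species, SplitRatio = 0.7)
train_cl <- subset(iris, split == TRUE)
test_cl <- subset(iris, split == FALSE)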
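For readers who want to see what a stratified 70:30 split does without relying on caTools, here is a base R sketch of the same idea. It is an alternative illustration rather than the method used in this article, and the names `train_idx`, `train_base`, and `test_base` are hypothetical.

```r
data(iris)
set.seed(123)

# Sample 70% of the row indices within each species (stratified split).
train_idx <- unlist(lapply(
  split(seq_len(nrow(iris)), iris$Species),
  function(idx) sample(idx, size = round(0.7 * length(idx)))
))

train_base <- iris[train_idx, ]
test_base <- iris[-train_idx, ]

# Each species contributes the same share to both sets.
print(table(train_base$Species))
print(table(test_base$Species))
```

Checking the per-class counts with `table()` is a cheap way to confirm that a split really preserves the class balance.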

4. Scaling the Features

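We scale the numerical features to normalize the data. This step is optional here: e1071's naiveBayes models each numeric feature with a per-class normal distribution, so it does not require scaling, and the later steps use the unscaled train_cl and test_cl. If scaled copies are used, the test set should be scaled with the training set's center and scale so that no information from the test data leaks into preprocessing.

train_scale <- scale(train_cl[, 1:4])
# Reuse the training means and standard deviations for the test set
test_scale <- scale(test_cl[, 1:4],
                    center = attr(train_scale, "scaled:center"),
                    scale = attr(train_scale, "scaled:scale"))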

5. Training the Naive Bayes Model

We train the Naive Bayes classifier using the training set.

classifier_cl <- naiveBayes(Species ~ ., data = train_cl)
classifier_cl

Output: [screenshot of the printed `classifier_cl` model]
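The fitted object can also be inspected piece by piece. The sketch below assumes the e1071 package is installed and, to stay self-contained, fits a model named `model` on the full iris data rather than on the training split used in this article.

```r
library(e1071)

data(iris)
model <- naiveBayes(Species ~ ., data = iris)

# Class counts from which the prior probabilities are derived
print(model$apriori)

# Per-class parameters for one numeric feature:
# column 1 holds the mean, column 2 the standard deviation
print(model$tables$Petal.Length)
```

These two components correspond to the prior P(A) and the per-feature likelihoods P(B_i \mid A) in the formula from the theory section.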

6. Making Predictions

We use the trained model to predict species on the test data.

y_pred <- predict(classifier_cl, newdata = test_cl)

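Besides class labels, `predict()` for an e1071 Naive Bayes model can return the posterior probability of each class through `type = "raw"`. The sketch below assumes e1071 is installed and fits a separate model named `model` on the full iris data so that it runs on its own; the three rows picked are arbitrary.

```r
library(e1071)

data(iris)
model <- naiveBayes(Species ~ ., data = iris)

# One arbitrary flower from each species
new_rows <- iris[c(1, 51, 101), ]

# Posterior probabilities per class instead of hard labels
probs <- predict(model, newdata = new_rows, type = "raw")
print(round(probs, 3))

# Hard labels for comparison
print(predict(model, newdata = new_rows))
```

Looking at the probabilities shows how confident the model is, which is useful for borderline cases between similar classes.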

7. Evaluating the Model

We create a confusion matrix and evaluate the model performance.

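# confusionMatrix expects predictions in the rows and actual classes in the columns
cm <- table(Predicted = y_pred, Actual = test_cl$Species)
confusionMatrix(cm)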

Output: [screenshot of the `confusionMatrix(cm)` result]

The output reported for this run shows that the Naive Bayes model achieved 95% accuracy, with strong performance across all classes; the few misclassifications occurred between versicolor and virginica. The exact figures depend on how the data is split.
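The headline metrics can be recomputed from any confusion table with a few lines of base R. The sketch below uses a hypothetical table named `cm_demo`; its counts are made up for illustration and are not the results of this article.

```r
# Hypothetical 3-class confusion table (rows = predicted, columns = actual).
species <- c("setosa", "versicolor", "virginica")
cm_demo <- matrix(c(15,  0,  0,
                     0, 14,  2,
                     0,  1, 13),
                  nrow = 3, byrow = TRUE,
                  dimnames = list(Predicted = species, Actual = species))

accuracy <- sum(diag(cm_demo)) / sum(cm_demo)   # correct predictions / total
recall <- diag(cm_demo) / colSums(cm_demo)      # per-class sensitivity
precision <- diag(cm_demo) / rowSums(cm_demo)   # per-class positive predictive value

print(round(accuracy, 4))
print(round(recall, 4))
print(round(precision, 4))
```

Accuracy does not depend on which axis holds the predictions, but recall and precision swap if the table is transposed, so it is worth labeling the dimensions explicitly.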