High Performance Implementation of the Naive Bayes Algorithm

Overview

The naivebayes package provides an efficient implementation of the widely used Naïve Bayes classifier. It follows three core principles: efficiency, user-friendliness, and reliance solely on Base R. The last principle keeps the package stable and reliable by avoiding external dependencies. It also supports efficiency, because the package can lean on the optimized routines built into Base R, many of which are written in high-performance languages such as C/C++ or FORTRAN.

The [naive_bayes()](reference/naive%5Fbayes.html) function is designed to predict the class of each observation in a dataset and, depending on user specifications, it can assume a different distribution for each feature. It currently supports the following class conditional distributions:

In addition, specialized functions are available that implement:

These specialized functions are carefully optimized for efficiency: they use linear algebra operations to perform well on dense matrices, can exploit matrix sparsity for further gains, and work in the presence of missing data. The package also includes various helper functions to improve the user experience. Moreover, users can access the general [naive_bayes()](reference/naive%5Fbayes.html) function through the excellent caret package, providing additional versatility.
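
To illustrate why linear algebra operations suit this kind of classifier, here is a minimal Base R sketch of a Bernoulli Naïve Bayes model whose scoring step is a single matrix product. It is not the package's source code: the function names `fit_bernoulli_nb()` and `predict_bernoulli_nb()`, the `laplace` default, and the toy data are assumptions made for this example only.

```r
# Illustrative sketch only, NOT the naivebayes implementation.
# fit_bernoulli_nb() and predict_bernoulli_nb() are hypothetical names.

fit_bernoulli_nb <- function(X, y, laplace = 1) {
  y <- factor(y)
  # rowsum() adds up the rows of X within each class, giving a
  # classes x features matrix of "feature equals 1" counts.
  counts <- rowsum(X, y)
  n_per_class <- as.vector(table(y))
  # Smoothed estimate of P(feature = 1 | class); laplace > 0 avoids log(0).
  prob1 <- (counts + laplace) / (n_per_class + 2 * laplace)
  list(levels    = levels(y),
       log_prior = log(n_per_class / length(y)),
       log_p1    = log(prob1),
       log_p0    = log(1 - prob1))
}

predict_bernoulli_nb <- function(model, X_new) {
  # log P(x | class) = sum_j [ x_j * (log p1_j - log p0_j) + log p0_j ]
  # The first term is one matrix product over all rows and classes.
  weights <- t(model$log_p1 - model$log_p0)          # features x classes
  offset  <- model$log_prior + rowSums(model$log_p0) # one value per class
  scores  <- sweep(X_new %*% weights, 2, offset, "+")
  # Pick the class with the highest log posterior score in each row.
  model$levels[max.col(scores, ties.method = "first")]
}

# Toy data (made up for this example): 4 observations, 3 binary features.
X <- rbind(c(1, 1, 0),
           c(1, 0, 0),
           c(0, 1, 1),
           c(0, 0, 1))
y <- c("a", "a", "b", "b")

model <- fit_bernoulli_nb(X, y)
pred  <- predict_bernoulli_nb(model, X)
print(pred)
```

Writing the log-likelihood as `X_new %*% weights` plus a per-class offset means that only the nonzero entries of `X_new` contribute to the product, which is one plausible reason a formulation like this also suits sparse input.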

Installation

The naivebayes package can be installed from CRAN by running the following line in the R console:
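
For a package hosted on CRAN, this is the standard `install.packages()` call from Base R; it assumes a working internet connection and a configured CRAN mirror.

```r
# Download and install the naivebayes package from CRAN
install.packages("naivebayes")
```

Once installed, the package can be attached in a session with `library(naivebayes)`.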