logp - Log unconditional probability density for naive Bayes classifier - MATLAB


Log unconditional probability density for naive Bayes classifier

Syntax

lp = logp(Mdl,tbl)
lp = logp(Mdl,X)

Description

lp = logp(Mdl,tbl) returns the log unconditional probability density (lp) of the observations (rows) in tbl using the naive Bayes model Mdl. You can use lp to identify outliers in the training data.

lp = logp(Mdl,X) returns the log unconditional probability density of the observations (rows) in X using the naive Bayes model Mdl.
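As a minimal sketch of the table syntax (the variable names below are illustrative; the fisheriris data set ships with Statistics and Machine Learning Toolbox):

```matlab
% Load Fisher's iris data and train on a table of predictors.
load fisheriris
tbl = array2table(meas, ...
    'VariableNames',{'SepalLength','SepalWidth','PetalLength','PetalWidth'});
tbl.Species = species;

Mdl = fitcnb(tbl,'Species');

% Log unconditional probability density of each training observation.
lp = logp(Mdl,tbl);
```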


Examples


Compute the unconditional probability densities of the in-sample observations of a naive Bayes classifier model.

Load the fisheriris data set. Create X as a numeric matrix that contains four measurements for 150 irises. Create Y as a cell array of character vectors that contains the corresponding iris species.

load fisheriris
X = meas;
Y = species;

Train a naive Bayes classifier using the predictors X and class labels Y. A recommended practice is to specify the class names. fitcnb assumes that each predictor is conditionally normally distributed given the class.

Mdl = fitcnb(X,Y,'ClassNames',{'setosa','versicolor','virginica'})

Mdl = 
  ClassificationNaiveBayes
              ResponseName: 'Y'
     CategoricalPredictors: []
                ClassNames: {'setosa'  'versicolor'  'virginica'}
            ScoreTransform: 'none'
           NumObservations: 150
         DistributionNames: {'normal'  'normal'  'normal'  'normal'}
    DistributionParameters: {3×4 cell}


Mdl is a trained ClassificationNaiveBayes classifier.

Compute the log unconditional probability densities of the in-sample observations.

lp = logp(Mdl,X);

Identify indices of observations that have very small or very large log unconditional probabilities (ind). Display lower (L) and upper (U) thresholds used by the outlier detection method.

[TF,L,U] = isoutlier(lp);
ind = find(TF)
L
U

Display the values of the outlier unconditional probability densities.

lp(ind)

ans = 4×1

   -7.8995
   -8.4765
   -6.9854
   -7.8969

All the outliers are smaller than the lower outlier detection threshold.

Plot the unconditional probability densities.

histogram(lp)
hold on
xline(L,'k--')
hold off
xlabel('Log unconditional probability')
ylabel('Frequency')
title('Histogram: Log Unconditional Probability')

The resulting histogram of log unconditional probabilities shows the lower outlier threshold as a dashed vertical line.

More About


The unconditional probability density of the predictors is the density's distribution marginalized over the classes.

In other words, the unconditional probability density is

$$P(X_1,\ldots,X_P) = \sum_{k=1}^{K} P(X_1,\ldots,X_P, y = k) = \sum_{k=1}^{K} P(X_1,\ldots,X_P \mid y = k)\,\pi(Y = k),$$

where π(Y = k) is the class prior probability. The conditional distribution of the data given the class, P(X1,...,XP|y = k), and the class prior probability distributions are training options (that is, you specify them when training the classifier).
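This marginalization can be sketched directly in MATLAB, assuming a model trained with normal conditional distributions as in the example above (the DistributionParameters property stores [mean; standard deviation] per class and predictor):

```matlab
% Sketch: recover the unconditional density of one observation by
% marginalizing the class-conditional densities over the classes.
load fisheriris
Mdl = fitcnb(meas,species);

x = meas(1,:);                        % one observation
K = numel(Mdl.ClassNames);
density = 0;
for k = 1:K
    condDensity = 1;                  % P(X1,...,XP | y = k), independence assumed
    for p = 1:numel(x)
        params = Mdl.DistributionParameters{k,p};   % [mean; std] for normal
        condDensity = condDensity * normpdf(x(p),params(1),params(2));
    end
    density = density + Mdl.Prior(k)*condDensity;   % weight by prior pi(Y = k)
end

% log(density) should agree with logp(Mdl,x) up to floating-point error.
```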

The prior probability of a class is the assumed relative frequency with which observations from that class occur in a population.
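For instance, you can override the default empirical prior with the 'Prior' name-value argument of fitcnb (the probabilities below are illustrative):

```matlab
% Assume a population in which setosa occurs twice as often as the
% other two species, rather than using the in-sample class frequencies.
load fisheriris
Mdl = fitcnb(meas,species, ...
    'ClassNames',{'setosa','versicolor','virginica'}, ...
    'Prior',[0.5 0.25 0.25]);
```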

Version History

Introduced in R2014b