logp - Log unconditional probability density for naive Bayes classifier - MATLAB
Log unconditional probability density for naive Bayes classifier
Syntax

`lp = logp(Mdl,tbl)`
`lp = logp(Mdl,X)`

Description
`lp` = logp([Mdl](#bue3d16%5Fsep%5Fshared-Mdl),[tbl](#bue3d16%5Fsep%5Fshared-tbl)) returns the log unconditional probability density (`lp`) of the observations (rows) in `tbl` using the naive Bayes model `Mdl`. You can use `lp` to identify outliers in the training data.
`lp` = logp([Mdl](#bue3d16%5Fsep%5Fshared-Mdl),[X](#bue3d16%5Fsep%5Fshared-X)) returns the log unconditional probability density of the observations (rows) in `X` using the naive Bayes model `Mdl`.
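For example, a minimal call with table input might look like this (a sketch assuming `Mdl` was trained on a table and `tbl` contains the same predictor variables):

```matlab
lp = logp(Mdl,tbl);   % lp(i) is the log unconditional density of row i of tbl
```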
Examples
Compute the log unconditional probability densities of the in-sample observations of a naive Bayes classifier model.
Load the `fisheriris` data set. Create `X` as a numeric matrix that contains four measurements for 150 irises. Create `Y` as a cell array of character vectors that contains the corresponding iris species.
```matlab
load fisheriris
X = meas;
Y = species;
```
Train a naive Bayes classifier using the predictors `X` and class labels `Y`. A recommended practice is to specify the class names. `fitcnb` assumes that each predictor is conditionally and normally distributed.
```matlab
Mdl = fitcnb(X,Y,'ClassNames',{'setosa','versicolor','virginica'})
```

```
Mdl = 
  ClassificationNaiveBayes
              ResponseName: 'Y'
     CategoricalPredictors: []
                ClassNames: {'setosa'  'versicolor'  'virginica'}
            ScoreTransform: 'none'
           NumObservations: 150
         DistributionNames: {'normal'  'normal'  'normal'  'normal'}
    DistributionParameters: {3×4 cell}
```
`Mdl` is a trained `ClassificationNaiveBayes` classifier.
Compute the log unconditional probability densities of the in-sample observations.
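A minimal call for this step, using the syntax documented above:

```matlab
lp = logp(Mdl,X);   % log unconditional probability density of each row of X
```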
Identify the indices of observations that have very small or very large log unconditional probabilities (`ind`). Display the lower (`L`) and upper (`U`) thresholds used by the outlier detection method.

```matlab
[TF,L,U] = isoutlier(lp);
ind = find(TF)
L
U
```
Display the values of the outlier log unconditional probability densities.

```matlab
lp(ind)
```

```
ans = 4×1

   -7.8995
   -8.4765
   -6.9854
   -7.8969
```
All the outliers are smaller than the lower outlier detection threshold.
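You can confirm this directly with the variables from the previous steps (a quick check, assuming `lp`, `TF`, and `L` are in the workspace):

```matlab
all(lp(TF) < L)   % logical 1 when every outlier lies below the lower threshold
```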
Plot the unconditional probability densities.
```matlab
histogram(lp)
hold on
xline(L,'k--')
hold off
xlabel('Log unconditional probability')
ylabel('Frequency')
title('Histogram: Log Unconditional Probability')
```
More About
The unconditional probability density of the predictors is the joint density of the predictors marginalized over the classes.

In other words, the unconditional probability density is

$$P(X_1,\ldots,X_P) = \sum_{k=1}^{K} P(X_1,\ldots,X_P,\,y = k) = \sum_{k=1}^{K} P(X_1,\ldots,X_P \mid y = k)\,\pi(Y = k),$$

where π(Y = k) is the class prior probability. The conditional distribution of the data given the class, P(X₁,...,X_P | y = k), and the class prior probability distributions are training options (that is, you specify them when training the classifier).
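For a model whose predictors all use normal conditional distributions, the marginalization can be sketched by hand. This is an illustrative sketch only (`logp` is the supported way to compute the density); it assumes `Mdl` and `X` from the example above, and relies on the `DistributionParameters` and `Prior` properties of a trained `ClassificationNaiveBayes` model.

```matlab
% Sketch: log unconditional density of one observation by summing the
% class-conditional normal densities weighted by the class priors.
x = X(1,:);                        % one observation (1-by-P)
K = numel(Mdl.ClassNames);         % number of classes
p = 0;
for k = 1:K
    condDensity = 1;
    for j = 1:numel(x)
        mu    = Mdl.DistributionParameters{k,j}(1);   % class-conditional mean
        sigma = Mdl.DistributionParameters{k,j}(2);   % class-conditional std
        condDensity = condDensity * normpdf(x(j),mu,sigma);
    end
    p = p + Mdl.Prior(k) * condDensity;   % weight by class prior pi(Y = k)
end
lp1 = log(p);   % should agree with logp(Mdl,X(1,:))
```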
The prior probability of a class is the assumed relative frequency with which observations from that class occur in a population.
Version History
Introduced in R2014b