Deep Learning Metrics
Use metrics to assess the performance of your deep learning model during and after training.
To specify which metrics to use during training, set the Metrics option of the trainingOptions function. You can use this option only when you train a network using the trainnet function.
To plot the metrics during training, in the training options, specify Plots as "training-progress". If you specify the ValidationData training option, then the software also plots and records the metric values for the validation data. To output the metric values to the Command Window during training, in the training options, set Verbose to true.
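For example, these commands specify training options that track accuracy and F-score, plot the training progress, and print the metric values to the Command Window. This is a minimal sketch; the variable dsValidation is a placeholder for your own validation data.
% Track accuracy and F-score during training (dsValidation is a placeholder datastore).
options = trainingOptions("adam", ...
    Metrics=["accuracy","fscore"], ...
    ValidationData=dsValidation, ...
    Plots="training-progress", ...
    Verbose=true);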
You can also access the metrics after training using the TrainingHistory and ValidationHistory fields of the second output of the trainnet function.
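For example, assuming the placeholder training options above and placeholder training data dsTrain and network architecture layers, these commands train the network and then inspect the recorded metric values.
% The second output of trainnet records the metric values for each iteration.
[net,info] = trainnet(dsTrain,layers,"crossentropy",options);
trainHistory = info.TrainingHistory;    % metric values on the training data
valHistory = info.ValidationHistory;    % metric values on the validation data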
To specify which metrics to use when you test a neural network, use the metrics argument of the testnet function.
You can specify metrics using their built-in names, specified as string inputs to the trainingOptions or testnet functions. For example, use this command.
metricValues = testnet(net,data,["accuracy","fscore"]);
If you require greater customization, then you can use metric objects and functions to specify additional options.
- If the metric has an equivalent object, then you can create the metric object with additional properties and use the metric object as input to the trainingOptions and testnet functions.
- If the metric has an equivalent function, then you can specify that function as a function handle input to the trainingOptions and testnet functions.
For example, use these commands.
% Configure an accuracy metric object with additional properties.
customAccuracy = accuracyMetric(NumTopKClasses=5,AverageType="macro");
% Specify a masked cross-entropy metric as a function handle.
customCrossEntropy = @(Y,T) crossentropy(Y,T,Mask=customMask);
% Evaluate the network with the custom metrics and a built-in metric name.
metricValues = testnet(net,data,{customAccuracy,"fscore",customCrossEntropy});
If there is no object or function for the metric that you need for your task, then you can create a custom metric using a function or class. For more information, see Custom Metrics.
Classification Metrics
This table compares metrics for classification tasks. The equations include these variables:
- TP, FP, TN, FN — True positives, false positives, true negatives, and false negatives
- Y_i — Predicted class probabilities for observation i
- T_i — One-hot encoded target for observation i
- n — Number of observations
- N — Normalization factor
Deep Learning Classification Metrics
Name | Description | Use Case | Range | Equation | Built-in Name | Equivalent Object or Function
---|---|---|---|---|---|---
Accuracy | Proportion of correct predictions to the total number of observations | Provides a general measure of performance, but it can be misleading for imbalanced data sets. | 0 – 100. Perfect model: 100 | Accuracy = (TP + TN) / (TP + TN + FP + FN) | "accuracy" | accuracyMetric (object)
Precision, also known as positive predictive value (PPV) | Proportion of true positive predictions among all positive predictions | Focuses on minimizing false positives, making it useful in scenarios where false positives are costly, such as spam detection. | 0 – 1. Perfect model: 1 | Precision = TP / (TP + FP) | "precision" | precisionMetric (object)
Recall, also known as true positive rate (TPR) or sensitivity | Ability of the model to correctly identify all instances of a particular class | Focuses on minimizing false negatives, making it suitable for applications where false negatives are costly, such as medical diagnosis. | 0 – 1. Perfect model: 1 | Recall = TP / (TP + FN) | "recall" | recallMetric (object)
F_β-score | Harmonic mean of precision and recall | Balances precision and recall in a single metric. | 0 – 1. Perfect model: 1 | F_β = (1 + β²)TP / ((1 + β²)TP + β²FN + FP) | "fscore" | fScoreMetric (object)
Area under curve (AUC) | Ability of a model to distinguish between classes | Useful for comparing models and evaluating performance across different classification thresholds, but it can be difficult to interpret. | 0 – 1. Perfect model: 1 | A ROC curve shows the true positive rate (TPR) versus the false positive rate (FPR) for different thresholds of classification scores. The AUC corresponds to the integral of the curve (TPR values) with respect to the FPR values from zero to one. | "auc" | aucMetric (object)
Cross-entropy | Difference between the true and predicted distribution of class labels for single-label classification tasks | Directly related to the output of a model, but it can be difficult to interpret. Suitable for tasks where each observation is assigned exclusively to one class label. | ≥ 0. Perfect model: 0 | Crossentropy = −(1/N) ∑_{i=1}^{n} T_i ln(Y_i) | "crossentropy" | crossentropy with NormalizationFactor set to "all-elements" (which is then multiplied by the number of channels) and ClassificationMode set to "single-label" (function)
Binary cross-entropy | Difference between the true and predicted distribution of class labels for multilabel and binary classification tasks | Directly related to the output of a model, but it can be difficult to interpret. Suitable for binary classification tasks or tasks where each observation can be assigned to multiple class labels. | ≥ 0. Perfect model: 0 | Crossentropy = −(1/N) ∑_{i=1}^{n} (T_i ln(Y_i) + (1 − T_i) ln(1 − Y_i)) | "binary-crossentropy" | crossentropy with NormalizationFactor set to "all-elements" and ClassificationMode set to "multilabel" (function)
Index cross-entropy | Difference between the true and predicted distribution of class labels, specified as integer class indices, for single-label classification tasks | Directly related to the output of a model, and it can save memory when dealing with many classes, but it can be difficult to interpret. Suitable for tasks where each observation is exclusively assigned one class label. | ≥ 0. Perfect model: 0 | Crossentropy = −(1/N) ∑_{i=1}^{n} T̃_i ln(Ỹ_i), where T̃_i and Ỹ_i are the one-hot encoded targets and predictions, respectively | "indexcrossentropy" | indexcrossentropy with NormalizationFactor set to "target-included" (function)
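
For example, these commands evaluate a trained classification network with several of these metrics, mixing built-in names with a configured metric object. This is a sketch; the variable dsTest is a placeholder for your own test data, and the precisionMetric settings shown are one possible configuration.
% Evaluate the network with built-in metric names.
metricValues = testnet(net,dsTest,["accuracy","recall","auc"]);
% Use a metric object to control how precision is averaged across classes.
macroPrecision = precisionMetric(AverageType="macro");
metricValues = testnet(net,dsTest,{macroPrecision,"fscore"});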
Regression Metrics
This table compares metrics for regression tasks. The equations include these variables:
- Y_i — Predicted value of observation i
- T_i — True value of observation i
- n — Number of observations
- N — Normalization factor
Deep Learning Regression Metrics
Name | Description | Use Case | Range | Equation | Built-in Name | Equivalent Object or Function
---|---|---|---|---|---|---
Root mean squared error (RMSE) | Magnitude of the errors between the predicted and true values | A general measure of model performance, expressed in the same units as the data. It can be sensitive to outliers. | ≥ 0. Perfect model: 0 | RMSE = sqrt((1/N) ∑_{i=1}^{n} (Y_i − T_i)²) | "rmse" | rmseMetric (object)
Mean absolute percentage error (MAPE) | Percentage magnitude of the errors between the predicted and true values | Returns a percentage, making it an intuitive performance measure that is easy to compare across models, though it can perform poorly when target values are near zero. | ≥ 0. Perfect model: 0 | MAPE = (100/N) ∑_{i=1}^{n} abs((T_i − Y_i) / T_i) | "mape" | mapeMetric (object)
R², also known as the coefficient of determination | Measure of how well the predictions explain the variance in the true values | A unitless measure of performance that is easy to compare across different models and data sets. | ≤ 1. Perfect model: 1 | R² = 1 − ∑_{i=1}^{n} (Y_i − T_i)² / ∑_{i=1}^{n} (T_i − T̄)², where T̄ = (1/n) ∑_{i=1}^{n} T_i | "rsquared" | rSquaredMetric (object)
Mean absolute error (MAE), also known as L1 loss | Magnitude of the errors between the predicted and true values | Provides an understanding of the average error. It is robust to outliers and expressed in the same units as the data. | ≥ 0. Perfect model: 0 | MAE = (1/N) ∑_{i=1}^{n} abs(Y_i − T_i) | "mae" / "mean-absolute-error" / "l1loss" | l1loss with NormalizationFactor set to "all-elements" (function)
Mean squared error (MSE), also known as L2 loss | Squared difference between the predicted and true values | A general measure of model performance that penalizes outliers more, making it suitable for applications where outliers are costly. | ≥ 0. Perfect model: 0 | MSE = (1/N) ∑_{i=1}^{n} (Y_i − T_i)² | "mse" / "mean-squared-error" / "l2loss" | l2loss with NormalizationFactor set to "all-elements" (function)
Huber | Combination of MSE and MAE | Balances sensitivity to outliers with robust error measurement, making it suitable for data sets with some outliers. | ≥ 0. Perfect model: 0 | Huber_i = (1/2)(Y_i − T_i)² if abs(Y_i − T_i) ≤ 1, and abs(Y_i − T_i) − 1/2 otherwise; Huber = (1/N) ∑_{i=1}^{n} Huber_i | "huber" | huber with NormalizationFactor set to "all-elements" (function)
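
For example, these commands evaluate a trained regression network with several of these metrics and track RMSE during training. This is a sketch; the variables dsTest, dsTrain, and layers are placeholders for your own data and network architecture.
% Evaluate a regression network with built-in metric names.
metricValues = testnet(net,dsTest,["rmse","mae","rsquared"]);
% Track RMSE during training and plot the progress.
options = trainingOptions("adam", ...
    Metrics="rmse", ...
    Plots="training-progress");
net = trainnet(dsTrain,layers,"mse",options);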
Custom Metrics
If Deep Learning Toolbox™ does not provide the metric that you need for your task, then in many cases you can create a custom metric using a function. After you define the metric function, you can specify the metric as the Metrics name-value argument in the trainingOptions function. For more information, see Define Custom Metric Function.
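For example, this sketch defines a hypothetical custom metric function, maxAbsoluteError, that returns the largest absolute error between the predictions and targets, and then specifies it during training. The function name and implementation are illustrative assumptions; a custom metric function takes the predictions Y and targets T (typically dlarray objects) and returns a scalar metric value.
% Hypothetical custom metric function: largest absolute error.
function metric = maxAbsoluteError(Y,T)
    % Y and T are the predictions and targets; return a scalar metric value.
    metric = max(abs(Y - T),[],"all");
end
Then specify the function handle in the training options.
options = trainingOptions("adam",Metrics=@maxAbsoluteError);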
Early stopping and returning the best network are not supported for custom metric functions. If you require early stopping or returning the best network, then you must create a custom metric object instead. For more information, see Define Custom Deep Learning Metric Object.
See Also
trainnet | testnet | trainingOptions | dlnetwork | accuracyMetric | aucMetric | fScoreMetric | precisionMetric | recallMetric | rmseMetric | mapeMetric | rSquaredMetric