trainYOLOv2ObjectDetector - Train YOLO v2 object detector - MATLAB

Train YOLO v2 object detector

Syntax

Description

trainedDetector = trainYOLOv2ObjectDetector(trainingData,detector,options) returns an object detector trained using the you only look once version 2 (YOLO v2) network specified by detector. The options argument specifies training parameters for the detection network.

You can use this syntax for training an untrained detector or for fine-tuning a pretrained detector.

example

trainedDetector = trainYOLOv2ObjectDetector(trainingData,checkpoint,options) resumes training from the saved detector checkpoint.

You can use this syntax to add more training data and continue training, or to improve detector accuracy by increasing the maximum number of training iterations.

[trainedDetector,info] = trainYOLOv2ObjectDetector(___) also returns information on the training progress, such as the training accuracy and learning rate for each iteration.

___ = trainYOLOv2ObjectDetector(___,Name=Value) uses additional options specified by one or more name-value arguments and any of the previous inputs. For example, ExperimentMonitor=[] specifies not to track metrics with the Experiment Manager (Deep Learning Toolbox) app.

Examples


Load the training data for vehicle detection into the workspace.

data = load("vehicleTrainingData.mat");
trainingData = data.vehicleTrainingData;

Specify the directory in which the training samples are stored, and add the full path to the file names in the training data.

dataDir = fullfile(toolboxdir("vision"),"visiondata");
trainingData.imageFilename = fullfile(dataDir,trainingData.imageFilename);

Randomly shuffle the data for training.

rng(0)
shuffledIdx = randperm(height(trainingData));
trainingData = trainingData(shuffledIdx,:);

Create an imageDatastore using the files from the table.

imds = imageDatastore(trainingData.imageFilename);

Create a boxLabelDatastore using the label columns from the table.

blds = boxLabelDatastore(trainingData(:,2:end));

Combine the datastores.
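ds = combine(imds,blds);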

Specify the class names using the label columns from the table.

classes = trainingData.Properties.VariableNames(2:end);

Specify anchor boxes.

anchorBoxes = [8 8; 32 48; 40 24; 72 48];
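Alternatively, you can estimate anchor boxes from the labeled boxes in the training data instead of specifying them manually. A minimal sketch using the estimateAnchorBoxes function with the boxLabelDatastore created above:

numAnchors = 4;
[anchorBoxes,meanIoU] = estimateAnchorBoxes(blds,numAnchors);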

Load a preinitialized YOLO v2 object detection network.

load("yolov2VehicleDetectorNet.mat","net");

Create the YOLO v2 object detection network.

detector = yolov2ObjectDetector(net,classes,anchorBoxes)

detector = 
  yolov2ObjectDetector with properties:

                  Network: [1×1 dlnetwork]
                InputSize: [128 128 3]
        TrainingImageSize: [128 128]
              AnchorBoxes: [4×2 double]
               ClassNames: vehicle
    ReorganizeLayerSource: ''
              LossFactors: [5 1 1 1]
                ModelName: ''

Configure the network training options.

options = trainingOptions("sgdm", ...
    InitialLearnRate=0.001, ...
    Verbose=true, ...
    MiniBatchSize=16, ...
    MaxEpochs=30, ...
    Shuffle="never", ...
    VerboseFrequency=30, ...
    CheckpointPath=tempdir);

Train the YOLO v2 network.

[trainedDetector,info] = trainYOLOv2ObjectDetector(ds,detector,options);


Training a YOLO v2 Object Detector for the following object classes:

* vehicle

Training on single CPU.
|========================================================================================|
|  Epoch  |  Iteration  |  Time Elapsed  |  Mini-batch  |  Mini-batch  |  Base Learning  |
|         |             |   (hh:mm:ss)   |     RMSE     |     Loss     |      Rate       |
|========================================================================================|
|       1 |           1 |       00:00:01 |         7.13 |         50.8 |          0.0010 |
|       2 |          30 |       00:00:15 |         1.32 |          1.8 |          0.0010 |
|       4 |          60 |       00:00:32 |         0.93 |          0.9 |          0.0010 |
|       5 |          90 |       00:00:46 |         0.64 |          0.4 |          0.0010 |
|       7 |         120 |       00:01:00 |         0.58 |          0.3 |          0.0010 |
|       9 |         150 |       00:01:14 |         0.64 |          0.4 |          0.0010 |
|      10 |         180 |       00:01:27 |         0.46 |          0.2 |          0.0010 |
|      12 |         210 |       00:01:42 |         0.40 |          0.2 |          0.0010 |
|      14 |         240 |       00:01:58 |         0.58 |          0.3 |          0.0010 |
|      15 |         270 |       00:02:17 |         0.40 |          0.2 |          0.0010 |
|      17 |         300 |       00:02:35 |         0.37 |          0.1 |          0.0010 |
|      19 |         330 |       00:02:48 |         0.50 |          0.2 |          0.0010 |
|      20 |         360 |       00:03:03 |         0.37 |          0.1 |          0.0010 |
|      22 |         390 |       00:03:17 |         0.36 |          0.1 |          0.0010 |
|      24 |         420 |       00:03:30 |         0.43 |          0.2 |          0.0010 |
|      25 |         450 |       00:03:44 |         0.54 |          0.3 |          0.0010 |
|      27 |         480 |       00:04:00 |         0.54 |          0.3 |          0.0010 |
|      29 |         510 |       00:04:18 |         0.66 |          0.4 |          0.0010 |
|      30 |         540 |       00:04:34 |         0.38 |          0.1 |          0.0010 |
|========================================================================================|
Training finished: Max epochs completed.
Detector training complete.


Verify the training accuracy by inspecting the training loss for each iteration.

figure
plot(info.TrainingLoss)
grid on
xlabel("Number of Iterations")
ylabel("Training Loss for Each Iteration")

Figure: Training loss plotted for each iteration.

Read a test image into the workspace.

img = imread("detectcars.png");

Run the trained YOLO v2 object detector on the test image for vehicle detection.

[bboxes,scores] = detect(trainedDetector,img);
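To suppress low-confidence detections, you can also raise the detection threshold. A minimal sketch (Threshold is a name-value option of the detect method; the value 0.6 is illustrative):

[bboxes,scores] = detect(trainedDetector,img,Threshold=0.6);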

Display the detection results.

if ~isempty(bboxes)
    img = insertObjectAnnotation(img,"rectangle",bboxes,scores);
end
figure
imshow(img)

Figure: Test image with the detected vehicles annotated by bounding boxes and detection scores.

Input Arguments


Labeled ground truth images, specified as a datastore or a table.

Note

When the training data is specified using a table, the trainYOLOv2ObjectDetector function checks these conditions:
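For reference, a minimal sketch of the three-column format that a combined datastore returns on each read, using datastores built as in the example above (the variable names are illustrative):

imds = imageDatastore(trainingData.imageFilename);
blds = boxLabelDatastore(trainingData(:,2:end));
ds = combine(imds,blds);
data = read(ds); % 1-by-3 cell array: {image, M-by-4 [x y w h] boxes, M-by-1 categorical labels}
reset(ds)        % rewind the datastore before training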

Pretrained or untrained YOLO v2 object detector, specified as a yolov2ObjectDetector object. If detector is a pretrained detector, then you can continue training the detector with additional training data or perform more training iterations to improve detector accuracy.

Training options, specified as a TrainingOptionsSGDM, TrainingOptionsRMSProp, or TrainingOptionsADAM object returned by the trainingOptions (Deep Learning Toolbox) function. To specify the solver name and other options for network training, use the trainingOptions (Deep Learning Toolbox) function.

Note

The trainYOLOv2ObjectDetector function does not support these training options:

Saved detector checkpoint, specified as a yolov2ObjectDetector object. To periodically save a detector checkpoint during training, specify CheckpointPath. To control how frequently checkpoints are saved, see the CheckpointFrequency and CheckpointFrequencyUnit training options.

To load a checkpoint for a previously trained detector, load the MAT file from the checkpoint path. For example, if the CheckpointPath property of the object specified by options is "/checkpath", you can load a checkpoint MAT file by using this code.

data = load("/checkpath/yolov2_checkpoint__216__2018_11_16__13_34_30.mat");
checkpoint = data.detector;

The name of the MAT file includes the iteration number and timestamp of when the detector checkpoint was saved. The detector is saved in the detector variable of the file. Pass the loaded checkpoint back into the trainYOLOv2ObjectDetector function:

yoloDetector = trainYOLOv2ObjectDetector(trainingData,checkpoint,options);
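If the checkpoint folder contains several checkpoints, this sketch loads the most recent one (it assumes the default checkpoint file naming shown above and that CheckpointPath was set to tempdir):

files = dir(fullfile(tempdir,"yolov2_checkpoint__*.mat"));
[~,idx] = max([files.datenum]);      % newest checkpoint file
data = load(fullfile(files(idx).folder,files(idx).name));
checkpoint = data.detector;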

Name-Value Arguments


Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Example: ExperimentMonitor="none" specifies not to monitor the detector training.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: "ExperimentManager","none"

Detector training experiment monitoring, specified as an experiments.Monitor (Deep Learning Toolbox) object for use with the Experiment Manager (Deep Learning Toolbox) app. You can use this object to track the progress of training, update information fields in the training results table, record values of the metrics used by the training, and produce training plots. For an example using this app, see Train Object Detectors in Experiment Manager.
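For example, a minimal sketch of an Experiment Manager setup-style training function that forwards the monitor object to trainYOLOv2ObjectDetector (the function name and the ds, detector, and options inputs are illustrative):

function trainedDetector = trainDetectorForExperiment(ds,detector,options,monitor)
% monitor is the experiments.Monitor object that Experiment Manager
% passes to the training function.
trainedDetector = trainYOLOv2ObjectDetector(ds,detector,options, ...
    ExperimentMonitor=monitor);
end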

Information monitored during training:

Validation information when the training options input contains validation data:

Output Arguments


Trained YOLO v2 object detector, returned as a yolov2ObjectDetector object.

Training progress information, returned as a structure array with seven fields. Each field corresponds to a stage of training.

Each field is a numeric vector with one element per training iteration. Values that have not been calculated at a specific iteration are represented by NaN. The structure contains the ValidationLoss, ValidationAccuracy, ValidationRMSE, FinalValidationLoss, and FinalValidationRMSE fields only when options specifies validation data.
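For example, a short sketch of inspecting the returned metrics (TrainingLoss is the same field plotted in the example above):

finalLoss = info.TrainingLoss(end);          % loss at the last iteration
[minLoss,bestIter] = min(info.TrainingLoss); % iteration with the lowest training loss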

More About


By default, the trainYOLOv2ObjectDetector function preprocesses the training images by:

When you specify the training data by using a table, the trainYOLOv2ObjectDetector function also augments the input dataset by:

When you specify the training data by using a datastore, the trainYOLOv2ObjectDetector function does not perform data augmentation. Instead, you can augment the training data in the datastore by using the transform function, and then train the network with the augmented training data, as in the sketch below. For more information on how to apply augmentation while using datastores, see Preprocess Images for Deep Learning (Deep Learning Toolbox).
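For example, a minimal sketch of a horizontal-flip augmentation applied through transform. The helper name augmentData is illustrative, and the sketch assumes a combined datastore ds whose read returns a {image, boxes, labels} cell array, as in the example above:

function data = augmentData(data)
% Randomly flip the image and its box labels horizontally.
if rand > 0.5
    img = data{1};
    boxes = data{2};        % boxes are [x y w h] in pixel coordinates
    img = fliplr(img);
    boxes(:,1) = size(img,2) - boxes(:,1) - boxes(:,3) + 2;
    data{1} = img;
    data{2} = boxes;
end
end

Train with the augmented datastore:

augmentedDs = transform(ds,@augmentData);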

During training, the trainYOLOv2ObjectDetector function predicts refined bounding box locations by optimizing the mean squared error (MSE) loss between predicted bounding boxes and the ground truth. The loss function is defined as

$$
\begin{aligned}
& K_1\sum_{i=0}^{S^2}\sum_{j=0}^{B}\mathbb{1}_{ij}^{\mathrm{obj}}\left[\left(x_i-\hat{x}_i\right)^2+\left(y_i-\hat{y}_i\right)^2\right] \\
{}+{} & K_1\sum_{i=0}^{S^2}\sum_{j=0}^{B}\mathbb{1}_{ij}^{\mathrm{obj}}\left[\left(w_i-\hat{w}_i\right)^2+\left(h_i-\hat{h}_i\right)^2\right] \\
{}+{} & K_2\sum_{i=0}^{S^2}\sum_{j=0}^{B}\mathbb{1}_{ij}^{\mathrm{obj}}\left(C_i-\hat{C}_i\right)^2 \\
{}+{} & K_3\sum_{i=0}^{S^2}\sum_{j=0}^{B}\mathbb{1}_{ij}^{\mathrm{noobj}}\left(C_i-\hat{C}_i\right)^2 \\
{}+{} & K_4\sum_{i=0}^{S^2}\mathbb{1}_{i}^{\mathrm{obj}}\sum_{c\in\mathrm{classes}}\left(p_i(c)-\hat{p}_i(c)\right)^2
\end{aligned}
$$

where:

- $S^2$ is the number of grid cells in the output feature map, and $B$ is the number of bounding boxes predicted in each cell.
- $\mathbb{1}_{ij}^{\mathrm{obj}}$ is 1 if the $j$th bounding box predictor in cell $i$ is responsible for detecting an object, and 0 otherwise. $\mathbb{1}_{ij}^{\mathrm{noobj}}$ is its complement, and $\mathbb{1}_{i}^{\mathrm{obj}}$ is 1 if an object appears in cell $i$.
- $x_i$, $y_i$, $w_i$, and $h_i$ are the center coordinates, width, and height of the ground truth bounding box, and $\hat{x}_i$, $\hat{y}_i$, $\hat{w}_i$, and $\hat{h}_i$ are the corresponding predictions.
- $C_i$ and $\hat{C}_i$ are the ground truth and predicted confidence scores, and $p_i(c)$ and $\hat{p}_i(c)$ are the ground truth and predicted conditional class probabilities for class $c$.
- $K_1$, $K_2$, $K_3$, and $K_4$ are the weights specified by the LossFactors property of the yolov2ObjectDetector object.

The loss function can be split into three parts:

- Localization loss: the first two terms, weighted by $K_1$, penalize errors in the predicted bounding box centers and sizes.
- Confidence loss: the third and fourth terms penalize errors in the objectness score, weighted by $K_2$ for predictors assigned to objects and by $K_3$ for predictors not assigned to objects.
- Classification loss: the final term, weighted by $K_4$, penalizes errors in the predicted class probabilities for cells that contain an object.

Tips

References

[1] Redmon, J., S. K. Divvala, R. B. Girshick, and A. Farhadi. "You Only Look Once: Unified, Real-Time Object Detection." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 779–788. Las Vegas, NV: CVPR, 2016.

[2] Redmon, J., and A. Farhadi. "YOLO9000: Better, Faster, Stronger." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6517–6525. Honolulu, HI: CVPR, 2017.

Version History

Introduced in R2019a


Support for using MATLAB® Compiler™ will be removed in a future release.

The trainYOLOv2ObjectDetector function supports a new process to train a yolov2ObjectDetector object. The function now uses:

The TrainingImageSize name-value argument is no longer recommended. Instead, specify training image sizes for multiscale training by using the TrainingImageSize name-value argument of the yolov2ObjectDetector object.

Starting in R2024b, DAGNetwork (Deep Learning Toolbox) objects are not recommended. Instead, specify the network architecture using a yolov2ObjectDetector object.

There are no plans to remove support for DAGNetwork objects.

See Also

Apps

Functions

Objects

Topics