yolov3ObjectDetector - Detect objects using YOLO v3 object detector - MATLAB
Detect objects using YOLO v3 object detector
Since R2021a
Description
The yolov3ObjectDetector
object creates a you only look once version 3 (YOLO v3) object detector for detecting objects in an image. Using this object, you can:
- Create a pretrained YOLO v3 object detector by using YOLO v3 deep learning networks trained on the COCO dataset.
- Create a custom YOLO v3 object detector by using any pretrained or untrained YOLO v3 deep learning network.
Creation
Syntax
Description
Pretrained YOLO v3 Object Detector
`detector` = yolov3ObjectDetector(name)
creates a pretrained YOLO v3 object detector by using YOLO v3 deep learning networks trained on the COCO dataset.
Note
To use the pretrained YOLO v3 deep learning networks trained on the COCO dataset, you must install the Computer Vision Toolbox™ Model for YOLO v3 Object Detection from the Add-On Explorer. For more information about installing add-ons, see Get and Manage Add-Ons. This function also requires Deep Learning Toolbox™.
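For example, this command creates a YOLO v3 object detector pretrained on the COCO dataset (assuming the add-on is installed):
detector = yolov3ObjectDetector("darknet53-coco");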
Custom YOLO v3 Object Detector
`detector` = yolov3ObjectDetector(name,classes,aboxes)
creates a pretrained YOLO v3 object detector and configures it to perform transfer learning using a specified set of object classes and anchor boxes. For optimal results, you must train the detector on new training images before performing detection.
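For example, this sketch configures the pretrained tiny-yolov3-coco network for transfer learning on two custom classes. The class names and anchor box sizes here are placeholders; choose values that suit your training data.
classes = ["vehicle" "pedestrian"];                         % hypothetical class names
aboxes = {[120 100; 90 80; 60 60]; [40 40; 30 25; 20 18]};  % placeholder anchor boxes, one cell per detection head
detector = yolov3ObjectDetector("tiny-yolov3-coco",classes,aboxes);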
`detector` = yolov3ObjectDetector(net,classes,aboxes)
creates an object detector by using the deep learning network net.
If net is a pretrained YOLO v3 deep learning network, the function creates a pretrained YOLO v3 object detector. In this case, classes and aboxes must be the values used to train the network.
If net is an untrained YOLO v3 deep learning network, the function creates a YOLO v3 object detector to use for training and inference. classes and aboxes specify the object classes and the anchor boxes, respectively, for training the YOLO v3 network.
You must train the detector on a training dataset before performing object detection. For information about how to train a YOLO v3 object detector, see the Preprocess Training Data and Train Model sections of the Object Detection Using YOLO v3 Deep Learning example.
`detector` = yolov3ObjectDetector(baseNet,classes,aboxes,DetectionNetworkSource=layer)
creates a YOLO v3 object detector by adding detection heads to a base network, baseNet.
The function adds detection heads to the specified feature extraction layers, layer, in the base network. To specify the names of the feature extraction layers, use the name-value argument DetectionNetworkSource=layer.
If baseNet is a pretrained deep learning network, the function creates a YOLO v3 object detector and configures it to perform transfer learning with the specified object classes and anchor boxes.
If baseNet is an untrained deep learning network, the function creates a YOLO v3 object detector and configures it for object detection. classes and aboxes specify the object classes and the anchor boxes, respectively, for training the YOLO v3 network.
You must train the detector on a training dataset before performing object detection.
`detector` = yolov3ObjectDetector(___,`Name=Value`)
specifies one or more options using name-value arguments in addition to any combination of input arguments from previous syntaxes. Use this syntax to:
- Modify the detection network sources in a YOLO v3 object detection network and train the network with different numbers of object classes, anchor boxes, or both. Specify the new detection network sources using the DetectionNetworkSource=layer name-value argument.
- Set the InputSize and ModelName properties of the object detector. For example, InputSize=[224 224 3] sets the size of the images used for training to [224 224 3].
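For example, this sketch sets both properties at object creation, using placeholder class names and anchor boxes:
classes = ["car" "truck"];
aboxes = {[150 127; 97 90; 68 67]; [38 42; 41 29; 31 23]};
detector = yolov3ObjectDetector("tiny-yolov3-coco",classes,aboxes, ...
    InputSize=[224 224 3],ModelName="customDetector");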
Input Arguments
Name of the pretrained YOLO v3 deep learning network, specified as one of these values:
- 'darknet53-coco' — A pretrained YOLO v3 deep learning network created using DarkNet-53 as the base network and trained on the COCO dataset.
- 'tiny-yolov3-coco' — A pretrained YOLO v3 deep learning network created using a small base network and trained on the COCO dataset.
Data Types: char | string
Names of object classes for training the detector, specified as a string vector, cell array of character vectors, or categorical vector. This argument sets the ClassNames property of the yolov3ObjectDetector object.
Data Types: char | string | categorical
Anchor boxes for training the detector, specified as an _N_-by-1 cell array. N is the number of output layers in the YOLO v3 deep learning network. Each cell contains an _M_-by-2 matrix, where M is the number of anchor boxes in that layer. Each cell can contain a different number of anchor boxes. Each row in the _M_-by-2 matrix denotes the size of an anchor box in the form [height width].
The first element in the cell array specifies the anchor boxes to associate with the first output layer, the second element in the cell array specifies the anchor boxes to associate with the second output layer, and so on. For accurate detection results, specify large anchor boxes for the first output layer and small anchor boxes for the last output layer. That is, the anchor box sizes must decrease for each output layer in the order in which the layers appear in the YOLO v3 deep learning network.
This argument sets the AnchorBoxes property of the yolov3ObjectDetector object.
Data Types: cell
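For example, a two-head network such as tiny-yolov3-coco takes a 2-by-1 cell array, with the larger anchor boxes assigned to the first output layer. These sizes are illustrative only:
aboxes = { ...
    [150 127; 97 90; 68 67]; ...   % [height width] anchor boxes for the first output layer
    [38 42; 41 29; 31 23]};        % smaller anchor boxes for the last output layer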
YOLO v3 deep learning network, specified as a dlnetwork (Deep Learning Toolbox) object. The input network can be either an untrained or a pretrained deep learning network.
Base network for creating the YOLO v3 deep learning network, specified as a dlnetwork (Deep Learning Toolbox) object. The network can be either an untrained or a pretrained deep learning network.
Names of the feature extraction layers in the base network, specified as a cell array of character vectors, or a string array. The function creates a YOLO v3 network by adding detection head layers to the output of the feature extraction layers in the base network.
In the pretrained YOLO v3 base network, you must connect the detection head layers to feature extraction layers at which the input image feature map has been downsampled by a factor of 4, 8, 16, or 32.
- Use the analyzeNetwork (Deep Learning Toolbox) function to display the pretrained YOLO v3 network architecture and obtain information about the spatial dimensions of the feature maps.
- Use the table below to identify the default feature extraction layers and to choose alternate feature extraction layers supported by the pretrained YOLO v3 networks. Choose feature extraction layers by empirical evaluation, based on detection accuracy, training speed, and the size of the objects to detect.
- For multi-scale feature extraction and higher detection accuracy, choose feature extraction layers with both higher and lower downsampling factors. Layers with a lower downsampling factor capture small objects, and layers with a higher downsampling factor provide more thorough feature refinement.
- To achieve higher training and inference speeds, choose feature extraction layers with a lower downsampling factor.
- The number of detection network sources you specify must equal the number of anchor box sets (cells) specified in aboxes.
| Backbone | Number of Detection Heads | Default Feature Extraction Layers (Downsampling Factor) | Examples of Alternate Feature Extraction Layers (Downsampling Factor) | Remarks |
| --- | --- | --- | --- | --- |
| tiny-yolov3-coco | 2 | leaky_relu_7 (32), leaky_relu_5 (16) | leaky_relu_6 (16), leaky_relu_3 (8) | Select any two feature extraction layers with different downsampling factors preceding the leaky_relu_7 layer to use as detection network sources. For example, specify the new detection network sources as layer = {'leaky_relu_6','leaky_relu_3'}. |
| darknet53-coco | 3 | add_23 (32), add_19 (16), add_11 (8) | add_22, add_21, add_20 (32); add_18, add_17, add_16, add_15, add_14, add_13, add_12 (16); add_10, add_9, add_8, add_7, add_6, add_5, add_4 (8); add_3, add_2 (4) | Select any three feature extraction layers with different downsampling factors preceding the add_23 layer to use as detection network sources. For example, specify the new detection network sources as layer = {'add_18','add_10','add_3'}. |
Data Types: char | string | cell
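For example, this sketch attaches the two detection heads of the tiny-yolov3-coco network to the alternate feature extraction layers listed in the table, assuming classes and aboxes have been defined as described above:
layer = ["leaky_relu_6" "leaky_relu_3"];
detector = yolov3ObjectDetector("tiny-yolov3-coco",classes,aboxes, ...
    DetectionNetworkSource=layer);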
Properties
This property is read-only.
YOLO v3 deep learning network to use for object detection, stored as a dlnetwork (Deep Learning Toolbox) object.
This property is read-only.
Set of anchor boxes, stored as an _N_-by-1 cell array. N is the number of output layers in the YOLO v3 deep learning network for which the anchor boxes are defined. Each element of the cell array is an _M_-by-2 matrix, where M is the number of anchor boxes. Each cell can contain a different number of anchor boxes. Each row in the _M_-by-2 matrix denotes the size of an anchor box in the form [height width]. The first element in the cell array specifies the anchor boxes for the first output layer, the second element specifies the anchor boxes for the second output layer, and so on.
You can set this property by using the input argument aboxes.
This property is read-only.
Names of object classes to detect, stored as a categorical vector. You can set this property by using the input argument classes.
This property is read-only.
Image size used for training, stored as a vector of the form [height width] or [height width channels]. To set this property, specify it at object creation.
For example, detector = yolov3ObjectDetector(net,classes,aboxes,InputSize=[224 224 3]).
Name for the object detector, stored as a character vector. To set this property, specify it at object creation.
For example, yolov3ObjectDetector(net,classes,aboxes,'ModelName','customDetector') sets the name of the object detector to 'customDetector'.
This property is read-only.
Bounding box format of the object detector, stored as "axis-aligned" or "rotated". When PredictedBoxType is "axis-aligned", the object detector trains and performs inference on only axis-aligned bounding boxes. When PredictedBoxType is "rotated", the object detector trains and performs inference on only rotated bounding boxes. Set this property when you create the object.
Object Functions
| Function | Description |
| --- | --- |
| detect | Detect objects using YOLO v3 object detector |
| preprocess | Preprocess training and test images |
| forward | Compute YOLO v3 deep learning network output for training |
| predict | Compute YOLO v3 deep learning network outputs for inference |
Examples
Specify the name of a pretrained YOLO v3 deep learning network.
name = 'tiny-yolov3-coco';
Create a YOLO v3 object detector by using the pretrained YOLO v3 network.
detector = yolov3ObjectDetector(name);
Display and inspect the properties of the YOLO v3 object detector.
yolov3ObjectDetector with properties:
Network: [1×1 dlnetwork]
AnchorBoxes: {2×1 cell}
ClassNames: [80×1 categorical]
InputSize: [416 416 3]
PredictedBoxType: 'axis-aligned'
ModelName: 'tiny-yolov3-coco'
Use analyzeNetwork
to display the YOLO v3 network architecture and get information about the network layers. The network has two detection heads attached to the feature extraction network.
analyzeNetwork(detector.Network)
Detect objects in an unknown image by using the pretrained YOLO v3 object detector.
img = imread('sherlock.jpg');
img = preprocess(detector,img);
img = im2single(img);
[bboxes,scores,labels] = detect(detector,img,'DetectionPreprocessing','none');
Display the detection results.
detectedImg = insertObjectAnnotation(img,'Rectangle',bboxes,labels);
figure
imshow(detectedImg)
This example shows how to create a custom YOLO v3 object detector by using a pretrained SqueezeNet as the base network.
Load a pretrained SqueezeNet network. The SqueezeNet network is a convolutional neural network that you can use as the base network for creating a YOLO v3 object detector.
net = imagePretrainedNetwork("squeezenet");
Inspect the architecture of the base network by using the analyzeNetwork (Deep Learning Toolbox) function.
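analyzeNetwork(net)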
Specify the anchor boxes and the classes to use to train the YOLO v3 network.
aboxes = {[150,127;97,90;68,67];[38,42;41,29;31,23]};
classes = ["Car","Truck"];
Select two feature extraction layers in the base network to serve as the sources for the detection subnetworks.
layer = ["fire9-concat","fire8-concat"];
Create a custom YOLO v3 object detector by adding detection heads to the feature extraction layers of the base network. Specify the model name, classes, and the anchor boxes.
detector = yolov3ObjectDetector(net,classes,aboxes, ...
    ModelName="Custom YOLO v3",DetectionNetworkSource=layer);
Inspect the architecture of the YOLO v3 deep learning network by using the analyzeNetwork (Deep Learning Toolbox) function.
analyzeNetwork(detector.Network)
Inspect the properties of the YOLO v3 object detector. You can now train the YOLO v3 object detector on a custom training dataset and perform object detection.
yolov3ObjectDetector with properties:
Network: [1×1 dlnetwork]
AnchorBoxes: {2×1 cell}
ClassNames: [2×1 categorical]
InputSize: [227 227 3]
PredictedBoxType: 'axis-aligned'
ModelName: 'Custom YOLO v3'
For information about how to train a YOLO v3 object detector, see the Object Detection Using YOLO v3 Deep Learning example.
Extended Capabilities
C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.
Usage notes and limitations:
- Only the detect method of the yolov3ObjectDetector is supported for code generation.
- The roi argument to the detect method must be a code generation constant (coder.const()) and a 1-by-4 vector.
- Only the Threshold, SelectStrongest, MinSize, and MaxSize name-value arguments of detect are supported.
- To create a yolov3ObjectDetector object for code generation, see Load Pretrained Networks for Code Generation (MATLAB Coder).
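For example, a minimal code generation entry-point sketch might look like this (a sketch only: the MAT-file name, ROI, and threshold are placeholders, and it assumes you have saved a trained yolov3ObjectDetector to the MAT-file as described in Load Pretrained Networks for Code Generation):
function [bboxes,scores,labels] = detectYOLOv3(I)
%#codegen
persistent detector
if isempty(detector)
    % Load the saved detector (hypothetical MAT-file name)
    detector = coder.loadDeepLearningNetwork("yolov3Detector.mat");
end
% roi must be a compile-time constant 1-by-4 vector
roi = coder.const([1 1 224 224]);
[bboxes,scores,labels] = detect(detector,I,roi,'Threshold',0.5);
end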
GPU Code Generation
Generate CUDA® code for NVIDIA® GPUs using GPU Coder™.
Usage notes and limitations:
- Only the detect method of the yolov3ObjectDetector is supported for code generation.
- The roi argument to the detect method must be a code generation constant (coder.const()) and a 1-by-4 vector.
- Only the Threshold, SelectStrongest, MinSize, and MaxSize name-value arguments are supported.
- The height, width, and channel size of the input image must be fixed.
- To create a yolov3ObjectDetector object for code generation, see Load Pretrained Networks for Code Generation (MATLAB Coder).
GPU Arrays
Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.
Usage notes and limitations:
- Only the detect method of the yolov3ObjectDetector is supported for GPU array based processing.
- GPU array inputs are not supported for rotated rectangle bounding boxes.
Version History
Introduced in R2021a
Starting in R2024b, a YOLO v3 object detector with the PredictedBoxType property set to "rotated" supports both CPU and GPU code generation.
Starting in R2024b, you can modify the detection network sources of a YOLO v3 object detection network when you perform transfer learning. You can specify the new detection network sources by using the name-value argument DetectionNetworkSource=layer.
- Specify new detection network sources for the tiny-yolov3-coco and darknet53-coco pretrained YOLO v3 object detectors by using the syntax detector = yolov3ObjectDetector(name,classes,aboxes,DetectionNetworkSource=layer).
- Specify new detection network sources for a YOLO v3 object detection network by using the syntax detector = yolov3ObjectDetector(net,classes,aboxes,DetectionNetworkSource=layer).
For example, this code modifies the detection network sources of the pretrained tiny-yolov3-coco network to leaky_relu_2 and leaky_relu_5.
name = "tiny-yolov3-coco";
layer = ["leaky_relu_2","leaky_relu_5"];
detector = yolov3ObjectDetector(name,classes,aboxes,DetectionNetworkSource=layer);
Starting in R2024a, DAGNetwork (Deep Learning Toolbox) objects are not recommended. If you specify a base network baseNet, use a dlnetwork (Deep Learning Toolbox) object instead.
There are no plans to remove support for DAGNetwork
objects.
The deep learning YOLO v3 object detector now supports rotated rectangle bounding boxes.