segmentObjects - Segment objects using SOLOv2 instance segmentation - MATLAB
Segment objects using SOLOv2 instance segmentation
Since R2023b
Syntax
Description
masks = segmentObjects(detector,I) segments objects within a single image or an array of images I using SOLOv2 instance segmentation, and returns the predicted object masks for the input image or images.
Note
This functionality requires Deep Learning Toolbox™ and the Computer Vision Toolbox™ Model for SOLOv2 Instance Segmentation. You can install the Computer Vision Toolbox Model for SOLOv2 Instance Segmentation from Add-On Explorer. For more information about installing add-ons, see Get and Manage Add-Ons.
[masks,labels] = segmentObjects(detector,I) also returns the labels assigned to the predicted object instance masks.
[masks,labels,scores] = segmentObjects(detector,I) also returns the prediction score for each predicted object instance mask.
dsResults = segmentObjects(detector,imds) segments objects within images in a datastore using SOLOv2 instance segmentation. The function returns a datastore with the instance segmentation results, including the instance masks, labels, prediction scores, and bounding boxes.
[___] = segmentObjects(___,Name=Value) specifies options using one or more name-value arguments, in addition to any combination of arguments from previous syntaxes. For example, Threshold=0.9 specifies the confidence threshold as 0.9.
Examples
Create a pretrained SOLOv2 instance segmentation network.
model = solov2("light-resnet18-coco");
Read a test image that includes objects that the network can detect, such as dogs, into the workspace.
I = imread("kobi.png");
Segment instances of objects in the image using the SOLOv2 instance segmentation model.
[masks,labels,scores] = segmentObjects(model,I);
Display the instance segmentation results. Overlay the detected object instance mask on the test image.
overlayedImage = insertObjectMask(I,masks);
imshow(overlayedImage)
Load a pretrained SOLOv2 instance segmentation network.
model = solov2("resnet50-coco");
Create a datastore of test images.
imageFiles = fullfile(toolboxdir("vision"),"visiondata","visionteam*.jpg");
dsTest = imageDatastore(imageFiles);
Segment instances of objects using the SOLOv2 instance segmentation model.
dsResults = segmentObjects(model,dsTest,Threshold=0.55);
Running SoloV2 network
- Processed 2 images.
For each test image, display the instance segmentation results. Overlay the detected object masks on the test image.
while(hasdata(dsResults))
    testImage = read(dsTest);
    results = read(dsResults);
    maskColors = lines(numel(results{2}));
    figure
    overlayedImage = insertObjectMask(testImage,results{1},Color=maskColors);
    imshow(overlayedImage)
end
Input Arguments
detector - SOLOv2 instance segmentation model, specified as a solov2 object.
I - Image or batch of images on which to perform instance segmentation, specified as one of these values.
Image Type | Data Format |
---|---|
Single grayscale image | 2-D matrix of size _H_-by-_W_. |
Single color image | 3-D array of size _H_-by-_W_-by-3. |
Batch of B grayscale or color images | 4-D array of size _H_-by-_W_-by-_C_-by-_B_. The number of color channels C is 1 for grayscale images and 3 for color images. |
The height H and width W of each image must be greater than or equal to the input height h and width w of the network.
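As a rough illustration of the batch format described above, the following sketch stacks two same-size color images along the fourth dimension. The images are synthetic placeholders, and `networkInputSize` is a hypothetical value chosen only to demonstrate the size check; it is not read from a real network.

```matlab
% Two synthetic same-size RGB images stand in for real test images.
I1 = zeros(480,640,3,"uint8");
I2 = 255*ones(480,640,3,"uint8");

% Stack along the fourth dimension to form an H-by-W-by-C-by-B batch.
batch = cat(4,I1,I2);
[H,W,C,B] = size(batch);
fprintf("Batch of %d images, each %d-by-%d with %d channels\n",B,H,W,C);

% Hypothetical network input size [h w], used only to show the constraint.
networkInputSize = [384 384];
sizeOK = H >= networkInputSize(1) && W >= networkInputSize(2);
```

An array built this way could then be passed as the I argument in place of a single image.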
imds - Datastore of images, specified as a datastore such as an ImageDatastore or CombinedDatastore object. If calling the datastore with the read function returns a cell array, then the image data must be in the first cell.
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Example: segmentObjects(detector,I,Threshold=0.9) specifies the confidence threshold as 0.9.
Options for All Image Formats
Threshold - Confidence threshold, specified as a numeric scalar in the range [0, 1]. The segmentObjects function filters out predictions with confidence scores less than the threshold value. Increase this value to reduce the number of false positives, at the possible expense of missing some true positives.
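The filtering behavior described above can be approximated after the fact on results that were obtained with a lower threshold. This is a minimal sketch using synthetic stand-ins for the masks, labels, and scores outputs; the values are made up for illustration, and the result may not match a rerun of segmentObjects with a higher Threshold exactly.

```matlab
% Synthetic stand-ins for the outputs of segmentObjects on one image.
masks = false(4,4,3);
masks(1:2,1:2,1) = true;
masks(3:4,3:4,2) = true;
masks(:,1,3) = true;
labels = categorical(["dog";"person";"dog"]);
scores = [0.92; 0.40; 0.61];

% Keep only predictions whose score meets a stricter threshold.
threshold = 0.6;
keep = scores >= threshold;
masksKept = masks(:,:,keep);
labelsKept = labels(keep);
scoresKept = scores(keep);
```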
MaskThreshold - Mask probability threshold, specified as a numeric scalar in the range [0, 1]. The mask probability threshold is the threshold value for the mask probabilities, determined by an output activation function, that separates object mask pixels from background pixels. If the threshold is too high, the function might incorrectly classify some foreground object pixels as background pixels, reducing the accuracy of the segmentation.
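To see why a high mask threshold can discard foreground pixels, consider binarizing a synthetic probability map at two thresholds. The map and the use of the >= comparison are assumptions made for illustration; the exact comparison used inside segmentObjects is not stated here.

```matlab
% Synthetic 5-by-5 mask probability map with a peak of 1 at the center.
[x,y] = meshgrid(-2:2);
prob = exp(-(x.^2 + y.^2)/2);

% Binarize at a low and a high threshold (>= is an assumption).
maskLow = prob >= 0.3;
maskHigh = prob >= 0.7;

fprintf("Foreground pixels: %d at 0.3, %d at 0.7\n",nnz(maskLow),nnz(maskHigh));
```

The higher threshold keeps only the most confident pixels, so the mask shrinks toward the center of the object.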
SelectStrongest - Option to select the strongest mask prediction for each segmented object instance using non-maximum suppression, specified as a numeric or logical 1 (true) or 0 (false).
- true: Return the strongest object mask prediction per object. The segmentObjects function selects these predictions by using non-maximum suppression to eliminate overlapping bounding boxes based on their confidence scores.
- false: Return all predictions. You can then create a custom operation to eliminate overlapping object masks.
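One possible custom operation for the false case is a greedy suppression based on mask intersection over union (IoU). The sketch below is not the method that segmentObjects uses internally; the masks, scores, and the `iouThreshold` value are hypothetical stand-ins for results returned with SelectStrongest=false.

```matlab
% Synthetic stand-ins for masks and scores of three predictions.
masks = false(6,6,3);
masks(1:4,1:4,1) = true;    % instance A
masks(1:4,2:5,2) = true;    % overlaps A heavily
masks(5:6,5:6,3) = true;    % separate instance
scores = [0.9; 0.8; 0.7];
iouThreshold = 0.5;         % hypothetical overlap limit

% Visit masks from highest to lowest score and keep a mask only if it
% does not overlap an already kept mask by more than the IoU limit.
[~,order] = sort(scores,"descend");
keep = false(numel(scores),1);
for i = order'
    candidate = masks(:,:,i);
    suppressed = false;
    for j = find(keep)'
        kept = masks(:,:,j);
        iou = nnz(candidate & kept) / nnz(candidate | kept);
        if iou > iouThreshold
            suppressed = true;
            break
        end
    end
    keep(i) = ~suppressed;
end
masksKept = masks(:,:,keep);
```

Greedy suppression is simple to reason about but costs on the order of M^2 mask comparisons, so it is best suited to a modest number of predictions per image.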
Acceleration - Network acceleration type to use for performance optimization, specified as one of these options:
- "auto": Automatically select optimizations suitable for the input network and environment.
- "mex": Compile and execute a MEX function. This option is available when using a GPU only. Using a GPU requires Parallel Computing Toolbox™ and a CUDA® enabled NVIDIA® GPU. If Parallel Computing Toolbox or a suitable GPU is not available, then the function returns an error. For information about the supported compute capabilities, see GPU Computing Requirements (Parallel Computing Toolbox).
- "none": Disable all acceleration.
Use network acceleration to improve performance when using the same instance segmentation network and segmentation parameters across multiple image inputs, at the expense of additional overhead on the initial function call, and a possible increase in memory usage.
Hardware resource on which to process images with the network, specified as one of the execution environment options in this table.
ExecutionEnvironment | Description |
---|---|
"auto" | Use a GPU if available. Otherwise, use the CPU. The use of a GPU requires Parallel Computing Toolbox and a CUDA enabled NVIDIA GPU. For information about the supported compute capabilities, see GPU Computing Requirements (Parallel Computing Toolbox). |
"gpu" | Use the GPU. If a suitable GPU is not available, the function returns an error message. Using a GPU requires Parallel Computing Toolbox and a CUDA enabled NVIDIA GPU. If Parallel Computing Toolbox or a suitable GPU is not available, then the function returns an error. For information about the supported compute capabilities, seeGPU Computing Requirements (Parallel Computing Toolbox). |
"cpu" | Use the CPU. |
MaxNumKernels - Maximum number of convolution kernels, specified as "auto" or a positive integer in the range [1, 3096]. This value sets the maximum number of convolution kernels, or filters, that the SOLOv2 network uses to perform a convolution operation for producing segmentation masks.
Specify the value of MaxNumKernels only when performing code generation. Otherwise, use the default value. The default value, "auto", sets the maximum number of kernels depending on the content of the image, based on the number of kernels with an acceptable confidence threshold.
Tip
Determine the optimal value of MaxNumKernels for your application by using the evaluateInstanceSegmentation function to evaluate the network performance at different MaxNumKernels values. Increase the value of MaxNumKernels to increase instance segmentation accuracy at the expense of slower inference speed.
Options for Datastore Inputs
MiniBatchSize - Number of observations returned in each batch, specified as a positive integer. If you set a higher MiniBatchSize, segmentation requires more memory, which can cause errors if your system does not have sufficient memory.
You can specify this argument only when you specify a batch of images, I, or a datastore of images, imds, as an input to the segmentObjects function.
WriteLocation - Location to store writable data, specified as a string scalar or character vector. The specified folder must have write permissions. If the folder already exists, the segmentObjects function creates a new folder and adds a suffix to the folder name with the next available number. The default write location is fullfile(pwd,"SegmentObjectResults"), where pwd is the current working directory.
You can specify this argument only when you specify a datastore of images, imds, as an input to the segmentObjects function.
Data Types: char | string
NamePrefix - Prefix added to written filenames, specified as a string scalar or character vector. The function names the output files _NamePrefiximageName.mat_, where _imageName_ is the name of the input image without its file extension.
You can specify this argument only when you specify a datastore of images, imds.
Data Types: char | string
Verbose - Visible progress display, specified as a numeric or logical 1 (true) or 0 (false).
You can specify this argument only when you specify a datastore of images, imds.
Output Arguments
masks - Object masks, returned as an _H_-by-_W_-by-_M_ logical array for a single image or a _B_-by-1 cell array for a batch of B images. H and W are the height and width, respectively, of the input image I, and M is the number of object masks predicted in the image. Each of the M channels contains the mask for a single predicted object instance.
For a batch of B images, each cell of the _B_-by-1 cell array contains an _H_-by-_W_-by-_M_ array of object masks for the corresponding image from the batch.
labels - Object labels, returned as an _M_-by-1 categorical vector for a single image or a _B_-by-1 cell array for a batch of B images. M is the number of predicted object instances in the input image I.
For a batch of B images, each cell of the _B_-by-1 cell array contains an _M_-by-1 categorical vector with the labels of the objects in the corresponding image from the batch.
scores - Prediction confidence scores, returned as an _M_-by-1 numeric vector for a single image or a _B_-by-1 cell array for a batch of B images. M is the number of predicted object instances in the input image I. A higher score indicates higher confidence in the object instance segmentation.
For a batch of B images, each cell of the _B_-by-1 cell array contains an _M_-by-1 numeric vector with the confidence scores for the object segmentation predictions in the corresponding image from the batch.
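Because the three outputs share the instance dimension M, they can be summarized together. The following sketch uses synthetic stand-ins with the documented shapes; the variable names `areas`, `classCounts`, and `summaryTable` are illustrative only.

```matlab
% Synthetic stand-ins with the documented shapes for one image (M = 3).
masks = false(4,4,3);
masks(1:2,1:2,1) = true;
masks(3:4,3:4,2) = true;
masks(:,1,3) = true;
labels = categorical(["dog";"person";"dog"]);
scores = [0.92; 0.40; 0.61];

% Pixel count of each instance mask, as an M-by-1 vector.
areas = squeeze(sum(masks,[1 2]));

% Number of instances per class, in the order given by categories(labels).
classCounts = countcats(labels);

% One row per predicted instance.
summaryTable = table(labels,scores,areas);
disp(summaryTable)
```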
dsResults - Predicted instance segmentation results, returned as a FileDatastore object. The function organizes the datastore so that calling the read and readall functions on it returns a cell array with three columns. This table describes the format of each cell in each column.
masks | labels | scores |
---|---|---|
Binary masks, returned as a logical array of size _H_-by-_W_-by-_M_, where M is the number of predicted object instances in the corresponding image. Each mask is the segmentation of one object instance in the image. | Object class names, returned as an _M_-by-1 categorical vector, where M is the number of predicted object instances in the corresponding image. All categorical data returned by the datastore contains the same categories. | Prediction scores, returned as an _M_-by-1 numeric vector, where M is the number of predicted object instances in the corresponding image. |
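The three-column layout can be unpacked row by row. This sketch uses a hand-built cell array named `allResults` as a stand-in for the output of readall(dsResults); it assumes one row per image, which follows from the table above but is not verified against a real datastore.

```matlab
% Stand-in for readall(dsResults) on two images: columns are
% {masks, labels, scores}. Both label vectors share the same categories.
classes = ["dog","person"];
allResults = {
    true(4,4,2), categorical(["dog";"dog"],classes), [0.9; 0.7]
    true(4,4,1), categorical("person",classes),      0.8
    };

numInstances = zeros(size(allResults,1),1);
for k = 1:size(allResults,1)
    % Unpack one row into the three documented fields.
    [masks,labels,scores] = allResults{k,:};
    numInstances(k) = size(masks,3);
    fprintf("Image %d: %d instances\n",k,numInstances(k));
end
```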
Extended Capabilities
Usage notes and limitations:
- For code generation, the segmentObjects function does not support the WriteLocation, NamePrefix, and Verbose name-value arguments.
- For code generation, the MiniBatchSize name-value argument must be a code generation constant (coder.const()).
- For code generation, the MaxNumKernels name-value argument must be specified as a non-default value.
Version History
Introduced in R2023b
For code generation, the segmentObjects function now requires you to specify the maximum number of convolution kernels to use for mask prediction using the new MaxNumKernels name-value argument.
segmentObjects now supports the generation of C code (requires MATLAB® Coder™) and optimized CUDA code (requires GPU Coder™).