segmentObjects - Segment objects using Mask R-CNN instance segmentation - MATLAB
Segment objects using Mask R-CNN instance segmentation
Since R2021b
Syntax
Description
masks = segmentObjects(detector,I)
detects object masks within a single image or an array of images, I, using a Mask R-CNN object detector.
[masks,labels] = segmentObjects(detector,I)
also returns the labels assigned to the detected objects.
[masks,labels,scores] = segmentObjects(detector,I)
also returns the detection score for each of the detected objects.
[masks,labels,scores,bboxes] = segmentObjects(detector,I)
also returns the locations of the segmented objects as bounding boxes, bboxes.
dsResults = segmentObjects(detector,imds)
performs instance segmentation of images in a datastore using a Mask R-CNN object detector. The function returns a datastore with the instance segmentation results, including the instance masks, labels, detection scores, and bounding boxes.
[___] = segmentObjects(___,Name=Value)
configures the segmentation using additional name-value arguments. For example, segmentObjects(detector,I,Threshold=0.9) specifies the detection threshold as 0.9.
Note
This function requires the Computer Vision Toolbox™ Model for Mask R-CNN Instance Segmentation, which you can install from the Add-On Explorer. For more information about installing add-ons, see Get and Manage Add-Ons. Running this function also requires Deep Learning Toolbox™.
Examples
Load a pretrained Mask R-CNN object detector.
detector = maskrcnn("resnet50-coco")
detector = maskrcnn with properties:
ModelName: 'maskrcnn'
ClassNames: {1×80 cell}
InputSize: [800 1200 3]
AnchorBoxes: [15×2 double]
Read a test image that includes objects that the network can detect, such as people.
I = imread("visionteam.jpg");
Segment instances of objects using the Mask R-CNN object detector.
[masks,labels,scores,boxes] = segmentObjects(detector,I,Threshold=0.95);
Overlay the detected object masks in blue on the test image. Display the bounding boxes in red and the object labels.
overlayedImage = insertObjectMask(I,masks);
imshow(overlayedImage)
showShape("rectangle",boxes,Label=labels,LineColor=[1 0 0])
Load a pretrained Mask R-CNN object detector.
detector = maskrcnn("resnet50-coco");
Create a datastore of test images.
imageFiles = fullfile(toolboxdir("vision"),"visiondata","visionteam*.jpg");
dsTest = imageDatastore(imageFiles);
Segment instances of objects using the Mask R-CNN object detector.
dsResults = segmentObjects(detector,dsTest);
Running Mask R-CNN network
- Processed 2 images.
For each test image, display the instance segmentation results. Overlay the detected object masks in blue on the test image. Display the bounding boxes in red and the object labels.
while hasdata(dsResults)
    testImage = read(dsTest);
    results = read(dsResults);
    overlayedImage = insertObjectMask(testImage,results{1});
    figure
    imshow(overlayedImage)
    showShape("rectangle",results{4},Label=results{2},LineColor=[1 0 0])
end
Input Arguments
Mask R-CNN object detector, specified as a maskrcnn object.
Image or batch of images to segment, specified as one of these values.
Image Type | Data Format
---|---
Single grayscale image | 2-D matrix of size _H_-by-_W_
Single color image | 3-D array of size _H_-by-_W_-by-3
Batch of _B_ grayscale or color images | 4-D array of size _H_-by-_W_-by-_C_-by-_B_. The number of color channels _C_ is 1 for grayscale images and 3 for color images.
The height _H_ and width _W_ of each image must be greater than or equal to the input height _h_ and width _w_ of the network.
Datastore of images, specified as a datastore such as an imageDatastore or a CombinedDatastore. If calling the datastore with the read function returns a cell array, then the image data must be in the first cell.
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Example: segmentObjects(detector,I,Threshold=0.9) specifies the detection threshold as 0.9.
Options for All Image Formats
Detection threshold, specified as a numeric scalar in the range [0, 1]. The Mask R-CNN object detector does not return detections with scores less than the threshold value. Increase this value to reduce false positives.
Maximum number of strongest region proposals, specified as a positive integer. Reduce this value to speed up processing time at the cost of detection accuracy. To use all region proposals, specify this value as Inf.
Select the strongest bounding box for each detected object, specified as a numeric or logical 1 (true) or 0 (false).
- true — Return the strongest bounding box per object. To select these boxes, the segmentObjects function calls the selectStrongestBboxMulticlass function, which uses nonmaximal suppression to eliminate overlapping bounding boxes based on their confidence scores.
- false — Return all detected bounding boxes. You can then create your own custom operation to eliminate overlapping bounding boxes.
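As a sketch of the second option, assuming the detector and test image from the Examples section, you can disable the built-in selection and then apply selectStrongestBboxMulticlass yourself, using the returned index to keep the masks consistent with the filtered boxes:

```matlab
% Sketch: return all raw detections, then apply nonmaximal
% suppression manually (assumes detector and I are already loaded).
[masks,labels,scores,bboxes] = segmentObjects(detector,I, ...
    SelectStrongest=false);

% Reproduce the default behavior with selectStrongestBboxMulticlass.
% The fourth output is the index of the kept detections, which lets
% you filter the mask channels to match.
[bboxes,scores,labels,idx] = selectStrongestBboxMulticlass( ...
    bboxes,scores,labels);
masks = masks(:,:,idx);
```

In place of selectStrongestBboxMulticlass, the same pattern supports any custom overlap-elimination step.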
Minimum size of a region containing an object, in pixels, specified as a two-element numeric vector of the form [_height_ _width_]. By default, MinSize is the smallest object that the trained detector can detect. Specify this argument to reduce the computation time.
Maximum size of a region containing an object, in pixels, specified as a two-element numeric vector of the form [_height_ _width_].
To reduce computation time, set this value to the known maximum region size for the objects being detected in the image. By default, MaxSize is set to the height and width of the input image, I.
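For example, assuming the detector and image from the Examples section, a call restricting detections to a known size range might look like this (the pixel values are illustrative, not defaults):

```matlab
% Sketch: only consider object regions between 100-by-100 and
% 400-by-400 pixels, which can reduce computation time.
masks = segmentObjects(detector,I,MinSize=[100 100],MaxSize=[400 400]);
```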
Hardware resource for processing images with a network, specified as "auto", "gpu", or "cpu".
ExecutionEnvironment | Description
---|---
"auto" | Use a GPU if available. Otherwise, use the CPU. The use of a GPU requires Parallel Computing Toolbox™ and a CUDA®-enabled NVIDIA® GPU. For information about the supported compute capabilities, see GPU Computing Requirements (Parallel Computing Toolbox).
"gpu" | Use the GPU. If a suitable GPU is not available, the function returns an error message.
"cpu" | Use the CPU.
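As a minimal sketch, assuming the detector and image from the Examples section, you can force CPU execution on a machine without a supported GPU or without Parallel Computing Toolbox:

```matlab
% Sketch: run segmentation on the CPU regardless of GPU availability.
masks = segmentObjects(detector,I,ExecutionEnvironment="cpu");
```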
Options for Datastore Inputs
Number of observations that are returned in each batch. The default value is equal to the ReadSize property of the datastore imds.
You can specify this argument only when you specify a batch of images, I, or a datastore of images, imds, as an input to the segmentObjects function.
Location to place writable data, specified as a string scalar or character vector. The specified folder must have write permissions. If the folder already exists, the segmentObjects function creates a new folder and adds a suffix to the folder name with the next available number. The default write location is fullfile(pwd,"SegmentObjectResults"), where pwd is the current working directory.
You can specify this argument only when you specify a datastore of images, imds.
Data Types: char | string
Prefix added to written filenames, specified as a string scalar or character vector. The files are named <NamePrefix>_<imageName>.mat, where imageName is the name of the input image without its extension.
You can specify this argument only when you specify a datastore of images, imds.
Data Types: char | string
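As a sketch combining the two datastore options above, assuming the detector and dsTest datastore from the Examples section (the folder name and prefix are illustrative), an image named visionteam1.jpg would be written as teamResults_visionteam1.mat:

```matlab
% Sketch: write segmentation results to a custom folder with a custom
% filename prefix instead of the defaults.
dsResults = segmentObjects(detector,dsTest, ...
    WriteLocation=fullfile(pwd,"myResults"), ...
    NamePrefix="teamResults");
```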
Enable progress display to screen, specified as a numeric or logical 1 (true) or 0 (false).
You can specify this argument only when you specify a datastore of images, imds.
Output Arguments
Object masks, returned as a logical array of size _H_-by-_W_-by-_M_. H and W are the height and width of the input image I. M is the number of objects detected in the image. Each of the M channels contains the mask for a single detected object.
When I represents a batch of B images, masks is returned as a _B_-by-1 cell array. Each element in the cell array contains the masks for the corresponding input image in the batch.
Object labels, returned as an _M_-by-1 categorical vector, where M is the number of detected objects in image I.
When I represents a batch of B images, labels is a _B_-by-1 cell array. Each element is an _M_-by-1 categorical vector with the labels of the objects in the corresponding image.
Detection confidence scores, returned as an _M_-by-1 numeric vector, where M is the number of detected objects in image I. A higher score indicates higher confidence in the detection.
When I represents a batch of B images, scores is a _B_-by-1 cell array. Each element is an _M_-by-1 numeric vector with the scores of the objects in the corresponding image.
Location of detected objects within the input image, returned as an _M_-by-4 matrix, where M is the number of detected objects in image I. Each row of bboxes contains a four-element vector of the form [x y width height]. This vector specifies the upper-left corner and size of the corresponding bounding box in pixels.
When I represents a batch of B images, bboxes is a _B_-by-1 cell array. Each element is an _M_-by-4 numeric matrix with the bounding boxes of the objects in the corresponding image.
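As a sketch of the batch output format described above, assuming the detector and image from the Examples section, stacking two copies of an image along the fourth dimension yields cell-array outputs that you index per image:

```matlab
% Sketch: segment a batch of two images; each output becomes a
% 2-by-1 cell array, one element per image in the batch.
batch = cat(4,I,I);                 % I is an H-by-W-by-3 image
[masks,labels,scores,bboxes] = segmentObjects(detector,batch);

masksFirst  = masks{1};             % H-by-W-by-M logical array, image 1
bboxesFirst = bboxes{1};            % M-by-4 matrix, image 1
```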
Predicted instance segmentation results, returned as a FileDatastore object. The datastore is set up so that calling the datastore with the read and readall functions returns a cell array with four columns. This table describes the format of each column.
data | boxes | labels | masks
---|---|---|---
RGB image that serves as a network input, returned as an _H_-by-_W_-by-3 numeric array. | Bounding boxes, returned as _M_-by-4 matrices, where M is the number of objects within the image. Each bounding box has the format [x y width height], where [x,y] represents the top-left coordinates of the bounding box. | Object class names, returned as an _M_-by-1 categorical vector. All categorical data returned by the datastore must contain the same categories. | Binary masks, returned as a logical array of size _H_-by-_W_-by-_M_. Each mask is the segmentation of one instance in the image.
Version History
Introduced in R2021b
The segmentObjects function now supports specifying test images as a datastore. New name-value arguments enable more options to configure the segmentation of images in a datastore. When you specify test images as a datastore, the function returns all instance segmentation results as a file datastore.