preprocess - Preprocess training and test images - MATLAB
Preprocess training and test images
Since R2021a
Syntax
Description
outputData = preprocess(detector,trainingData)
preprocesses the training data trainingData before using it to train the YOLO v3 object detector. The training images and the corresponding bounding boxes are stored in trainingData. The preprocess function performs these operations:
- Rescales the intensity values of the training images to the range [0, 1].
- Resizes the training images to one of the nearest network input sizes and updates the bounding box coordinate values for accurate training. The function preserves the original aspect ratio of the training data.
outputImg = preprocess(detector,img)
preprocesses the test images img for object detection using a YOLO v3 object detector. The preprocess function performs these operations:
- Rescales the intensity values of the test images to the range [0, 1].
- Resizes the test images to one of the nearest network input sizes and preserves the original aspect ratio of each test image.
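As a rough illustration of the intensity rescaling step, the sketch below maps a small uint8 array to the range [0, 1] by dividing by the maximum uint8 value. This is an assumption about one common way to do it; the page does not say how preprocess performs the rescaling internally, and the sample values are made up.

```matlab
% Hypothetical 2-by-2 grayscale image with uint8 intensities.
img = uint8([0 64; 128 255]);

% Assumed rescaling: convert to single and divide by the uint8 maximum.
% preprocess may use a different method internally.
scaled = single(img) / 255;

disp(class(scaled))
disp([min(scaled(:)) max(scaled(:))])
```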
[___,scaleInfo] = preprocess(___)
returns information on the scale factor applied for image resizing, in addition to any combination of arguments from previous syntaxes.
Examples
Load a pretrained YOLO v3 object detector.
detector = yolov3ObjectDetector('tiny-yolov3-coco');
Load the training dataset into the workspace. The training data is a cell array that contains three training images and the corresponding bounding box values and class labels.
load('trainingData.mat','trainingData');
Resize the training images to the nearest network input size and rescale the intensity values by using the preprocess function.
[outputData,info] = preprocess(detector,trainingData);
Display the output images and the scale information used for resizing the images.
figure
montage({outputData{1,1},outputData{2,1},outputData{3,1}},Size=[1 3])
title("Preprocessed Output Image")
ScaleX = [info{1,1}.ScaleX;info{2,1}.ScaleX;info{3,1}.ScaleX];
ScaleY = [info{1,1}.ScaleY;info{2,1}.ScaleY;info{3,1}.ScaleY];
table(ScaleX,ScaleY)
ans=3×2 table
ScaleX ScaleY
_________ _________
0.0072115 0.0072115
0.0072115 0.0072115
0.0072115 0.0072115
Display the input and the preprocessed image size and bounding box values.
bboxIn = cell2table(trainingData,'VariableNames',{'Images','Bounding Boxes','Labels'})
bboxIn=3×3 table
Images Bounding Boxes Labels
_________________ ________________________ ___________
{224×399×3 uint8} 220 136 35 28 {'vehicle'}
{224×399×3 uint8} 175 126 61 45 {'vehicle'}
{224×399×3 uint8} 108 120 45 33 {'vehicle'}
bboxOut = cell2table(outputData,'VariableNames',{'Images','Bounding Boxes','Labels'})
bboxOut=3×3 table
Images Bounding Boxes Labels
__________________ ________________________ ___________
{416×416×3 single} 229 232 36 29 {'vehicle'}
{416×416×3 single} 182 222 64 46 {'vehicle'}
{416×416×3 single} 112 215 47 35 {'vehicle'}
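The relationship between the input and output bounding boxes in the two tables above can be approximated with a letterbox-style resize: one uniform scale factor, followed by symmetric padding along the shorter dimension. That model is an assumption made for illustration, not documented behavior of preprocess. For the first training image it lands within about a pixel of the reported box [229 232 36 29].

```matlab
% Sizes and box taken from the first row of the tables above.
inSize  = [224 399];        % [rows columns] of the input image
outSize = [416 416];        % [rows columns] of the preprocessed image
bboxIn  = [220 136 35 28];  % [x y width height]

% Assumed letterbox model: uniform scale, then equal padding on both sides.
s       = min(outSize ./ inSize);   % scale that keeps the aspect ratio
resized = inSize * s;               % unrounded resized [rows columns]
pad     = (outSize - resized) / 2;  % [top left] padding in pixels

% Scale the box and shift it by the padding offset.
bboxEst = [bboxIn(1)*s + pad(2), bboxIn(2)*s + pad(1), bboxIn(3)*s, bboxIn(4)*s];
disp(bboxEst)
```

The remaining difference is presumably due to rounding and pixel-indexing conventions that this sketch does not try to reproduce.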
Load a pretrained YOLO v3 object detector.
detector = yolov3ObjectDetector('tiny-yolov3-coco');
Read a test image.
I = imread('highway.png');
Resize the test image to the network input size and rescale the intensity values by using the preprocess function.
[outputImg,scaleInfo] = preprocess(detector,I);
Display the output image and the scale information used for resizing the image.
ScaleX: 0.7692
ScaleY: 0.5769
PreprocessedImageSize: [416 416 3]
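One plausible reading of the values above is that ScaleX and ScaleY are ratios of the original image size to the preprocessed size: 416 × 0.7692 ≈ 320 columns and 416 × 0.5769 ≈ 240 rows. Under that assumption, which is inferred from the displayed numbers rather than stated on this page, a box expressed in preprocessed-image coordinates could be mapped back to the original image as sketched below. The box values are hypothetical.

```matlab
% Scale information copied from the example output above.
scaleInfo = struct('ScaleX',0.7692,'ScaleY',0.5769, ...
    'PreprocessedImageSize',[416 416 3]);

% Assumption: scale = original size / preprocessed size.
% Recover the implied original [rows columns].
origSize = round(scaleInfo.PreprocessedImageSize(1:2) .* ...
    [scaleInfo.ScaleY scaleInfo.ScaleX]);

% Hypothetical box in preprocessed-image coordinates, [x y width height].
bboxPre = [100 200 50 40];

% Map x and width with ScaleX, y and height with ScaleY.
bboxOrig = bboxPre .* [scaleInfo.ScaleX scaleInfo.ScaleY ...
    scaleInfo.ScaleX scaleInfo.ScaleY];

disp(origSize)
disp(bboxOrig)
```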
Input Arguments
Training data for the YOLO v3 object detector, specified as an _N_-by-3 cell array that contains the images, bounding boxes, and class labels. Each row is of the form [_images_ _bounding boxes_ _labels_]. _N_ is the number of training images. The bounding boxes for each image must be stored as a _K_-by-4 matrix whose rows have the form [_x_ _y_ _width_ _height_]. _K_ is the number of bounding boxes in the image.
Test images, specified as a numeric array of size _M_-by-_N_-by-_C_ or _M_-by-_N_-by-_C_-by-_T_. _M_ is the number of rows, _N_ is the number of columns, and _C_ is the number of color channels. The value of _C_ is 1 for grayscale images and 3 for RGB color images. _T_ is the number of test images in the array.
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
Output Arguments
Preprocessed training data, returned as an _N_-by-3 cell array.
Data Types: cell
Preprocessed test images, returned as a numeric array of size _P_-by-_Q_-by-_C_ or _P_-by-_Q_-by-_C_-by-_T_. _P_ and _Q_ are the number of rows and columns, respectively, in the preprocessed image.
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
Information about the scale factor for resizing the input images, returned as a structure with fields PreprocessedImageSize, ScaleX, and ScaleY.
- PreprocessedImageSize: Size of the output resized image.
- ScaleX: Scale factor for resizing an image in the _X_-direction (along the rows).
- ScaleY: Scale factor for resizing an image in the _Y_-direction (along the columns).
Data Types: struct
Version History
Introduced in R2021a