augmentedImageDatastore - Transform batches to augment image data - MATLAB

Transform batches to augment image data

Description

An augmented image datastore transforms batches of training, validation, test, and prediction data, with optional preprocessing such as resizing, rotation, and reflection. Resize images to make them compatible with the input size of your deep learning network. Augment training image data with randomized preprocessing operations to help prevent the network from overfitting and memorizing the exact details of the training images.

To train a network using augmented images, supply the augmentedImageDatastore to the trainnet function. For more information, see Preprocess Images for Deep Learning.

By default, an augmentedImageDatastore only resizes images to fit the output size. You can configure options for additional image transformations using an imageDataAugmenter.
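For example, this sketch creates a resize-only augmented datastore from a folder of images (the folder path is hypothetical):

```matlab
% Create an image datastore from a folder of labeled images (hypothetical path).
imds = imageDatastore("pathToImages", ...
    "IncludeSubfolders",true,"LabelSource","foldernames");

% With no DataAugmentation specified, the datastore only resizes
% each image to 224-by-224.
auimds = augmentedImageDatastore([224 224],imds);
```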

Creation

Syntax

Description

auimds = augmentedImageDatastore(outputSize,imds) creates an augmented image datastore for classification problems using images from image datastore imds. The datastore resizes images to the height and width specified by outputSize.

auimds = augmentedImageDatastore(outputSize,X,Y) creates an augmented image datastore for classification and regression problems. The array X contains the predictor variables and the array Y contains the categorical labels or numeric responses.

auimds = augmentedImageDatastore(outputSize,X) creates an augmented image datastore for predicting responses of image data in array X.

auimds = augmentedImageDatastore(outputSize,tbl) creates an augmented image datastore for classification and regression problems. The table, tbl, contains predictors and responses.

auimds = augmentedImageDatastore(outputSize,tbl,responseNames) creates an augmented image datastore for classification and regression problems. The table, tbl, contains predictors and responses. The responseNames argument specifies the response variables in tbl.
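As a sketch of the table syntax, assume a table whose first column holds image file paths and whose remaining columns hold numeric responses (the file names and variable names xCenter and yCenter are hypothetical):

```matlab
% Hypothetical table: first column is image paths, remaining columns
% are numeric responses.
tbl = table(["img1.png";"img2.png"],[0.4;0.6],[0.5;0.2], ...
    'VariableNames',{'imageFile','xCenter','yCenter'});

% Use only xCenter and yCenter as the response variables.
auimds = augmentedImageDatastore([64 64],tbl,["xCenter","yCenter"]);
```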

auimds = augmentedImageDatastore(___,Name=Value) also sets writable properties using name-value arguments. For example, augmentedImageDatastore([28,28],imds,OutputSizeMode="centercrop") creates an augmented image datastore that crops images from the center.


Input Arguments


Size of output images, specified as a vector of two positive integers. The first element specifies the height (number of rows) in the output images, and the second element specifies the width (number of columns).

The output images can have a third dimension that represents the color channels. However, if you specify outputSize as a three-element vector, then the datastore ignores the third element. Instead, the datastore determines the number of channels in the output images from the ColorPreprocessing option and the number of channels of the input images.

This argument sets the OutputSize property.

Images, specified as a 4-D numeric array. The first three dimensions are the height, width, and channels, and the last dimension indexes the individual images.

Data Types: single | double | uint8 | int8 | uint16 | int16 | uint32 | int32

Responses for classification or regression, specified as one of the following:

- For classification problems, a categorical vector of labels, where each element corresponds to one observation in X.
- For regression problems, a numeric matrix in which the rows correspond to observations and the columns correspond to responses.

Responses must not contain NaNs.

Data Types: categorical | double

Input data, specified as a table. tbl must contain the predictors in the first column as either absolute or relative image paths or images. The type and location of the responses depend on the problem:

- For classification problems, the second column contains the categorical labels.
- For regression problems, the remaining columns contain the numeric responses.

Responses must not contain NaN values. If there are NaN values in the predictor data, they propagate through training, but in most cases training fails to converge.

Data Types: table

Names of the response variables in the input table, specified as a character vector, cell array of character vectors, or string array.

Data Types: char | cell | string

Name-Value Arguments


Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Example: auimds = augmentedImageDatastore([28,28],imds,OutputSizeMode="centercrop") creates an augmented image datastore that crops images from the center.

Preprocessing color operations performed on input grayscale or RGB images, specified as "none", "gray2rgb", or "rgb2gray". When the image datastore contains a mixture of grayscale and RGB images, use ColorPreprocessing to ensure that all output images have the number of channels required by imageInputLayer.

Note

The augmentedImageDatastore object converts RGB images to grayscale by using the rgb2gray function. If an image has three channels that do not correspond to red, green, and blue channels (such as an image in the L*a*b* color space), then using ColorPreprocessing can give poor results.

The datastore does not perform color preprocessing when ColorPreprocessing is "none", the default value.

This argument sets the ColorPreprocessing property.
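For example, when a datastore mixes grayscale and RGB images and the network expects three-channel input, a sketch such as this (assuming an existing image datastore imds) converts every output image to RGB:

```matlab
% Ensure every output image has three channels for a network
% whose input layer expects RGB images.
auimds = augmentedImageDatastore([227 227],imds, ...
    "ColorPreprocessing","gray2rgb");
```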

Data Types: char | string

Preprocessing applied to input images, specified as an imageDataAugmenter object or "none". When DataAugmentation is "none", the datastore only resizes images to fit the output size, and does not perform additional preprocessing.

This argument sets the DataAugmentation property.

Dispatch observations in the background during training, prediction, or classification, specified as false or true. To use background dispatching, you must have Parallel Computing Toolbox™.

Augmented image datastores only perform background dispatching when used with the trainnet function, and inference functions such as predict and minibatchpredict. Background dispatching does not occur when you call the read function of the datastore directly.

This argument sets the DispatchInBackground property.

Method used to resize output images, specified as one of the following:

- "resize" — Scale the image to fit the output size.
- "centercrop" — Take a crop from the center of the image with size equal to the output size.
- "randcrop" — Take a random crop from the image with size equal to the output size.

This argument sets the OutputSizeMode property.

Data Types: char | string

Properties


Preprocessing color operations performed on input grayscale or RGB images, specified as "none", "gray2rgb", or "rgb2gray". When the image datastore contains a mixture of grayscale and RGB images, use ColorPreprocessing to ensure that all output images have the number of channels required by imageInputLayer.

Note

The augmentedImageDatastore object converts RGB images to grayscale by using the rgb2gray function. If an image has three channels that do not correspond to red, green, and blue channels (such as an image in the L*a*b* color space), then using ColorPreprocessing can give poor results.

The datastore does not perform color preprocessing when ColorPreprocessing is "none", the default value.

Data Types: char | string

Preprocessing applied to input images, specified as an imageDataAugmenter object or "none". When DataAugmentation is "none", the datastore only resizes images to fit the output size, and does not perform additional preprocessing.

Dispatch observations in the background during training, prediction, or classification, specified as false or true. To use background dispatching, you must have Parallel Computing Toolbox.

Augmented image datastores only perform background dispatching when used with the trainnet function, and inference functions such as predict and minibatchpredict. Background dispatching does not occur when you call the read function of the datastore directly.

Number of observations that are returned in each batch. You can change the value of MiniBatchSize only after you create the datastore.

Training and prediction functions that specify a mini-batch size, such as trainingOptions, minibatchpredict, and testnet, do not set the MiniBatchSize property. For best performance, use the same mini-batch size for your datastore as for your training and prediction functions.
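For example, a sketch that keeps the datastore and training mini-batch sizes aligned (assuming an existing datastore auimds):

```matlab
% Use the same mini-batch size for the datastore and the training options.
auimds.MiniBatchSize = 64;
opts = trainingOptions("sgdm","MiniBatchSize",64);
```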

This property is read-only.

Total number of observations in the augmented image datastore, returned as a positive integer. The number of observations is the length of one training epoch.

Size of output images, specified as a vector of two positive integers. The first element specifies the height (number of rows) in the output images, and the second element specifies the width (number of columns).

The OutputSize property does not indicate the number of color channels of the output images. When you read from the datastore, the output images can have a third dimension that represents the color channels.

Method used to resize output images, specified as one of the following:

- "resize" — Scale the image to fit the output size.
- "centercrop" — Take a crop from the center of the image with size equal to the output size.
- "randcrop" — Take a random crop from the image with size equal to the output size.

Data Types: char | string

Object Functions

combine Combine data from multiple datastores
hasdata Determine if data is available to read
numpartitions Number of datastore partitions
partition Partition a datastore
partitionByIndex Partition augmentedImageDatastore according to indices
preview Preview subset of data in datastore
read Read data from augmentedImageDatastore
readall Read all data in datastore
readByIndex Read data specified by index from augmentedImageDatastore
reset Reset datastore to initial state
shuffle Shuffle data in augmentedImageDatastore
subset Create subset of datastore or FileSet
transform Transform datastore
isPartitionable Determine whether datastore is partitionable
isShuffleable Determine whether datastore is shuffleable

Examples


Train a convolutional neural network using augmented image data. Data augmentation helps prevent the network from overfitting and memorizing the exact details of the training images.

Load the sample data, which consists of synthetic images of handwritten digits. XTrain is a 28-by-28-by-1-by-5000 array, where:

- 28 is the height and width of the images.
- 1 is the number of channels.
- 5000 is the number of synthetic images of handwritten digits.

labelsTrain is a categorical vector containing the labels for each observation.

Set aside 1000 of the images for network validation.

idx = randperm(size(XTrain,4),1000);
XValidation = XTrain(:,:,:,idx);
XTrain(:,:,:,idx) = [];
TValidation = labelsTrain(idx);
labelsTrain(idx) = [];

Create an imageDataAugmenter object that specifies preprocessing options for image augmentation, such as resizing, rotation, translation, and reflection. Randomly translate the images up to three pixels horizontally and vertically, and rotate the images with an angle up to 20 degrees.

imageAugmenter = imageDataAugmenter( ...
    'RandRotation',[-20,20], ...
    'RandXTranslation',[-3 3], ...
    'RandYTranslation',[-3 3])

imageAugmenter = imageDataAugmenter with properties:

       FillValue: 0
 RandXReflection: 0
 RandYReflection: 0
    RandRotation: [-20 20]
       RandScale: [1 1]
      RandXScale: [1 1]
      RandYScale: [1 1]
      RandXShear: [0 0]
      RandYShear: [0 0]
RandXTranslation: [-3 3]
RandYTranslation: [-3 3]

Create an augmentedImageDatastore object to use for network training and specify the image output size. During training, the datastore performs image augmentation and resizes the images. The datastore augments the images without saving any images to memory. trainnet updates the network parameters and then discards the augmented images.

imageSize = [28 28 1];
augimds = augmentedImageDatastore(imageSize,XTrain,labelsTrain,'DataAugmentation',imageAugmenter);

Specify the convolutional neural network architecture.

layers = [ imageInputLayer(imageSize)

convolution2dLayer(3,8,'Padding','same')
batchNormalizationLayer
reluLayer   

maxPooling2dLayer(2,'Stride',2)

convolution2dLayer(3,16,'Padding','same')
batchNormalizationLayer
reluLayer   

maxPooling2dLayer(2,'Stride',2)

convolution2dLayer(3,32,'Padding','same')
batchNormalizationLayer
reluLayer   

fullyConnectedLayer(10)
softmaxLayer];

Specify the training options. Choosing among the options requires empirical analysis. To explore different training option configurations by running experiments, you can use the Experiment Manager app.

opts = trainingOptions('sgdm', ...
    'MaxEpochs',15, ...
    'Shuffle','every-epoch', ...
    'Plots','training-progress', ...
    'Metrics','accuracy', ...
    'Verbose',false, ...
    'ValidationData',{XValidation,TValidation});

Train the neural network using the trainnet function. For classification, use cross-entropy loss. By default, the trainnet function uses a GPU if one is available. Training on a GPU requires a Parallel Computing Toolbox™ license and a supported GPU device. For information on supported devices, see GPU Computing Requirements (Parallel Computing Toolbox). Otherwise, the trainnet function uses the CPU. To specify the execution environment, use the ExecutionEnvironment training option.

net = trainnet(augimds,layers,"crossentropy",opts);

Version History

Introduced in R2018a