Run a Custom Training Experiment for Image Comparison

This example shows how to create a custom training experiment to train a twin neural network that identifies similar images of handwritten characters. For a custom training experiment, you explicitly define the training procedure used by Experiment Manager. In this example, you implement a custom training loop to train a twin neural network, a type of deep learning network that uses two or more identical subnetworks that have the same architecture and share the same parameters and weights. Some common applications for twin neural networks include facial recognition, signature verification, and paraphrase identification.

This diagram illustrates the twin neural network architecture in this example.

To compare two images, you pass each image through one of two identical subnetworks that share weights. The subnetworks convert each 105-by-105-by-1 image to a 4096-dimensional feature vector. Images of the same class have similar 4096-dimensional representations. The output feature vectors from each subnetwork are combined through subtraction and the result is passed through a fullyconnect operation with a single output. A sigmoid operation converts this value to a probability indicating that the images are similar (when the probability is close to 1) or dissimilar (when the probability is close to 0). The binary cross-entropy loss between the network prediction and the true label updates the network during training. For more information, see Train a Twin Neural Network to Compare Images.
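For orientation, the following minimal sketch shows the comparison head described above, mirroring the forwardTwin helper defined later in this example. The variables F1, F2, fcWeights, and fcBias are placeholder assumptions standing in for the subnetwork feature vectors and the learnable parameters of the final fullyconnect operation.

% Hypothetical stand-ins for the 4096-element feature vectors produced by the
% two weight-sharing subnetworks, and for the comparison-head learnables.
F1 = dlarray(randn(4096,1,"single"),"CB");
F2 = dlarray(randn(4096,1,"single"),"CB");
fcWeights = dlarray(0.01*randn(1,4096));
fcBias = dlarray(0.01*randn(1,1));

Y = abs(F1 - F2);                        % combine the two feature vectors
Y = fullyconnect(Y,fcWeights,fcBias);    % reduce to a single value per pair
Y = sigmoid(Y);                          % probability that the pair is similar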

Open Experiment

First, open the example. Experiment Manager loads a project with a preconfigured experiment that you can inspect and run. To open the experiment, in the Experiment Browser pane, double-click ImageComparisonExperiment.

Custom training experiments consist of a description, a table of hyperparameters, and a training function. For more information, see Train Network Using Custom Training Loop and Display Visualization.

The Description field contains a textual description of the experiment. For this example, the description is:

Train a twin neural network to identify similar and dissimilar images of handwritten characters. Try different weight and bias initializers for the convolution and fully connected layers in the network.

The Hyperparameters section specifies the strategy and hyperparameter values to use for the experiment. When you run the experiment, Experiment Manager trains the network using every combination of hyperparameter values specified in the hyperparameter table. This example uses the hyperparameters WeightsInitializer and BiasInitializer to specify the weight and bias initializers, respectively, for the convolution and fully connected layers in each subnetwork. For more information about these initializers, see [WeightsInitializer](../ref/nnet.cnn.layer.convolution2dlayer.html#mw%5F2d97b6cd-f8aa-4fad-88d6-d34875484820%5Fsep%5Fmw%5Fbec5cf10-a8e8-4560-be5a-c4ccb6594b02) and [BiasInitializer](../ref/nnet.cnn.layer.convolution2dlayer.html#mw%5F2d97b6cd-f8aa-4fad-88d6-d34875484820%5Fsep%5Fmw%5F2a493962-3967-49a4-90df-d4afeec93fc0).
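For reference, this short sketch shows how the hyperparameter values reach the network layers. The specific initializer names are illustrative assumptions; the values Experiment Manager actually sweeps are the ones listed in the experiment's hyperparameter table.

% Illustrative values only (assumptions, not the experiment's table). Experiment
% Manager passes one combination per trial to the training function through the
% params structure.
params.WeightsInitializer = "glorot";        % for example, "glorot", "he", or "narrow-normal"
params.BiasInitializer = "narrow-normal";    % for example, "zeros", "ones", or "narrow-normal"

% Inside the training function, each layer receives the values directly:
layer = convolution2dLayer(10,64, ...
    WeightsInitializer=params.WeightsInitializer, ...
    BiasInitializer=params.BiasInitializer);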

The Training Function section specifies a function that defines the training data, network architecture, training options, and training procedure used by the experiment. To open this function in MATLAB® Editor, click Edit. The code for the function also appears in Training Function. The input to the training function is a structure with fields from the hyperparameter table and an experiments.Monitor object that you can use to track the progress of the training, record values of the metrics used by the training, and produce training plots. The function returns a structure that contains the trained network, the weights for the final fullyconnect operation for the network, and the execution environment used for training. Experiment Manager saves this output so you can export it to the MATLAB workspace when the training is complete. The training function has these sections:

  * Initialize Output
  * Load and Preprocess Training and Test Data
  * Define Network Architecture
  * Specify Training Options
  * Train Model
  * Display Pairs of Test Images

Run Experiment

When you run the experiment, Experiment Manager trains the network defined by the training function multiple times. Each trial uses a different combination of hyperparameter values. By default, Experiment Manager runs one trial at a time. If you have Parallel Computing Toolbox, you can run multiple trials at the same time or offload your experiment as a batch job in a cluster.

A table of results displays the training loss and validation accuracy for each trial.

To display the training plot and track the progress of each trial while the experiment is running, under Review Results, click Training Plot.

Evaluate Results

To find the best result for your experiment, sort the table of results by validation accuracy:

  1. Point to the ValidationAccuracy column.
  2. Click the triangle icon.
  3. Select Sort in Descending Order.

The trial with the highest validation accuracy appears at the top of the results table.

To visually check that the network correctly identifies similar and dissimilar pairs, select the top row in the results table and, under Review Results, click Test Images. Experiment Manager displays ten randomly selected pairs of test images with the prediction from the trained network, the probability score, and a label indicating whether the prediction is correct or incorrect.

To record observations about the results of your experiment, add an annotation:

  1. In the results table, right-click the ValidationAccuracy cell of the best trial.
  2. Select Add Annotation.
  3. In the Annotations pane, enter your observations in the text box.

Close Experiment

In the Experiment Browser pane, right-click OmniglotImageComparisonProject and select Close Project. Experiment Manager closes the experiment and results contained in the project.

Training Function

This function specifies the training data, network architecture, training options, and training procedure used by the experiment. The input to this function is a structure with fields from the hyperparameter table and an experiments.Monitor object that you can use to track the progress of the training, record values of the metrics used by the training, and produce training plots. The training function returns a structure that contains the trained network, the weights for the final fullyconnect operation for the network, and the execution environment used for training. Experiment Manager saves this output so you can export it to the MATLAB workspace when the training is complete.

function output = ImageComparisonExperiment_training(params,monitor)

Initialize Output

output.network = [];
output.weights = [];
output.executionEnvironment = "auto";

Load and Preprocess Training and Test Data

monitor.Status = "Loading Training Data";

url = "https://github.com/brendenlake/omniglot/raw/master/python/images_background.zip";
downloadFolder = tempdir;
filename = fullfile(downloadFolder,"images_background.zip");

dataFolderTrain = fullfile(downloadFolder,"images_background");
if ~exist(dataFolderTrain,"dir")
    websave(filename,url);
    unzip(filename,downloadFolder);
end

imdsTrain = imageDatastore(dataFolderTrain, ...
    IncludeSubfolders=true, ...
    LabelSource="none");

files = imdsTrain.Files;
parts = split(files,filesep);
labels = join(parts(:,(end-2):(end-1)),"_");
imdsTrain.Labels = categorical(labels);

monitor.Status = "Loading Test Data";

url = "https://github.com/brendenlake/omniglot/raw/master/python/images_evaluation.zip";
filename = fullfile(downloadFolder,"images_evaluation.zip");

dataFolderTest = fullfile(downloadFolder,"images_evaluation");
if ~exist(dataFolderTest,"dir")
    websave(filename,url);
    unzip(filename,downloadFolder);
end

imdsTest = imageDatastore(dataFolderTest, ...
    IncludeSubfolders=true, ...
    LabelSource="none");

files = imdsTest.Files;
parts = split(files,filesep);
labels = join(parts(:,(end-2):(end-1)),"_");
imdsTest.Labels = categorical(labels);

Define Network Architecture

monitor.Status = "Creating Network";

layers = [
    imageInputLayer([105 105 1],Normalization="none")
    convolution2dLayer(10,64, ...
        WeightsInitializer=params.WeightsInitializer, ...
        BiasInitializer=params.BiasInitializer)
    reluLayer()
    maxPooling2dLayer(2,Stride=2)
    convolution2dLayer(7,128, ...
        WeightsInitializer=params.WeightsInitializer, ...
        BiasInitializer=params.BiasInitializer)
    reluLayer()
    maxPooling2dLayer(2,Stride=2)
    convolution2dLayer(4,128, ...
        WeightsInitializer=params.WeightsInitializer, ...
        BiasInitializer=params.BiasInitializer)
    reluLayer()
    maxPooling2dLayer(2,Stride=2)
    convolution2dLayer(5,256, ...
        WeightsInitializer=params.WeightsInitializer, ...
        BiasInitializer=params.BiasInitializer)
    reluLayer()
    fullyConnectedLayer(4096, ...
        WeightsInitializer=params.WeightsInitializer, ...
        BiasInitializer=params.BiasInitializer)];

lgraph = layerGraph(layers);
net = dlnetwork(lgraph);

fcWeights = dlarray(0.01*randn(1,4096));
fcBias = dlarray(0.01*randn(1,1));
fcParams = struct( ...
    "FcWeights",fcWeights, ...
    "FcBias",fcBias);

output.network = net;
output.weights = fcParams;

Specify Training Options

numIterations = 1000;
miniBatchSize = 180;
validationFrequency = 100;
initialLearnRate = 6e-5;
gradientDecayFactor = 0.9;
squaredGradientDecayFactor = 0.99;
trailingAvgSubnet = [];
trailingAvgSqSubnet = [];
trailingAvgParams = [];
trailingAvgSqParams = [];

Train Model

monitor.Metrics = ["TrainingLoss" "ValidationAccuracy"];
monitor.XLabel = "Iteration";
monitor.Status = "Training";

for iteration = 1:numIterations
[X1,X2,pairLabels] = getTwinBatch(imdsTrain,miniBatchSize);

X1 = dlarray(single(X1),"SSCB");
X2 = dlarray(single(X2),"SSCB");

if (output.executionEnvironment == "auto" && canUseGPU) || ...
        output.executionEnvironment == "gpu"
    X1 = gpuArray(X1);
    X2 = gpuArray(X2);
end

[loss,gradientsSubnet,gradientsParams] = dlfeval(@modelLoss, ...
    net,fcParams,X1,X2,pairLabels);
lossValue = double(gather(extractdata(loss)));

[net,trailingAvgSubnet,trailingAvgSqSubnet] = ...
    adamupdate(net,gradientsSubnet, ...
    trailingAvgSubnet,trailingAvgSqSubnet, ...
    iteration,initialLearnRate,gradientDecayFactor,squaredGradientDecayFactor);

[fcParams,trailingAvgParams,trailingAvgSqParams] = ...
    adamupdate(fcParams,gradientsParams, ...
    trailingAvgParams,trailingAvgSqParams, ...
    iteration,initialLearnRate,gradientDecayFactor,squaredGradientDecayFactor);
 
if ~rem(iteration,validationFrequency) || iteration == 1 || iteration == numIterations
    monitor.Status = "Validating";
    accuracy = zeros(1,5);
    accuracyBatchSize = 150;

    for i = 1:5
        [XAcc1,XAcc2,pairLabelsAcc] = getTwinBatch(imdsTest,accuracyBatchSize);

        XAcc1 = dlarray(single(XAcc1),"SSCB");
        XAcc2 = dlarray(single(XAcc2),"SSCB");

        if (output.executionEnvironment == "auto" && canUseGPU) || ...
                output.executionEnvironment == "gpu"
            XAcc1 = gpuArray(XAcc1);
            XAcc2 = gpuArray(XAcc2);
        end

        Y = predictTwin(net,fcParams,XAcc1,XAcc2);
        Y = round(Y);

        accuracy(i) = sum(Y == pairLabelsAcc)/accuracyBatchSize;
    end

    recordMetrics(monitor,iteration, ...
        ValidationAccuracy=mean(accuracy)*100);
    monitor.Status = "Training";
end
 
output.network = net;
output.weights = fcParams;
recordMetrics(monitor,iteration, ...
    TrainingLoss=lossValue);
monitor.Progress = (iteration/numIterations)*100;

if monitor.Stop
    return;
end

end

Display Pairs of Test Images

testBatchSize = 10;

[XTest1,XTest2,pairLabelsTest] = getTwinBatch(imdsTest,testBatchSize);

XTest1 = dlarray(single(XTest1),"SSCB");
XTest2 = dlarray(single(XTest2),"SSCB");

if (output.executionEnvironment == "auto" && canUseGPU) || ...
        output.executionEnvironment == "gpu"
    XTest1 = gpuArray(XTest1);
    XTest2 = gpuArray(XTest2);
end

YScore = predictTwin(net,fcParams,XTest1,XTest2);
YScore = gather(extractdata(YScore));

YPred = round(YScore);

XTest1 = extractdata(XTest1);
XTest2 = extractdata(XTest2);

figure(Name="Test Images");
title(tiledlayout(2,5), ...
    "Comparison of Test Images")

for i = 1:numel(pairLabelsTest)
if pairLabelsTest(i) == YPred(i)
    titleStr = "Correct";
    titleColor = "#77AC30"; % dark green
else
    titleStr = "Incorrect";
    titleColor = "#FF0000"; % red
end

if YPred(i) == 1
    predStr = "Predicted: Similar";
else
    predStr = "Predicted: Dissimilar" ;
end

scoreStr = "Score: " + YScore(i);

nexttile        
imshow([XTest1(:,:,:,i) XTest2(:,:,:,i)]);
imageTitle = title(titleStr,[predStr,scoreStr]);
imageTitle.Color = titleColor;
xticks([])
yticks([])

end

Helper Functions

The modelLoss function takes as input the dlnetwork object net, the structure of fullyconnect parameters fcParams, a pair of mini-batch input data X1 and X2, and the label pairLabels indicating whether the images are similar or dissimilar. The function returns the binary cross-entropy loss between the prediction and the ground truth, as well as the gradients of the loss with respect to the learnable parameters of the network and of the fullyconnect operation.

function [loss,gradientsSubnet,gradientsParams] = modelLoss(net,fcParams,X1,X2,pairLabels)
Y = forwardTwin(net,fcParams,X1,X2);
loss = binarycrossentropy(Y,pairLabels);
[gradientsSubnet,gradientsParams] = dlgradient(loss,net.Learnables,fcParams);
end

The binarycrossentropy function returns the binary cross-entropy loss value for a prediction from the network.

function loss = binarycrossentropy(Y,pairLabels)
precision = underlyingType(Y);
Y(Y < eps(precision)) = eps(precision);
Y(Y > 1 - eps(precision)) = 1 - eps(precision);

loss = -pairLabels.*log(Y) - (1 - pairLabels).*log(1 - Y);
loss = sum(loss)/numel(pairLabels);
end

The forwardTwin function defines how the subnetworks and the fullyconnect and sigmoid operations combine to form the complete twin neural network. The function accepts the network structure and two training images and returns a prediction of the probability of the pair being similar (closer to 1) or dissimilar (closer to 0).

function Y = forwardTwin(net,fcParams,X1,X2)
F1 = forward(net,X1);
F1 = sigmoid(F1);

F2 = forward(net,X2);
F2 = sigmoid(F2);

Y = abs(F1 - F2);
Y = fullyconnect(Y,fcParams.FcWeights,fcParams.FcBias);
Y = sigmoid(Y);
end

The getTwinBatch function returns a randomly selected batch of paired images. On average, this function produces a balanced set of similar and dissimilar pairs.

function [X1,X2,pairLabels] = getTwinBatch(imds,miniBatchSize)
pairLabels = zeros(1,miniBatchSize);
imgSize = size(readimage(imds,1));
X1 = zeros([imgSize 1 miniBatchSize]);
X2 = zeros([imgSize 1 miniBatchSize]);
for i = 1:miniBatchSize
    choice = rand(1);
    if choice < 0.5
        [pairIdx1,pairIdx2,pairLabels(i)] = getSimilarPair(imds.Labels);
    else
        [pairIdx1,pairIdx2,pairLabels(i)] = getDissimilarPair(imds.Labels);
    end
    X1(:,:,:,i) = imds.readimage(pairIdx1);
    X2(:,:,:,i) = imds.readimage(pairIdx2);
end
end

The getSimilarPair function returns a random pair of indices for images that are in the same class and the similar pair label of 1.

function [pairIdx1,pairIdx2,label] = getSimilarPair(classLabel)
classes = unique(classLabel);
classChoice = randi(numel(classes));
idxs = find(classLabel==classes(classChoice));
pairIdxChoice = randperm(numel(idxs),2);
pairIdx1 = idxs(pairIdxChoice(1));
pairIdx2 = idxs(pairIdxChoice(2));
label = 1;
end

The getDissimilarPair function returns a random pair of indices for images that are in different classes and the dissimilar pair label of 0.

function [pairIdx1,pairIdx2,label] = getDissimilarPair(classLabel)
classes = unique(classLabel);
classesChoice = randperm(numel(classes),2);
idxs1 = find(classLabel==classes(classesChoice(1)));
idxs2 = find(classLabel==classes(classesChoice(2)));
pairIdx1Choice = randi(numel(idxs1));
pairIdx2Choice = randi(numel(idxs2));
pairIdx1 = idxs1(pairIdx1Choice);
pairIdx2 = idxs2(pairIdx2Choice);
label = 0;
end

The predictTwin function uses the trained network to make predictions about the similarity of two images.

function Y = predictTwin(net,fcParams,X1,X2)
F1 = predict(net,X1);
F1 = sigmoid(F1);

F2 = predict(net,X2);
F2 = sigmoid(F2);

Y = abs(F1 - F2);
Y = fullyconnect(Y,fcParams.FcWeights,fcParams.FcBias);
Y = sigmoid(Y);
end
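
After training completes, you can export a trial's output structure to the MATLAB workspace and reuse the trained network with the predictTwin function. This sketch is a usage example under stated assumptions: the exported variable name trainingOutput is hypothetical, and XNew1 and XNew2 stand for two new 105-by-105-by-1 grayscale images.

% Hypothetical usage of an exported result. The variable name trainingOutput is
% an assumption -- you choose the name when you export from Experiment Manager.
net = trainingOutput.network;
fcParams = trainingOutput.weights;

% XNew1 and XNew2 are assumed 105-by-105-by-1 images; format them as "SSCB" dlarray objects.
X1 = dlarray(single(XNew1),"SSCB");
X2 = dlarray(single(XNew2),"SSCB");

score = predictTwin(net,fcParams,X1,X2);   % probability that the pair is similar
score = gather(extractdata(score));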
