splitEachLabel

Split ImageDatastore labels by proportions

Syntax

[imds1,imds2] = splitEachLabel(imds,p)

[imds1,...,imdsM] = splitEachLabel(imds,p1,...,pN)

___ = splitEachLabel(___,'randomized')

___ = splitEachLabel(___,Name,Value)

Description

[imds1,imds2] = splitEachLabel(imds,p) splits the image files in imds into two new datastores, imds1 and imds2. The new datastore imds1 contains the first p files from each label and imds2 contains the remaining files from each label. p can be either a number between 0 and 1 indicating the fraction of the files from each label to assign to imds1, or an integer indicating the absolute number of files from each label to assign to imds1.

[imds1,...,imdsM] = splitEachLabel(imds,p1,...,pN) splits the datastore into N+1 new datastores. The first new datastore imds1 contains the first p1 files from each label, the next new datastore imds2 contains the next p2 files, and so on. If p1,...,pN represent numbers of files, then their sum must be no more than the number of files in the smallest label in the original datastore imds.

___ = splitEachLabel(___,'randomized') randomly assigns the specified proportion of files from each label to the new datastores.

___ = splitEachLabel(___,Name,Value) specifies the properties of the new datastores using one or more name-value pair arguments. For example, you can specify which labels to split with 'Include','labelname'.

Examples

Split Labels by Percentage

Create an ImageDatastore object and label each image according to the name of the folder it is in. The resulting label names are demos and imagesci.

imds = imageDatastore(fullfile(matlabroot, 'toolbox', 'matlab', {'demos','imagesci'}), ...
    'LabelSource', 'foldernames', 'FileExtensions', {'.jpg', '.png', '.tif'});

imds.Labels

ans =

 demos 
 demos 
 demos 
 demos 
 demos 
 demos 
 imagesci 
 imagesci 

Create two new datastores from the files in imds. The first datastore imds60 contains the first 60% of files with the demos label and the first 60% of files with the imagesci label. The second datastore imds40 contains the remaining 40% of files from each label. If the percentage applied to a label does not result in a whole number of files, splitEachLabel rounds to a whole number of files. In the output shown here, 60% of the six demos files (3.6) yields four files and 60% of the two imagesci files (1.2) yields one file.

[imds60,imds40] = splitEachLabel(imds,0.6)

imds60 =

ImageDatastore with properties:

  Files: {
         ' ...\matlab\toolbox\matlab\demos\cloudCombined.jpg';
         ' ...\matlab\toolbox\matlab\demos\example.tif';
         ' ...\matlab\toolbox\matlab\demos\landOcean.jpg'
          ... and 2 more
         }
 Labels: [demos; demos; demos ... and 2 more categorical]
ReadFcn: @readDatastoreImage

imds40 =

ImageDatastore with properties:

  Files: {
         ' ...\matlab\toolbox\matlab\demos\street1.jpg';
         ' ...\matlab\toolbox\matlab\demos\street2.jpg';
         ' ...\matlab\toolbox\matlab\imagesci\peppers.png'
         }
 Labels: [demos; demos; imagesci]
ReadFcn: @readDatastoreImage
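
To check how many files each label contributed to each output, one option is countEachLabel, which tabulates the labels in a datastore. The following is a minimal sketch that recreates the datastore from this example; the variable names tblAll, tbl60, and tbl40 are illustrative only.

```matlab
% Recreate the datastore used in this example.
imds = imageDatastore(fullfile(matlabroot, 'toolbox', 'matlab', {'demos','imagesci'}), ...
    'LabelSource', 'foldernames', 'FileExtensions', {'.jpg', '.png', '.tif'});

[imds60, imds40] = splitEachLabel(imds, 0.6);

% countEachLabel returns a table with the variables Label and Count.
tblAll = countEachLabel(imds)
tbl60  = countEachLabel(imds60)
tbl40  = countEachLabel(imds40)
```

For the eight files listed above, the counts in tbl60 and tbl40 should add up, label by label, to the counts in tblAll.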

Split Labels by Number of Files

Create an ImageDatastore object and label each image according to the name of the folder it is in. The resulting label names are demos and imagesci.

imds = imageDatastore(fullfile(matlabroot, 'toolbox', 'matlab', {'demos','imagesci'}), ...
    'LabelSource', 'foldernames', 'FileExtensions', {'.jpg', '.png', '.tif'});

imds.Labels

ans =

 demos 
 demos 
 demos 
 demos 
 demos 
 demos 
 imagesci 
 imagesci 

Create two new datastores from the files in imds. The first datastore imds1 contains the first file with the demos label and the first file with the imagesci label. The second datastore imds2 contains the remaining files from each label.

[imds1,imds2] = splitEachLabel(imds,1)

imds1 =

ImageDatastore with properties:

  Files: {
         ' ...\matlab\toolbox\matlab\demos\cloudCombined.jpg';
         ' ...\matlab\toolbox\matlab\imagesci\corn.tif'
         }
 Labels: [demos; imagesci]
ReadFcn: @readDatastoreImage

imds2 =

ImageDatastore with properties:

  Files: {
         ' ...\matlab\toolbox\matlab\demos\example.tif';
         ' ...\matlab\toolbox\matlab\demos\landOcean.jpg';
         ' ...\matlab\toolbox\matlab\demos\ngc6543a.jpg'
          ... and 3 more
         }
 Labels: [demos; demos; demos ... and 3 more categorical]
ReadFcn: @readDatastoreImage

Split Labels Several Ways by Percentage

Create an ImageDatastore object and label each image according to the name of the folder it is in. The resulting label names are demos and imagesci.

imds = imageDatastore(fullfile(matlabroot, 'toolbox', 'matlab', {'demos','imagesci'}), ...
    'LabelSource', 'foldernames', 'FileExtensions', {'.jpg', '.png', '.tif'});

imds.Labels

ans =

 demos 
 demos 
 demos 
 demos 
 demos 
 demos 
 imagesci 
 imagesci 

Create three new datastores from the files in imds. The first datastore imds60 contains the first 60% of files with the demos label and the first 60% of files with the imagesci label. The second datastore imds10 contains the next 10% of files from each label. The third datastore imds30 contains the remaining 30% of files from each label. If the percentage applied to a label does not result in a whole number of files, splitEachLabel rounds to a whole number of files. In the output shown here, 10% of the six demos files (0.6) yields one file and 10% of the two imagesci files (0.2) yields none.

[imds60, imds10, imds30] = splitEachLabel(imds,0.6,0.1)

imds60 =

ImageDatastore with properties:

  Files: {
         ' ...\matlab\toolbox\matlab\demos\cloudCombined.jpg';
         ' ...\matlab\toolbox\matlab\demos\example.tif';
         ' ...\matlab\toolbox\matlab\demos\landOcean.jpg'
          ... and 2 more
         }
 Labels: [demos; demos; demos ... and 2 more categorical]
ReadFcn: @readDatastoreImage

imds10 =

ImageDatastore with properties:

  Files: {
         ' ...\matlab\toolbox\matlab\demos\street1.jpg'
         }
 Labels: demos
ReadFcn: @readDatastoreImage

imds30 =

ImageDatastore with properties:

  Files: {
         ' ...\matlab\toolbox\matlab\demos\street2.jpg';
         ' ...\matlab\toolbox\matlab\imagesci\peppers.png'
         }
 Labels: [demos; imagesci]
ReadFcn: @readDatastoreImage

Split Labels Several Ways by Number of Files

Create an ImageDatastore object and label each image according to the name of the folder it is in. The resulting label names are demos and imagesci.

imds = imageDatastore(fullfile(matlabroot, 'toolbox', 'matlab', {'demos','imagesci'}), ...
    'LabelSource', 'foldernames', 'FileExtensions', {'.jpg', '.png', '.tif'});

imds.Labels

ans =

 demos 
 demos 
 demos 
 demos 
 demos 
 demos 
 imagesci 
 imagesci 

Create three new datastores from the files in imds. The first datastore imds1 contains the first file with the demos label and the first file with the imagesci label. The second datastore imds2 contains the next file from each label. The third datastore imds3 contains the remaining files from each label.

[imds1, imds2, imds3] = splitEachLabel(imds,1,1)

imds1 =

ImageDatastore with properties:

  Files: {
         ' ...\matlab\toolbox\matlab\demos\cloudCombined.jpg';
         ' ...\matlab\toolbox\matlab\imagesci\corn.tif'
         }
 Labels: [demos; imagesci]
ReadFcn: @readDatastoreImage

imds2 =

ImageDatastore with properties:

  Files: {
         ' ...\matlab\toolbox\matlab\demos\example.tif';
         ' ...\matlab\toolbox\matlab\imagesci\peppers.png'
         }
 Labels: [demos; imagesci]
ReadFcn: @readDatastoreImage

imds3 =

ImageDatastore with properties:

  Files: {
         ' ...\matlab\toolbox\matlab\demos\landOcean.jpg';
         ' ...\matlab\toolbox\matlab\demos\ngc6543a.jpg';
         ' ...\matlab\toolbox\matlab\demos\street1.jpg'
          ... and 1 more
         }
 Labels: [demos; demos; demos ... and 1 more categorical]
ReadFcn: @readDatastoreImage

Randomly Split Labels

Create an ImageDatastore object and label each image according to the name of the folder it is in. The resulting label names are demos and imagesci.

imds = imageDatastore(fullfile(matlabroot, 'toolbox', 'matlab', {'demos','imagesci'}), ...
    'LabelSource', 'foldernames', 'FileExtensions', {'.jpg', '.png', '.tif'});

imds.Labels

ans =

 demos 
 demos 
 demos 
 demos 
 demos 
 demos 
 imagesci 
 imagesci 

Create two new datastores from the files in imds by randomly drawing from each label. The first datastore imds1 contains one random file with the demos label and one random file with the imagesci label. The second datastore imds2 contains the remaining files from each label.

[imds1, imds2] = splitEachLabel(imds,1,'randomized')

imds1 =

ImageDatastore with properties:

  Files: {
         ' ...\matlab\toolbox\matlab\demos\street2.jpg';
         ' ...\matlab\toolbox\matlab\imagesci\corn.tif'
         }
 Labels: [demos; imagesci]
ReadFcn: @readDatastoreImage

imds2 =

ImageDatastore with properties:

  Files: {
         ' ...\matlab\toolbox\matlab\demos\cloudCombined.jpg';
         ' ...\matlab\toolbox\matlab\demos\example.tif';
         ' ...\matlab\toolbox\matlab\demos\landOcean.jpg'
          ... and 3 more
         }
 Labels: [demos; demos; demos ... and 3 more categorical]
ReadFcn: @readDatastoreImage
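
Because the 'randomized' option selects files at random, repeated calls can return different files. If a repeatable split is needed, one approach is to seed the random number generator before each call. The sketch below assumes that splitEachLabel draws from the global random number stream controlled by rng; the names imdsA1, imdsB1, and sameSplit are illustrative.

```matlab
imds = imageDatastore(fullfile(matlabroot, 'toolbox', 'matlab', {'demos','imagesci'}), ...
    'LabelSource', 'foldernames', 'FileExtensions', {'.jpg', '.png', '.tif'});

% Assumption: 'randomized' uses the global random stream, so resetting the
% seed before each call should reproduce the same selection.
rng(0);
[imdsA1, imdsA2] = splitEachLabel(imds, 1, 'randomized');

rng(0);
[imdsB1, imdsB2] = splitEachLabel(imds, 1, 'randomized');

% Compare the file lists chosen by the two calls.
sameSplit = isequal(imdsA1.Files, imdsB1.Files)
```

If sameSplit is false on your release, the assumption about the random stream does not hold and a different mechanism is needed to fix the split.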

Include and Exclude Specified Labels

Create an ImageDatastore object and label each image according to the name of the folder it is in. The resulting label names are demos and imagesci.

imds = imageDatastore(fullfile(matlabroot, 'toolbox', 'matlab', {'demos','imagesci'}), ...
    'LabelSource', 'foldernames', 'FileExtensions', {'.jpg', '.png', '.tif'});

imds.Labels

ans =

 demos 
 demos 
 demos 
 demos 
 demos 
 demos 
 imagesci 
 imagesci 

Create two new datastores from the files in imds, including only the files with the demos label. The first datastore imds60 contains the first 60% of files with the demos label and the second datastore imds40 contains the remaining 40% of files with the demos label.

[imds60, imds40] = splitEachLabel(imds,0.6,'Include','demos')

imds60 =

ImageDatastore with properties:

  Files: {
         ' ...\matlab\toolbox\matlab\demos\cloudCombined.jpg';
         ' ...\matlab\toolbox\matlab\demos\example.tif';
         ' ...\matlab\toolbox\matlab\demos\landOcean.jpg'
          ... and 1 more
         }
 Labels: [demos; demos; demos ... and 1 more categorical]
ReadFcn: @readDatastoreImage

imds40 =

ImageDatastore with properties:

  Files: {
         ' ...\matlab\toolbox\matlab\demos\street1.jpg';
         ' ...\matlab\toolbox\matlab\demos\street2.jpg'
         }
 Labels: [demos; demos]
ReadFcn: @readDatastoreImage

Equivalently, you can split only the demos label by excluding the imagesci label.

[imds60, imds40] = splitEachLabel(imds,0.6,'Exclude','imagesci')

imds60 =

ImageDatastore with properties:

  Files: {
         ' ...\matlab\toolbox\matlab\demos\cloudCombined.jpg';
         ' ...\matlab\toolbox\matlab\demos\example.tif';
         ' ...\matlab\toolbox\matlab\demos\landOcean.jpg'
          ... and 1 more
         }
 Labels: [demos; demos; demos ... and 1 more categorical]
ReadFcn: @readDatastoreImage

imds40 =

ImageDatastore with properties:

  Files: {
         ' ...\matlab\toolbox\matlab\demos\street1.jpg';
         ' ...\matlab\toolbox\matlab\demos\street2.jpg'
         }
 Labels: [demos; demos]
ReadFcn: @readDatastoreImage
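
As the outputs above show, files whose label is excluded appear in neither output datastore. A short sketch of how this could be checked follows; labelsKept and numDropped are illustrative names.

```matlab
imds = imageDatastore(fullfile(matlabroot, 'toolbox', 'matlab', {'demos','imagesci'}), ...
    'LabelSource', 'foldernames', 'FileExtensions', {'.jpg', '.png', '.tif'});

[imds60, imds40] = splitEachLabel(imds, 0.6, 'Exclude', 'imagesci');

% Labels that survive in either output datastore.
labelsKept = unique([imds60.Labels; imds40.Labels])

% Number of input files that were assigned to neither output.
numDropped = numel(imds.Files) - numel(imds60.Files) - numel(imds40.Files)
```

A nonzero numDropped is expected here, since the imagesci files are left out of both imds60 and imds40.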

Input Arguments

imds — Input datastore

ImageDatastore object

Input datastore, specified as an ImageDatastore object. To create an ImageDatastore from your image data, use the imageDatastore function.

p — Proportion of files to split

scalar in interval (0,1) | positive integer scalar

Proportion of files to split, specified as a scalar in the interval (0,1) or a positive integer scalar.

Data Types: double

p1,...,pN — List of proportions

scalars in interval (0,1) | positive integer scalars

List of proportions, specified as scalars in the interval (0,1) or positive integer scalars. If the proportions are in the interval (0,1), then they represent the percentage of the files from each label to assign to the output datastores. If the proportions are integers, then they indicate the absolute number of files from each label to assign to the output datastores. When the proportions represent percentages, their sum must be no more than 1. When the proportions represent numbers of files, there must be enough files associated with each label to satisfy each proportion.

Data Types: double
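
When the proportions are file counts, it can help to check them against the smallest label before splitting. The following is a minimal sketch under that idea, using countEachLabel to find the smallest label; the counts vector and the variable smallest are hypothetical names chosen for illustration.

```matlab
imds = imageDatastore(fullfile(matlabroot, 'toolbox', 'matlab', {'demos','imagesci'}), ...
    'LabelSource', 'foldernames', 'FileExtensions', {'.jpg', '.png', '.tif'});

% Hypothetical request: one file per label for each of the first two outputs.
counts = [1 1];

% Number of files in the label that has the fewest files.
tbl = countEachLabel(imds);
smallest = min(tbl.Count);

if sum(counts) <= smallest
    % Enough files in every label, so the split can proceed.
    [imds1, imds2, imds3] = splitEachLabel(imds, counts(1), counts(2));
else
    error('Requested %d files per label, but the smallest label has only %d.', ...
        sum(counts), smallest);
end
```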

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: [imds1 imds2] = splitEachLabel(imds,0.5,'Exclude','demos')

Include — Labels to include

categorical, logical, or numeric vector | cell array of character vectors | string array

Labels to include, specified as the comma-separated pair consisting of 'Include' and a vector, cell array, or string array of label names with the same type as the Labels property. Each name must match one of the labels in the Labels property of the datastore.

Data Types: char | cell | string

Exclude — Labels to exclude

categorical, logical, or numeric vector | cell array of character vectors | string array

Labels to exclude, specified as the comma-separated pair consisting of 'Exclude' and a vector, cell array, or string array of label names with the same type as the Labels property. Each name must match one of the labels in the Labels property of the datastore. This option cannot be used with the 'Include' option.

Data Types: char | cell | string

Output Arguments

imds1,imds2 — Output datastores

ImageDatastore objects

Output datastores, returned as ImageDatastore objects. imds1 contains the specified proportion of files from each label in imds, and imds2 contains the remaining files.

imds1,...,imdsM — List of output datastores

ImageDatastore objects

List of output datastores, returned as ImageDatastore objects. The number of elements in the list is one more than the number of listed proportions. Each of the new datastores contains the proportion of each label in imds defined by p1,...,pN. Any files left over are assigned to the Mth datastore.
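
Since leftover files go to the last output, the outputs together should account for every input file when no labels are excluded. A minimal sketch of that check, with illustrative variable names (imdsA, imdsB, imdsC, numIn, numOut):

```matlab
imds = imageDatastore(fullfile(matlabroot, 'toolbox', 'matlab', {'demos','imagesci'}), ...
    'LabelSource', 'foldernames', 'FileExtensions', {'.jpg', '.png', '.tif'});

% Two proportions produce three output datastores.
[imdsA, imdsB, imdsC] = splitEachLabel(imds, 0.6, 0.1);

% Compare the number of input files with the total across the outputs.
numIn  = numel(imds.Files);
numOut = [numel(imdsA.Files), numel(imdsB.Files), numel(imdsC.Files)]
allAccountedFor = (sum(numOut) == numIn)
```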

Extended Capabilities

Thread-Based Environment

Run code in the background using MATLAB® backgroundPool or accelerate code with Parallel Computing Toolbox™ ThreadPool.

This function fully supports thread-based environments. For more information, see Run MATLAB Functions in Thread-Based Environment.

Version History

Introduced in R2016a