splitEachLabel
Split ImageDatastore labels by proportions
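Syntax
[imds1,imds2] = splitEachLabel(imds,p)
[imds1,...,imdsM] = splitEachLabel(imds,p1,...,pN)
___ = splitEachLabel(___,'randomized')
___ = splitEachLabel(___,Name,Value)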
Description
[imds1,imds2] = splitEachLabel(imds,p)
splits the image files in imds into two new datastores, imds1 and imds2. The new datastore imds1 contains the first p files from each label, and imds2 contains the remaining files from each label. p can be either a number between 0 and 1 indicating the fraction of the files from each label to assign to imds1 (for example, 0.6 for 60%), or an integer indicating the absolute number of files from each label to assign to imds1.
[imds1,...,imdsM] = splitEachLabel(imds,p1,...,pN)
splits the datastore into N+1 new datastores. The first new datastore imds1 contains the first p1 files from each label, the next new datastore imds2 contains the next p2 files, and so on. If p1,...,pN represent numbers of files, then their sum must be no more than the number of files in the smallest label in the original datastore imds.
___ = splitEachLabel(___,'randomized')
randomly assigns the specified proportion of files from each label to the new datastores.
___ = splitEachLabel(___,Name,Value)
specifies the properties of the new datastores using one or more name-value pair arguments. For example, you can specify which labels to split with 'Include','labelname'.
Examples
Split Labels by Percentage
Create an ImageDatastore object and label each image according to the name of the folder it is in. The resulting label names are demos and imagesci.
imds = imageDatastore(fullfile(matlabroot, 'toolbox', 'matlab', {'demos','imagesci'}), ...
    'LabelSource', 'foldernames', 'FileExtensions', {'.jpg', '.png', '.tif'});
imds.Labels
ans =
demos
demos
demos
demos
demos
demos
imagesci
imagesci
Create two new datastores from the files in imds. The first datastore imds60 contains the first 60% of files with the demos label and the first 60% of files with the imagesci label. The second datastore imds40 contains the remaining 40% of files from each label. If the percentage applied to a label does not result in a whole number of files, splitEachLabel rounds down to the nearest whole number.
[imds60,imds40] = splitEachLabel(imds,0.6)
imds60 =
ImageDatastore with properties:
Files: {
' ...\matlab\toolbox\matlab\demos\cloudCombined.jpg';
' ...\matlab\toolbox\matlab\demos\example.tif';
' ...\matlab\toolbox\matlab\demos\landOcean.jpg'
... and 2 more
}
Labels: [demos; demos; demos ... and 2 more categorical]
ReadFcn: @readDatastoreImage
imds40 =
ImageDatastore with properties:
Files: {
' ...\matlab\toolbox\matlab\demos\street1.jpg';
' ...\matlab\toolbox\matlab\demos\street2.jpg';
' ...\matlab\toolbox\matlab\imagesci\peppers.png'
}
Labels: [demos; demos; imagesci]
ReadFcn: @readDatastoreImage
Split Labels by Number of Files
Create an ImageDatastore object and label each image according to the name of the folder it is in. The resulting label names are demos and imagesci.
imds = imageDatastore(fullfile(matlabroot, 'toolbox', 'matlab', {'demos','imagesci'}), ...
    'LabelSource', 'foldernames', 'FileExtensions', {'.jpg', '.png', '.tif'});
imds.Labels
ans =
demos
demos
demos
demos
demos
demos
imagesci
imagesci
Create two new datastores from the files in imds. The first datastore imds1 contains the first file with the demos label and the first file with the imagesci label. The second datastore imds2 contains the remaining files from each label.
[imds1,imds2] = splitEachLabel(imds,1)
imds1 =
ImageDatastore with properties:
Files: {
' ...\matlab\toolbox\matlab\demos\cloudCombined.jpg';
' ...\matlab\toolbox\matlab\imagesci\corn.tif'
}
Labels: [demos; imagesci]
ReadFcn: @readDatastoreImage
imds2 =
ImageDatastore with properties:
Files: {
' ...\matlab\toolbox\matlab\demos\example.tif';
' ...\matlab\toolbox\matlab\demos\landOcean.jpg';
' ...\matlab\toolbox\matlab\demos\ngc6543a.jpg'
... and 3 more
}
Labels: [demos; demos; demos ... and 3 more categorical]
ReadFcn: @readDatastoreImage
Split Labels Several Ways by Percentage
Create an ImageDatastore object and label each image according to the name of the folder it is in. The resulting label names are demos and imagesci.
imds = imageDatastore(fullfile(matlabroot, 'toolbox', 'matlab', {'demos','imagesci'}), ...
    'LabelSource', 'foldernames', 'FileExtensions', {'.jpg', '.png', '.tif'});
imds.Labels
ans =
demos
demos
demos
demos
demos
demos
imagesci
imagesci
Create three new datastores from the files in imds. The first datastore imds60 contains the first 60% of files with the demos label and the first 60% of files with the imagesci label. The second datastore imds10 contains the next 10% of files from each label. The third datastore imds30 contains the remaining 30% of files from each label. If the percentage applied to a label does not result in a whole number of files, splitEachLabel rounds down to the nearest whole number.
[imds60, imds10, imds30] = splitEachLabel(imds,0.6,0.1)
imds60 =
ImageDatastore with properties:
Files: {
' ...\matlab\toolbox\matlab\demos\cloudCombined.jpg';
' ...\matlab\toolbox\matlab\demos\example.tif';
' ...\matlab\toolbox\matlab\demos\landOcean.jpg'
... and 2 more
}
Labels: [demos; demos; demos ... and 2 more categorical]
ReadFcn: @readDatastoreImage
imds10 =
ImageDatastore with properties:
Files: {
' ...\matlab\toolbox\matlab\demos\street1.jpg'
}
Labels: demos
ReadFcn: @readDatastoreImage
imds30 =
ImageDatastore with properties:
Files: {
' ...\matlab\toolbox\matlab\demos\street2.jpg';
' ...\matlab\toolbox\matlab\imagesci\peppers.png'
}
Labels: [demos; imagesci]
ReadFcn: @readDatastoreImage
Split Labels Several Ways by Number of Files
Create an ImageDatastore object and label each image according to the name of the folder it is in. The resulting label names are demos and imagesci.
imds = imageDatastore(fullfile(matlabroot, 'toolbox', 'matlab', {'demos','imagesci'}), ...
    'LabelSource', 'foldernames', 'FileExtensions', {'.jpg', '.png', '.tif'});
imds.Labels
ans =
demos
demos
demos
demos
demos
demos
imagesci
imagesci
Create three new datastores from the files in imds. The first datastore imds1 contains the first file with the demos label and the first file with the imagesci label. The second datastore imds2 contains the next file from each label. The third datastore imds3 contains the remaining files from each label.
[imds1, imds2, imds3] = splitEachLabel(imds,1,1)
imds1 =
ImageDatastore with properties:
Files: {
' ...\matlab\toolbox\matlab\demos\cloudCombined.jpg';
' ...\matlab\toolbox\matlab\imagesci\corn.tif'
}
Labels: [demos; imagesci]
ReadFcn: @readDatastoreImage
imds2 =
ImageDatastore with properties:
Files: {
' ...\matlab\toolbox\matlab\demos\example.tif';
' ...\matlab\toolbox\matlab\imagesci\peppers.png'
}
Labels: [demos; imagesci]
ReadFcn: @readDatastoreImage
imds3 =
ImageDatastore with properties:
Files: {
' ...\matlab\toolbox\matlab\demos\landOcean.jpg';
' ...\matlab\toolbox\matlab\demos\ngc6543a.jpg';
' ...\matlab\toolbox\matlab\demos\street1.jpg'
... and 1 more
}
Labels: [demos; demos; demos ... and 1 more categorical]
ReadFcn: @readDatastoreImage
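When the split is specified by file counts, the requested counts must fit within the smallest label. The following is a minimal sketch of checking that with countEachLabel before splitting. The temporary folder, the label names cats and dogs, and the noise images are hypothetical stand-ins for a real dataset, not part of the original example.

```matlab
% Build a throwaway dataset (hypothetical labels): 10 "cats" and 6 "dogs" images.
root = tempname;
labelNames = {'cats', 'dogs'};
numFiles = [10 6];
for k = 1:numel(labelNames)
    mkdir(fullfile(root, labelNames{k}));
    for i = 1:numFiles(k)
        img = uint8(randi(255, 8, 8));   % 8-by-8 grayscale noise image
        imwrite(img, fullfile(root, labelNames{k}, sprintf('img%02d.png', i)));
    end
end
imds = imageDatastore(root, 'IncludeSubfolders', true, 'LabelSource', 'foldernames');

% countEachLabel returns a table with the variables Label and Count.
tbl = countEachLabel(imds);
smallest = min(tbl.Count);

% Files per label requested for imds1 and imds2; imds3 receives the rest.
counts = [3 2];
assert(sum(counts) <= smallest, 'Requested counts exceed the smallest label.');

[imds1, imds2, imds3] = splitEachLabel(imds, counts(1), counts(2));
disp(countEachLabel(imds3))   % leftover files per label
```

Checking against the smallest label up front gives a clearer failure message than letting the split itself reject the request.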
Randomly Split Labels
Create an ImageDatastore object and label each image according to the name of the folder it is in. The resulting label names are demos and imagesci.
imds = imageDatastore(fullfile(matlabroot, 'toolbox', 'matlab', {'demos','imagesci'}), ...
    'LabelSource', 'foldernames', 'FileExtensions', {'.jpg', '.png', '.tif'});
imds.Labels
ans =
demos
demos
demos
demos
demos
demos
imagesci
imagesci
Create two new datastores from the files in imds
by randomly drawing from each label. The first datastore imds1
contains one random file with the demos
label and one random file with the imagesci
label. The second datastore imds2
contains the remaining files from each label.
[imds1, imds2] = splitEachLabel(imds,1,'randomized')
imds1 =
ImageDatastore with properties:
Files: {
' ...\matlab\toolbox\matlab\demos\street2.jpg';
' ...\matlab\toolbox\matlab\imagesci\corn.tif'
}
Labels: [demos; imagesci]
ReadFcn: @readDatastoreImage
imds2 =
ImageDatastore with properties:
Files: {
' ...\matlab\toolbox\matlab\demos\cloudCombined.jpg';
' ...\matlab\toolbox\matlab\demos\example.tif';
' ...\matlab\toolbox\matlab\demos\landOcean.jpg'
... and 3 more
}
Labels: [demos; demos; demos ... and 3 more categorical]
ReadFcn: @readDatastoreImage
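A randomized split changes from run to run unless the random number generator is seeded. The sketch below assumes that the 'randomized' option draws from MATLAB's global random number stream, which this page does not state, so treat the reproducibility check as something to verify on your installation. The dataset (labels cats and dogs, noise images in a temporary folder) is hypothetical.

```matlab
% Build a throwaway dataset (hypothetical labels): 10 images per label.
root = tempname;
labelNames = {'cats', 'dogs'};
for k = 1:numel(labelNames)
    mkdir(fullfile(root, labelNames{k}));
    for i = 1:10
        img = uint8(randi(255, 8, 8));   % 8-by-8 grayscale noise image
        imwrite(img, fullfile(root, labelNames{k}, sprintf('img%02d.png', i)));
    end
end
imds = imageDatastore(root, 'IncludeSubfolders', true, 'LabelSource', 'foldernames');

% Seed the global generator before each split (assumption: splitEachLabel uses it).
rng(0);
[trainA, testA] = splitEachLabel(imds, 0.5, 'randomized');
rng(0);
[trainB, testB] = splitEachLabel(imds, 0.5, 'randomized');

% If the assumption holds, both calls select the same files.
sameSplit = isequal(trainA.Files, trainB.Files)

% The two outputs of one call share no files and together cover the original datastore.
noOverlap = isempty(intersect(trainA.Files, testA.Files))
covers = isequal(sort([trainA.Files; testA.Files]), sort(imds.Files))
```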
Include and Exclude Specified Labels
Create an ImageDatastore object and label each image according to the name of the folder it is in. The resulting label names are demos and imagesci.
imds = imageDatastore(fullfile(matlabroot, 'toolbox', 'matlab', {'demos','imagesci'}), ...
    'LabelSource', 'foldernames', 'FileExtensions', {'.jpg', '.png', '.tif'});
imds.Labels
ans =
demos
demos
demos
demos
demos
demos
imagesci
imagesci
Create two new datastores from the files in imds, including only the files with the demos label. The first datastore imds60 contains the first 60% of files with the demos label and the second datastore imds40 contains the remaining 40% of files with the demos label.
[imds60, imds40] = splitEachLabel(imds,0.6,'Include','demos')
imds60 =
ImageDatastore with properties:
Files: {
' ...\matlab\toolbox\matlab\demos\cloudCombined.jpg';
' ...\matlab\toolbox\matlab\demos\example.tif';
' ...\matlab\toolbox\matlab\demos\landOcean.jpg'
... and 1 more
}
Labels: [demos; demos; demos ... and 1 more categorical]
ReadFcn: @readDatastoreImage
imds40 =
ImageDatastore with properties:
Files: {
' ...\matlab\toolbox\matlab\demos\street1.jpg';
' ...\matlab\toolbox\matlab\demos\street2.jpg'
}
Labels: [demos; demos]
ReadFcn: @readDatastoreImage
Equivalently, you can split only the demos label by excluding the imagesci label.
[imds60, imds40] = splitEachLabel(imds,0.6,'Exclude','imagesci')
imds60 =
ImageDatastore with properties:
Files: {
' ...\matlab\toolbox\matlab\demos\cloudCombined.jpg';
' ...\matlab\toolbox\matlab\demos\example.tif';
' ...\matlab\toolbox\matlab\demos\landOcean.jpg'
... and 1 more
}
Labels: [demos; demos; demos ... and 1 more categorical]
ReadFcn: @readDatastoreImage
imds40 =
ImageDatastore with properties:
Files: {
' ...\matlab\toolbox\matlab\demos\street1.jpg';
' ...\matlab\toolbox\matlab\demos\street2.jpg'
}
Labels: [demos; demos]
ReadFcn: @readDatastoreImage
Input Arguments
imds
— Input datastore
ImageDatastore
object
Input datastore, specified as an ImageDatastore object. To create an ImageDatastore from your image data, use the imageDatastore function.
p
— Proportion of files to split
scalar in interval (0,1) | positive integer scalar
Proportion of files to split, specified as a scalar in the interval (0,1) or a positive integer scalar.
- If p is in the interval (0,1), then it represents the fraction of the files from each label to assign to imds1. If p does not result in a whole number of files, then splitEachLabel rounds down to the nearest whole number.
- If p is an integer, then it represents the absolute number of files from each label to assign to imds1. There must be at least p files associated with each label.
Data Types: double
p1,...,pN
— List of proportions
scalars in interval (0,1) | positive integer scalars
List of proportions, specified as scalars in the interval (0,1) or positive integer scalars. If the proportions are in the interval (0,1), then they represent the percentage of the files from each label to assign to the output datastores. If the proportions are integers, then they indicate the absolute number of files from each label to assign to the output datastores. When the proportions represent percentages, their sum must be no more than 1. When the proportions represent numbers of files, there must be enough files associated with each label to satisfy each proportion.
Data Types: double
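As an illustration of passing several fractional proportions, the following sketch produces a three-way split and confirms the per-label counts with countEachLabel. The dataset, the label names cats and dogs, and the output names imdsTrain, imdsVal, and imdsTest are hypothetical. The fractions are chosen so that every count is a whole number and no rounding is involved.

```matlab
% Build a throwaway dataset (hypothetical labels): 20 images per label.
root = tempname;
labelNames = {'cats', 'dogs'};
for k = 1:numel(labelNames)
    mkdir(fullfile(root, labelNames{k}));
    for i = 1:20
        img = uint8(randi(255, 8, 8));   % 8-by-8 grayscale noise image
        imwrite(img, fullfile(root, labelNames{k}, sprintf('img%02d.png', i)));
    end
end
imds = imageDatastore(root, 'IncludeSubfolders', true, 'LabelSource', 'foldernames');

% Two fractions yield three datastores; the last one receives the remainder.
fractions = [0.5 0.25];
assert(sum(fractions) <= 1, 'Fractions must sum to no more than 1.');

[imdsTrain, imdsVal, imdsTest] = splitEachLabel(imds, fractions(1), fractions(2));

% Per-label file counts in each output datastore.
trainCounts = countEachLabel(imdsTrain);
valCounts   = countEachLabel(imdsVal);
testCounts  = countEachLabel(imdsTest);
disp(trainCounts)
disp(valCounts)
disp(testCounts)
```

Because the split is applied per label, each output keeps the same class balance as the input.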
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Before R2021a, use commas to separate each name and value, and enclose Name in quotes.
Example: [imds1 imds2] = splitEachLabel(imds,0.5,'Exclude','demos')
Include
— Labels to include
categorical, logical, or numeric vector | cell array of character vectors | string array
Labels to include, specified as the comma-separated pair consisting of 'Include' and a vector, cell array, or string array of label names with the same type as the Labels property. Each name must match one of the labels in the Labels property of the datastore.
Data Types: char | cell | string
Exclude
— Labels to exclude
categorical, logical, or numeric vector | cell array of character vectors | string array
Labels to exclude, specified as the comma-separated pair consisting of 'Exclude' and a vector, cell array, or string array of label names with the same type as the Labels property. Each name must match one of the labels in the Labels property of the datastore. This option cannot be used with the 'Include' option.
Data Types: char | cell | string
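To illustrate selecting more than one label at a time, the sketch below passes a cell array of character vectors to 'Include' and compares the result with the matching 'Exclude' call. The dataset and the label names birds, cats, and dogs are hypothetical.

```matlab
% Build a throwaway dataset (hypothetical labels): 4 images per label.
root = tempname;
labelNames = {'birds', 'cats', 'dogs'};
for k = 1:numel(labelNames)
    mkdir(fullfile(root, labelNames{k}));
    for i = 1:4
        img = uint8(randi(255, 8, 8));   % 8-by-8 grayscale noise image
        imwrite(img, fullfile(root, labelNames{k}, sprintf('img%02d.png', i)));
    end
end
imds = imageDatastore(root, 'IncludeSubfolders', true, 'LabelSource', 'foldernames');

% Split only the cats and dogs labels in half; birds files go to neither output.
[in1, in2] = splitEachLabel(imds, 0.5, 'Include', {'cats', 'dogs'});

% The same selection expressed by excluding the remaining label.
[ex1, ex2] = splitEachLabel(imds, 0.5, 'Exclude', 'birds');

% Labels that actually occur in the first output.
presentLabels = unique(in1.Labels)

% Compare the file lists produced by the two calls.
sameFiles = isequal(in1.Files, ex1.Files) && isequal(in2.Files, ex2.Files)
```

'Include' and 'Exclude' cannot be combined in one call, so pick whichever list is shorter to write.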
Output Arguments
imds1,imds2
— Output datastores
ImageDatastore
objects
Output datastores, returned as ImageDatastore objects. imds1 contains the specified proportion of files from each label in imds, and imds2 contains the remaining files.
imds1,...,imdsM
— List of output datastores
ImageDatastore
objects
List of output datastores, returned as ImageDatastore objects. The number of elements in the list is one more than the number of listed proportions (M = N+1). Each of the new datastores contains the proportion of each label in imds defined by p1,...,pN. Any files left over are assigned to the Mth datastore.
Extended Capabilities
Thread-Based Environment
Run code in the background using MATLAB® backgroundPool or accelerate code with Parallel Computing Toolbox™ ThreadPool.
This function fully supports thread-based environments. For more information, see Run MATLAB Functions in Thread-Based Environment.
Version History
Introduced in R2016a