matlab.io.datastore.Subsettable.subsetByReadIndices - Create subset of datastore or file-set with the specified read indices
Class: matlab.io.datastore.Subsettable
Namespace: matlab.io.datastore
Create subset of datastore or file-set with the specified read indices
Since R2022b
Syntax
subds = subsetByReadIndices(ds,indices)
Description
subds = subsetByReadIndices(ds,indices) creates a subset of the specified datastore or file-set using the specified read indices. The subset subds is of the same type as the input.
Input Arguments
ds: Input datastore or file-set from which to create the subset.
indices: Indices of files to include in the subset, specified as a numeric vector of indices or a logical vector. For a logical vector, the subsetByReadIndices method creates a subset subds containing the files that correspond to the elements that have a value of true.
- numeric vector: Vector containing unique indices of files in the input datastore.
- logical vector: Vector the same length as the number of files in the input datastore.
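The two index forms behave like ordinary MATLAB array indexing. The following minimal sketch uses a plain string array with hypothetical dataset names (not a real datastore) to show that a numeric vector and the corresponding logical vector select the same entries:

```matlab
% Hypothetical names standing in for the files or datasets in a datastore.
names = ["/g1/a"; "/g1/b"; "/g2/c"; "/g2/d"];

% Numeric vector: unique positions of the entries to keep.
numericIdx = [1 3];

% Logical vector: one element per entry; true marks the entries to keep.
logicalIdx = [true false true false];

byNumeric = names(numericIdx);
byLogical = names(logicalIdx);

% Both forms select the same two entries.
assert(isequal(byNumeric, byLogical))

% find converts the logical form to the equivalent numeric form.
assert(isequal(find(logicalIdx), numericIdx))
```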
Attributes
| Attribute | Value |
|---|---|
| Abstract | true |
| Access | protected |
To learn about attributes of methods, see Method Attributes.
Examples
Build a datastore with subset processing support and use it to bring your data into MATLAB®.
Create a class definition file that contains the code implementing your datastore. Save this file in your working folder or in a folder that is on the MATLAB path. The name of the .m file must be the same as the name of your object constructor function. In this example, create the MyHDF5Datastore class in a file named MyHDF5Datastore.m. The .m class definition contains the following steps:
- Step 1: Inherit from the matlab.io.Datastore and matlab.io.datastore.Subsettable classes.
- Step 2: Define the constructor as well as the subsetByReadIndices and maxpartitions methods.
- Step 3: Define your custom file-reading function. Here, the MyHDF5Datastore class creates and uses the listHDF5Datasets function.
%% STEP 1
classdef MyHDF5Datastore < matlab.io.Datastore ...
        & matlab.io.datastore.Subsettable
    properties
        Filename (1, 1) string
        Datasets (:, 1) string {mustBeNonmissing} = "/"
        CurrentDatasetIndex (1, 1) double {mustBeInteger, mustBeNonnegative} = 1
    end
    %% STEP 2
    methods
        function ds = MyHDF5Datastore(Filename, Location)
            arguments
                Filename (1, 1) string
                Location (1, 1) string {mustBeNonmissing} = "/"
            end
            ds.Filename = Filename;
            ds.Datasets = listHDF5Datasets(ds.Filename, Location);
        end
        function [data, info] = read(ds, varargin)
            % Read the next dataset and advance the read position.
            if ~hasdata(ds)
                % The error identifier is illustrative; any valid identifier works.
                error("MyHDF5Datastore:noMoreData", "No more datasets to read.");
            end
            dataset = ds.Datasets(ds.CurrentDatasetIndex);
            data = { h5read(ds.Filename, dataset, varargin{:}) };
            if nargout > 1
                info = h5info(ds.Filename, dataset);
            end
            ds.CurrentDatasetIndex = ds.CurrentDatasetIndex + 1;
        end
        function tf = hasdata(ds)
            tf = ds.CurrentDatasetIndex <= numel(ds.Datasets);
        end
        function reset(ds)
            ds.CurrentDatasetIndex = 1;
        end
    end
    methods (Access = protected)
        function subds = subsetByReadIndices(ds, indices)
            % Copy the datastore, keep only the selected datasets,
            % and rewind the copy to its first dataset.
            datasets = ds.Datasets(indices);
            subds = copy(ds);
            subds.Datasets = datasets;
            reset(subds);
        end
        function n = maxpartitions(ds)
            % Each dataset can be read independently.
            n = numel(ds.Datasets);
        end
    end
end
%% STEP 3
function datasets = listHDF5Datasets(filename, location, args)
    arguments
        filename (1, 1) string
        location (1, 1) string
        args.IncludeSubGroups (1, 1) logical = true
    end
    if strlength(location) == 0
        location = "/";
    end
    info = h5info(filename, location);
    datasets = listDatasetsInH5infoStruct(info, location, IncludeSubGroups=args.IncludeSubGroups);
end
function datasets = listDatasetsInH5infoStruct(S, location, args)
    arguments
        S (1, 1) struct
        location (1, 1) string
        args.IncludeSubGroups (1, 1) logical = true
    end
    datasets = string.empty(0, 1);
    if isfield(S, "Datatype")
        % S describes a single dataset.
        datasets = location;
    elseif isfield(S, "Datasets")
        % S describes a group: list its datasets, then its child groups.
        if ~isempty(S.Datasets)
            datasets = location + "/" + {S.Datasets.Name}';
        end
        if args.IncludeSubGroups
            listFcn = @(group) listDatasetsInH5infoStruct(group, group.Name, IncludeSubGroups=true);
        else
            listFcn = @(group) string(group.Name);
        end
        childDatasets = arrayfun(listFcn, S.Groups, UniformOutput=false);
        childDatasets = vertcat(childDatasets{:});
        datasets = [datasets; childDatasets];
    end
end
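Once MyHDF5Datastore.m is on the MATLAB path, the class might be used as in the following sketch. It assumes that the example.h5 sample file that ships with MATLAB can be found and that it contains at least two datasets; the index values are chosen only for illustration. Because subsetByReadIndices is protected, the sketch calls the public subset method that matlab.io.datastore.Subsettable provides, which relies on the subsetByReadIndices and maxpartitions implementations in the class.

```matlab
% Assumes MyHDF5Datastore.m is on the MATLAB path and that the example.h5
% sample file shipped with MATLAB can be found.
ds = MyHDF5Datastore("example.h5");

% List the datasets discovered in the file.
disp(ds.Datasets)

% Keep only the first two datasets. subset is the public entry point and
% uses the protected subsetByReadIndices method to build the result.
subds = subset(ds, [1 2]);

% Read every dataset in the subset.
while hasdata(subds)
    data = read(subds);   % 1-by-1 cell array holding the dataset contents
    disp(class(data{1}))
end
```

The original datastore ds is left unchanged, because subsetByReadIndices works on a copy.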
Extended Capabilities
Usage notes and limitations:
- In a thread-based environment, you can use subsetByReadIndices only with the following datastores:
  - ImageDatastore objects
  - CombinedDatastore, SequentialDatastore, or TransformedDatastore objects you create from ImageDatastore objects by using combine or transform

  You can use subsetByReadIndices with other datastores if you have Parallel Computing Toolbox™. To do so, run the function using a process-backed parallel pool instead of using backgroundPool or ThreadPool (use either ProcessPool or ClusterPool).
For more information, see Run MATLAB Functions in Thread-Based Environment.
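As a rough illustration of the thread-based case, the following sketch creates an ImageDatastore and requests a subset on the background pool. It assumes that the sample images named here, which ship with MATLAB, are on the path, and it goes through the public subset method rather than calling the protected subsetByReadIndices method directly.

```matlab
% Sample images that ship with MATLAB (assumed to be on the path).
imds = imageDatastore({'street1.jpg','street2.jpg','peppers.png','corn.tif'});

% Thread-based environment: create the subset on the background pool.
f = parfeval(backgroundPool, @subset, 1, imds, [1 3]);
subimds = fetchOutputs(f);

% The subset holds the first and third files of the original datastore.
disp(subimds.Files)
```

For a datastore type that is not in the list above, the same pattern would need a process-backed pool from Parallel Computing Toolbox in place of backgroundPool.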
Version History
Introduced in R2022b