matlab.io.datastore.Subsettable.subsetByReadIndices - Create subset of datastore or file-set with the specified read indices - MATLAB

Class: matlab.io.datastore.Subsettable
Namespace: matlab.io.datastore

Create subset of datastore or file-set with the specified read indices

Since R2022b

Syntax

subds = subsetByReadIndices(ds,indices)

Description

subds = subsetByReadIndices(ds,indices) creates a subset of the specified datastore or file-set using the specified read indices. The subset subds is of the same type as the input.
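Because subsetByReadIndices has protected access, calling code normally reaches it through the public subset method of a subsettable datastore rather than invoking it directly. The sketch below illustrates the effect of selecting by read indices. It uses the built-in arrayDatastore purely as a stand-in (an assumption for illustration, chosen because it already supports subset); the data values are arbitrary.

```matlab
% Stand-in datastore with ten observations, one per row (illustrative data).
ds = arrayDatastore((1:10)', OutputType="same");

% Keep only the 2nd, 4th, and 6th reads.
subds = subset(ds, [2 4 6]);

% The subset has the same class as the input and yields only the selected reads.
data = readall(subds);
disp(class(subds))
disp(data')
```

The original datastore ds is left unchanged; subds is a separate object restricted to the requested reads.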

Input Arguments


indices: Indices of files to include in the subset, specified as a numeric vector of indices or a logical vector. When indices is a logical vector, the subsetByReadIndices method creates a subset subds containing the files that correspond to the elements with a value of true.
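The two index forms can be compared with ordinary array indexing, which is how the example implementation later on this page applies indices to its Datasets property. In this minimal sketch the dataset names are hypothetical placeholders, not part of any real file.

```matlab
% Hypothetical dataset names standing in for the items a datastore reads.
datasets = ["/a"; "/b"; "/c"; "/d"];

% Numeric form: list the positions to keep.
numericIdx = [1 3];

% Logical form: one element per item, true marks the items to keep.
logicalIdx = [true false true false];

byNumeric = datasets(numericIdx);
byLogical = datasets(logicalIdx);

% Both forms select the same items here.
disp(isequal(byNumeric, byLogical))
```

A numeric vector can also reorder or repeat items, whereas a logical vector can only keep or drop items in their existing order.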

Attributes

Abstract true
Access protected

To learn about attributes of methods, see Method Attributes.

Examples


Build a datastore with subset processing support and use it to bring your data into MATLAB®.

Create a class definition file that contains the code implementing your datastore. Save this file in your working folder or in a folder that is on the MATLAB path. The name of the .m file must be the same as the name of your object constructor function. In this example, create the MyHDF5Datastore class in a file named MyHDF5Datastore.m. The .m class definition contains the following steps:

%% STEP 1: Inherit from the datastore base class and the Subsettable mixin.
classdef MyHDF5Datastore < matlab.io.Datastore ...
                         & matlab.io.datastore.Subsettable

properties
    Filename            (1, 1) string
    Datasets            (:, 1) string {mustBeNonmissing} = "/"
    CurrentDatasetIndex (1, 1) double {mustBeInteger, mustBeNonnegative} = 1
end

%% STEP 2: Define the constructor and the required methods.
methods
    function ds = MyHDF5Datastore(Filename, Location)
        arguments
            Filename (1, 1) string
            Location (1, 1) string {mustBeNonmissing} = "/"
        end

        ds.Filename = Filename;
        ds.Datasets = listHDF5Datasets(ds.Filename, Location);
    end

    function [data, info] = read(ds, varargin)
        if ~hasdata(ds)
            error("MyHDF5Datastore:noMoreData", "No more datasets to read.");
        end

        dataset = ds.Datasets(ds.CurrentDatasetIndex);
        data = { h5read(ds.Filename, dataset, varargin{:}) };
        if nargout > 1
            info =   h5info(ds.Filename, dataset);
        end

        ds.CurrentDatasetIndex = ds.CurrentDatasetIndex + 1;
    end

    function tf = hasdata(ds)
        tf = ds.CurrentDatasetIndex <= numel(ds.Datasets);
    end

    function reset(ds)
        ds.CurrentDatasetIndex = 1;
    end
end

methods (Access = protected)
    function subds = subsetByReadIndices(ds, indices)
        datasets = ds.Datasets(indices);

        subds = copy(ds);
        subds.Datasets = datasets;
        reset(subds);
    end

    function n = maxpartitions(ds)
        n = numel(ds.Datasets);
    end
end

end

%% STEP 3: Define helper functions that list the datasets in the HDF5 file.
function datasets = listHDF5Datasets(filename, location, args)
arguments
    filename (1, 1) string
    location (1, 1) string
    args.IncludeSubGroups (1, 1) logical = true
end

if strlength(location) == 0
    location = "/";
end

info = h5info(filename, location);

datasets = listDatasetsInH5infoStruct(info, location, IncludeSubGroups=args.IncludeSubGroups);

end

function datasets = listDatasetsInH5infoStruct(S, location, args)
arguments
    S (1, 1) struct
    location (1, 1) string
    args.IncludeSubGroups (1, 1) logical = true
end

datasets = string.empty(0, 1);

if isfield(S, "Datatype")
    datasets = location;
elseif isfield(S, "Datasets")
    if ~isempty(S.Datasets)
        datasets = location + "/" + {S.Datasets.Name}';
    end

    if args.IncludeSubGroups
        listFcn = @(group) listDatasetsInH5infoStruct(group, group.Name, IncludeSubGroups=true);
    else
        listFcn = @(group) string(group.Name);
    end

    childDatasets = arrayfun(listFcn, S.Groups, UniformOutput=false);
    childDatasets = vertcat(childDatasets{:});

    datasets = [datasets; childDatasets];
end

end
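Once the class is saved, it could be exercised along the following lines. This is a sketch under stated assumptions rather than part of the original example: it assumes MyHDF5Datastore.m is on the MATLAB path, and the temporary file name, the group "/g", and the dataset names "a" and "b" are all hypothetical. The public subset method is expected to dispatch to the protected subsetByReadIndices implementation.

```matlab
% Assumes MyHDF5Datastore.m (the class above) is on the MATLAB path.
% Build a small HDF5 file with two datasets under a hypothetical group "/g".
filename = string(tempname) + ".h5";
h5create(filename, "/g/a", [1 3]);
h5write(filename, "/g/a", [1 2 3]);
h5create(filename, "/g/b", [1 3]);
h5write(filename, "/g/b", [4 5 6]);

% Create the datastore over the datasets found under "/g".
ds = MyHDF5Datastore(filename, "/g");
disp(ds.Datasets)

% Keep only the second dataset, then read it from the subset.
subds = subset(ds, 2);
data = read(subds);
disp(subds.Datasets)

% Remove the temporary file.
delete(filename);
```

Passing "/g" rather than the default "/" keeps the dataset paths that the listing helper builds in the simple form "/g/name".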

Extended Capabilities


Usage notes and limitations:

For more information, see Run MATLAB Functions in Thread-Based Environment.

Version History

Introduced in R2022b