writeall - Write datastore to files - MATLAB
Write datastore to files
Since R2020a
Syntax
writeall(ds,outputLocation)
writeall(ds,outputLocation,Name,Value)
Description
writeall(ds,outputLocation) writes the data from the input datastore ds to output files at the location specified in outputLocation. The number of output files is the same as the number of files referenced by the datastore.
writeall(ds,outputLocation,Name,Value) writes data with additional options specified by one or more name-value pair arguments. For example, you can specify 'OutputFormat' and a file extension such as 'csv' to specify the type of files that writeall creates.
Examples
Write and Reconstruct Datastore
Write a datastore to disk, and then create a new datastore for the written files. This process is useful for saving your work or sharing a datastore with a colleague.
Create a datastore for the airlinesmall.csv data set, treating 'NA' values as missing data. Select only the Month, Year, and UniqueCarrier variables.
ds = datastore('airlinesmall.csv');
ds.TreatAsMissing = 'NA';
ds.SelectedVariableNames = {'Month','Year','UniqueCarrier'};
Preview the datastore.
preview(ds)
ans=8×3 table
    Month    Year    UniqueCarrier
    _____    ____    _____________
     10      1987       {'PS'}
     10      1987       {'PS'}
     10      1987       {'PS'}
     10      1987       {'PS'}
     10      1987       {'PS'}
     10      1987       {'PS'}
     10      1987       {'PS'}
     10      1987       {'PS'}
Save this datastore to a new folder named ExampleData on the C:\ disk. (You can specify a different write location, especially if you are not using a Windows® computer.) Specify 'FolderLayout' as 'flatten' to put the datastore files directly in the target folder, without creating subfolders.
location = 'C:\ExampleData';
writeall(ds,location,'FolderLayout','flatten')
Clear ds from your workspace. To recover the datastore from disk, create a new datastore that references the same folder where the data was written.
clear ds
ds2 = datastore(location);
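The round trip described in this example can be tried on any platform by writing to a temporary folder instead of C:\. The following is a minimal sketch, assuming the airlinesmall.csv sample file that ships with MATLAB is on the path; using tempname for the output folder is an illustrative choice and not part of the original example.

```matlab
% Recreate the datastore from the example.
ds = datastore('airlinesmall.csv');
ds.TreatAsMissing = 'NA';
ds.SelectedVariableNames = {'Month','Year','UniqueCarrier'};

% Assumption: write to a unique temporary folder rather than C:\ExampleData.
location = tempname;
writeall(ds,location,'FolderLayout','flatten')

% Point a new datastore at the written files and inspect it.
ds2 = datastore(location);
disp(ds2.Files)     % list of files that writeall created
preview(ds2)
```

Comparing numel(ds.Files) with numel(ds2.Files) is a quick way to confirm that one output file was written per input file.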
Duplicate or Flatten Folder Layout at Output Location
When writing a datastore to files, you can choose to duplicate or flatten the folder structure by setting the 'FolderLayout' parameter. Create an image datastore from an input location. First, write the files to the output location by duplicating the folder layout. Then, write the files to the output location as a flat list.
Create an image datastore from the location 'C:\input\', which contains some images of cars and trains:
imds = imageDatastore('C:\input','IncludeSubfolders',true);
This folder structure is just an example. You can try the functionality with a folder structure on your machine that contains image files.
Examine the Folders property of the input datastore, which contains the fully qualified path to the input dataset.
imds.Folders
ans =
  1×1 cell array
    {'C:\input\'}
Write the datastore to the output location 'C:\output\'. The default value of the 'FolderLayout' parameter is 'duplicate'. Therefore, the writeall function duplicates all the folders contained in the Folders property of the input datastore and then writes the files to the corresponding output folders.
writeall(imds,'C:\output');
Now, write the files from the datastore to the output location 'C:\output\' as a flat list by setting 'FolderLayout' to 'flatten'. In this instance, the writeall function does not preserve the input folder layout and instead writes all the files to a flat output location.
writeall(imds,'C:\output','FolderLayout','flatten');
For more information, see FolderLayout.
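To see the difference between the two layouts without an existing C:\input folder, the sketch below builds a small input tree in a temporary location and lists what each call writes. The folder names (cars, trains), the file names, and the blank 8-by-8 images are made up for illustration; 'OutputFormat','png' is set explicitly only so that the file listing pattern is known in advance.

```matlab
% Assumption: build a tiny input tree <inDir>/cars and <inDir>/trains.
inDir = tempname;
mkdir(fullfile(inDir,'cars'));
mkdir(fullfile(inDir,'trains'));
imwrite(zeros(8,8,'uint8'),fullfile(inDir,'cars','car1.png'));
imwrite(zeros(8,8,'uint8'),fullfile(inDir,'trains','train1.png'));

imds = imageDatastore(inDir,'IncludeSubfolders',true);

dupDir  = tempname;   % written with the default 'duplicate' layout
flatDir = tempname;   % written with the 'flatten' layout
writeall(imds,dupDir,'OutputFormat','png');
writeall(imds,flatDir,'OutputFormat','png','FolderLayout','flatten');

% Recursively list the written files and show the folder of each one.
dupFiles  = dir(fullfile(dupDir,'**','*.png'));
flatFiles = dir(fullfile(flatDir,'**','*.png'));
disp({dupFiles.folder}')
disp({flatFiles.folder}')
```

The first listing should show the images under separate subfolders, while the second should show both images in a single folder.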
Add Prefix or Suffix to Output File Names
You can use the FilenamePrefix and FilenameSuffix parameters to add text to the beginning or end of the output file names. Create an image datastore from an input location. Write the datastore out to files and add text to the beginning of the file names. Then write the datastore again and add text to the end of the file names.
Create an image datastore from the location 'C:\input\', which contains some images of cars and trains:
imds = imageDatastore('C:\input','IncludeSubfolders',true);
This folder structure is just an example. You can try the functionality with a folder structure on your machine that contains image files.
Now, write the files to an output location. Specify the value of FilenamePrefix as 'png_' to add that text to the beginning of each output file name.
writeall(imds,'C:\output','FilenamePrefix','png_');
Write the files to a different output location. Specify the value of FilenameSuffix as '_image' to append that text to the end of each output file name.
writeall(imds,'C:\output2','FilenameSuffix','_image');
Input Arguments
outputLocation — Folder location to write data
character vector | string scalar
Folder location to write data, specified as a character vector or string scalar. outputLocation can specify a full or relative path.
Example: outputLocation = '../../dir/data'
Example: outputLocation = "C:\Users\MyName\Desktop"
Data Types: char | string
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Before R2021a, use commas to separate each name and value, and enclose Name in quotes.
Example: writeall(ds,outputLocation,'FolderLayout','flatten')
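The two ways of passing name-value arguments can be compared side by side. This is a small sketch, assuming R2021a or later for the Name=Value form and the airlinesmall.csv sample file on the path; the temporary output folders are illustrative.

```matlab
ds = datastore('airlinesmall.csv');
ds.TreatAsMissing = 'NA';
ds.SelectedVariableNames = {'Month','Year'};

outA = tempname;   % assumption: temporary output folders
outB = tempname;

% Quoted name followed by a comma and the value (works before R2021a too).
writeall(ds,outA,'FolderLayout','flatten')

% Name=Value form (R2021a or later).
writeall(ds,outB,FolderLayout="flatten")

% Both calls should produce the same set of output files.
filesA = dir(outA);  filesA = filesA(~[filesA.isdir]);
filesB = dir(outB);  filesB = filesB(~[filesB.isdir]);
disp({filesA.name})
disp({filesB.name})
```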
General Options
OutputFormat — Output file format
character vector | string scalar
Output file format, specified as the comma-separated pair consisting of 'OutputFormat' and a character vector or string scalar.
Based on the type of the input datastore ds, select one of these values for the output files. You can select any tabular output format for any of the tabular datastores listed below.
Input Datastore | Output Format |
---|---|
TabularTextDatastore, SpreadsheetDatastore, ParquetDatastore | 'txt', 'csv', 'xlsx', 'xls', 'parquet', 'parq', 'dat', or 'asc' |
ImageDatastore | 'png', 'jpg', 'jpeg', 'tif', or 'tiff' |
AudioDatastore | 'wav', 'ogg', 'flac', 'mp4', or 'm4a' |
FileDatastore, TransformedDatastore, CombinedDatastore, SequentialDatastore | All output formats are supported. |
Data Types: char | string
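As an illustration of converting between tabular formats, the sketch below writes a CSV-backed datastore as Parquet files and reads the result back. The input table, its variable names (Id, Label), and the temporary folders are invented for the example.

```matlab
% Assumption: build a tiny CSV input so the sketch is self-contained.
inDir = tempname;
mkdir(inDir);
T = table([1;2;3],{'a';'b';'c'},'VariableNames',{'Id','Label'});
writetable(T,fullfile(inDir,'sample.csv'));

ds = tabularTextDatastore(inDir);

% Write the CSV-backed datastore as Parquet files in a flat output folder.
outDir = tempname;
writeall(ds,outDir,'OutputFormat','parquet','FolderLayout','flatten');

% Read the result back with a datastore that matches the new format.
pds = parquetDatastore(outDir);
preview(pds)
```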
FolderLayout — Layout of files in output folder
'duplicate' (default) | 'flatten'
Layout of files in output folder, specified as the comma-separated pair consisting of 'FolderLayout' and either 'duplicate' or 'flatten'.
- 'duplicate' — The writeall function replicates the folder structure contained within the input data location to the specified output location. The input data location is given by the Folders property of the input datastore.
- 'flatten' — The writeall function writes all the files from the input to the specified output folder without any subfolders.
Data Types: char | string
FilenamePrefix — Prefix to file name
character vector | string scalar
Prefix to file name, specified as the comma-separated pair consisting of 'FilenamePrefix' and a character vector or string scalar.
The writeall function adds the specified prefix to the output file names. For example, this code adds today’s date to the beginning of all output file names from the datastore:
prefixText = string(datetime('today'))
writeall(imds,'C:\myFolder','FilenamePrefix',prefixText);
Data Types: char | string
FilenameSuffix — Suffix to file name
character vector | string scalar
Suffix to file name, specified as the comma-separated pair consisting of 'FilenameSuffix' and a character vector or string scalar.
The writeall function adds the specified suffix to the output file names. For example, this code adds the descriptive text 'jpeg_70per' to the end of all output file names from the datastore:
writeall(imds,'C:\myFolder','FilenameSuffix','jpeg_70per');
Data Types: char | string
UseParallel — Indicator to write in parallel
false (default) | true | 0 | 1
Indicator to write in parallel, specified as the comma-separated pair consisting of 'UseParallel' and either false or true (0 or 1).
By default, writeall writes in serial. If you set UseParallel to true, then writeall divides the writing operations into separate groups and runs the groups in parallel if:
- Parallel Computing Toolbox™ is installed.
- An open parallel pool exists or automatic pool creation is enabled in the Parallel Preferences.
Otherwise, writeall writes in serial regardless of the value for UseParallel.
writeall does not support writing in parallel via local threads.
Note
Parallel writing is not supported for CombinedDatastore objects, or datastores resulting from transform applied to a CombinedDatastore.
Data Types: logical
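A call that requests parallel writing might look like the following sketch. It assumes the airlinesmall.csv sample file is on the path and writes to a temporary folder; because writeall falls back to serial writing when the conditions for UseParallel are not met, the call is expected to complete either way. With a single input file there is little work to divide, so any benefit would only show up on datastores with many files.

```matlab
ds = datastore('airlinesmall.csv');
ds.TreatAsMissing = 'NA';
ds.SelectedVariableNames = {'Month','Year','UniqueCarrier'};

outDir = tempname;   % assumption: temporary output folder

% Request parallel writing. If the toolbox or a pool is unavailable,
% writeall writes in serial regardless of this setting.
writeall(ds,outDir,'UseParallel',true,'FolderLayout','flatten');

% Count the files that were written.
written = dir(outDir);
written = written(~[written.isdir]);
fprintf('%d file(s) written to %s\n',numel(written),outDir);
```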
WriteFcn — Custom writing function
function handle
Custom writing function, specified as the comma-separated pair consisting of 'WriteFcn' and a function handle. The specified function is responsible for creating the output files.
You can use the 'WriteFcn' name-value pair argument to write data in a variety of formats, even if writeall does not directly support the output format.
Functional Signature
The custom writing function must accept at least three input arguments, data, writeInfo, and outputFormat:
function myWriteFcn(data,writeInfo,outputFormat)
The function can also accept additional inputs, such as name-value pairs, after the first three required inputs.
- data contains the output of the read method operating on the datastore.
- writeInfo is an object of type matlab.io.datastore.WriteInfo with fields listed in the table.
Field | Description | Type |
---|---|---|
ReadInfo | The second output of the read method | struct |
SuggestedOutputName | A fully qualified, globally unique file name that meets the location and naming requirements | string |
Location | The location argument passed to datastore write | string |
- outputFormat is the output format to be written to, for example "txt" or "csv".
Example Function
A writing function that writes parquet files from a datastore is:
function myParquetWriteFcn(data, writeInfo, outputFormat)
    if strcmp(outputFormat,"parquet")
        parquetwrite(writeInfo.SuggestedOutputName,data)
    end
end
To invoke myParquetWriteFcn as the writing function for a datastore ds, use these commands:
ds = parquetDatastore(location);
outputLocation = 'C:/tmp/MyData';
writeall(ds,outputLocation,'WriteFcn',@myParquetWriteFcn);
Data Types: function_handle
Options for jpeg or jpg Output
Quality — Quality of JPEG-compressed file
75 (default) | scalar in the range [0,100]
Quality of the JPEG-compressed file, specified as the comma-separated pair consisting of 'Quality' and a scalar in the range [0,100], where 0 is lower quality and higher compression, and 100 is higher quality and lower compression.
Example: 'Quality',25
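The effect of 'Quality' on file size can be observed by writing the same image at two settings. This is a sketch under stated assumptions: the input is a synthetic 64-by-64 random-noise image created for the example, and the file and folder names are illustrative.

```matlab
% Assumption: create a single noisy PNG as input.
inDir = tempname;
mkdir(inDir);
img = uint8(randi([0 255],64,64,3));
imwrite(img,fullfile(inDir,'noise.png'));

imds = imageDatastore(inDir);

% Write the same datastore as JPEG at a low and a high quality setting.
lowDir  = tempname;
highDir = tempname;
writeall(imds,lowDir,'OutputFormat','jpg','Quality',25,'FolderLayout','flatten');
writeall(imds,highDir,'OutputFormat','jpg','Quality',95,'FolderLayout','flatten');

% Compare the sizes of the written files.
lowInfo  = dir(fullfile(lowDir,'*.jpg'));
highInfo = dir(fullfile(highDir,'*.jpg'));
fprintf('Quality 25: %d bytes, Quality 95: %d bytes\n', ...
    lowInfo(1).bytes,highInfo(1).bytes);
```

The lower quality setting applies more compression, so its output file should be the smaller of the two.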
Options for wav or flac Output
BitsPerSample — Number of output bits per sample
16 (default) | 8 | 24 | 32 | 64
Number of output bits per sample, specified as the comma-separated pair consisting of 'BitsPerSample' and 8, 16, 24, 32, or 64.
For FLAC files, only 8, 16, or 24 bits per sample are supported.
Example: 'BitsPerSample',32
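The following sketch writes an audio datastore as FLAC at 24 bits per sample and then reads the bit depth back from the output file. It assumes that Audio Toolbox is installed (for audioDatastore); the 440 Hz test tone, the 8 kHz sample rate, and the file names are invented for the example.

```matlab
% Assumption: create a one-second test tone as the only input file.
inDir = tempname;
mkdir(inDir);
fs = 8000;
t = (0:fs-1)'/fs;
audiowrite(fullfile(inDir,'tone.wav'),0.5*sin(2*pi*440*t),fs);

ads = audioDatastore(inDir);   % assumption: requires Audio Toolbox

% Write as FLAC with 24 bits per sample (a value FLAC supports).
outDir = tempname;
writeall(ads,outDir,'OutputFormat','flac','BitsPerSample',24, ...
    'FolderLayout','flatten');

% Inspect the bit depth recorded in the written file.
written = dir(fullfile(outDir,'*.flac'));
info = audioinfo(fullfile(outDir,written(1).name));
disp(info.BitsPerSample)
```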
Options for mp4 or m4a Output
BitRate — Kilobits per second (kbit/s)
128 (default) | 64 | 96 | 160 | 192 | 256 | 320
Number of kilobits per second (kbit/s) used for compressed audio files, specified as the comma-separated pair consisting of 'BitRate' and 64, 96, 128, 160, 192, 256, or 320. On Windows® 7 or later, the only valid values are 96, 128, 160, and 192.
In general, a larger BitRate value results in higher compression quality.
Example: 'BitRate',96
Options for txt or csv Output
WriteVariableNames — Indicator for writing variable names as column headings
true (default) | false
Indicator for writing variable names as column headings, specified as the comma-separated pair consisting of 'WriteVariableNames' and either true or false.
Indicator | Behavior |
---|---|
true | The writing function includes variable names as the column headings of the output. This is the default behavior. |
false | The writing function does not include variable names in the output. |
Delimiter — Field delimiter character
character vector | string scalar
Field delimiter character, specified as the comma-separated pair consisting of 'Delimiter' and a character vector or string scalar containing one of these specifiers:
Specifier | Field Delimiter |
---|---|
',' or 'comma' | Comma. This is the default behavior. |
' ' or 'space' | Space |
'\t' or 'tab' | Tab |
';' or 'semi' | Semicolon |
'|' or 'bar' | Vertical bar |
You can use the 'Delimiter' name-value pair only for delimited text files.
Example: 'Delimiter','space'
Data Types: char | string
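The text-output options can be combined in one call. The sketch below writes tab-delimited text without a header row and prints the raw file contents; the two-row input table, its variable names (Id, Label), and the temporary folders are made up for illustration.

```matlab
% Assumption: build a tiny CSV input so the sketch is self-contained.
inDir = tempname;
mkdir(inDir);
T = table([1;2],{'x';'y'},'VariableNames',{'Id','Label'});
writetable(T,fullfile(inDir,'sample.csv'));

ds = tabularTextDatastore(inDir);

% Write tab-delimited text and leave out the variable names.
outDir = tempname;
writeall(ds,outDir,'OutputFormat','txt','Delimiter','tab', ...
    'WriteVariableNames',false,'FolderLayout','flatten');

% Show the raw contents of the written file.
written = dir(fullfile(outDir,'*.txt'));
txt = fileread(fullfile(outDir,written(1).name));
disp(txt)
```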
Encoding — Character encoding scheme
'system' (default) | 'UTF-8' | 'ISO-8859-1' | 'windows-1251' | 'windows-1252' | ...
Character encoding scheme associated with the file, specified as the comma-separated pair consisting of 'Encoding' and 'system' or a standard character encoding scheme name like one of the values in this table. When you do not specify any encoding or specify encoding as 'system', the writing function uses your system default encoding to write the file.
"Big5" | "ISO-8859-1" | "windows-874" |
---|---|---|
"Big5-HKSCS" | "ISO-8859-2" | "windows-949" |
"CP949" | "ISO-8859-3" | "windows-1250" |
"EUC-KR" | "ISO-8859-4" | "windows-1251" |
"EUC-JP" | "ISO-8859-5" | "windows-1252" |
"EUC-TW" | "ISO-8859-6" | "windows-1253" |
"GB18030" | "ISO-8859-7" | "windows-1254" |
"GB2312" | "ISO-8859-8" | "windows-1255" |
"GBK" | "ISO-8859-9" | "windows-1256" |
"IBM866" | "ISO-8859-11" | "windows-1257" |
"KOI8-R" | "ISO-8859-13" | "windows-1258" |
"KOI8-U" | "ISO-8859-15" | "US-ASCII" |
"Macintosh" | "UTF-8" | |
"Shift_JIS" |
Example: 'Encoding','UTF-8'
Data Types: char | string
QuoteStrings — Indicator for writing quoted text
"minimal" (default) | "all" | "none"
Indicator for writing quoted text, specified as "minimal", "all", or "none".
- If QuoteStrings is "minimal", then the function encloses any variables containing the delimiter, line ending, or double-quote character in double-quote characters.
- If QuoteStrings is "all", then the function encloses all text, categorical, datetime, and duration variables in double-quote characters.
- If QuoteStrings is "none", then the function does not enclose variables in double-quote characters.
You can use the QuoteStrings name-value argument only with delimited text files.
Options for xls or xlsx Output
Sheet — Worksheet to write to
character vector | string scalar | positive integer
Worksheet to write to, specified as the comma-separated pair consisting of 'Sheet' and a character vector or a string scalar containing the worksheet name or a positive integer indicating the worksheet index. The worksheet name cannot contain a colon (:). To determine the names of sheets in a spreadsheet file, use sheets = sheetnames(filename). For more information, see sheetnames.
Specify the worksheet to write to by name or index:
- name — If the specified sheet name does not exist in the file, then the writing function adds a new sheet at the end of the worksheet collection.
- index — If the specified sheet index is larger than the number of worksheets, then the writing function appends empty sheets until the number of worksheets in the workbook equals the sheet index. The writing function also generates a warning indicating that it has added a new worksheet.
You can use the 'Sheet' name-value pair only with spreadsheet files.
Example: 'Sheet',2
Example: 'Sheet','MySheetName'
Data Types: char | string | single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
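To illustrate writing to a named worksheet, the sketch below writes a small tabular datastore as an xlsx file and then lists the sheet names in the result. The input table, the sheet name 'Results', and the temporary folders are assumptions made for the example.

```matlab
% Assumption: build a tiny CSV input so the sketch is self-contained.
inDir = tempname;
mkdir(inDir);
T = table([1;2;3],{'a';'b';'c'},'VariableNames',{'Id','Label'});
writetable(T,fullfile(inDir,'sample.csv'));

ds = tabularTextDatastore(inDir);

% Write a spreadsheet file and place the data on a sheet named 'Results'.
outDir = tempname;
writeall(ds,outDir,'OutputFormat','xlsx','Sheet','Results', ...
    'FolderLayout','flatten');

% List the worksheet names in the written file.
written = dir(fullfile(outDir,'*.xlsx'));
names = sheetnames(fullfile(outDir,written(1).name));
disp(names)
```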
Version History
Introduced in R2020a