spreadsheetImportOptions - Import options object for Spreadsheets - MATLAB (original) (raw)

Import options object for Spreadsheets

Description

A SpreadsheetImportOptions object enables you to specify how MATLAB® imports tabular data from spreadsheet files. The object contains properties that control the data import process, including the handling of errors and missing data.

Creation

You can create a SpreadsheetImportOptions object by using either the detectImportOptions function or thespreadsheetImportOptions function (described here):

Syntax

Description

opts = spreadsheetImportOptions creates aSpreadsheetImportOptions object with one variable.

example

opts = spreadsheetImportOptions('NumVariables',[numVars](#bvetiw8-1%5Fsep%5Fmw%5Ff5acc160-778c-42ed-a7f2-be23ca93c208)) creates the object with the number of variables specified innumVars.

example

opts = spreadsheetImportOptions(___,`Name,Value`) specifies additional properties forSpreadsheetImportOptions object using one or more name-value pair arguments.

example

Input Arguments

expand all

numVars — Number of variables

positive scalar integer

Number of variables, specified as a positive scalar integer.

Properties

expand all

Variable Properties

Variable names, specified as a cell array of character vectors or string array. TheVariableNames property contains the names to use when importing variables.

If the data contains N variables, but no variable names are specified, then the VariableNames property contains{'Var1','Var2',...,'VarN'}.

To support invalid MATLAB identifiers as variable names, such as variable names containing spaces and non-ASCII characters, set the value of VariableNamingRule to'preserve'.

Example: opts.VariableNames returns the current (detected) variable names.

Example: opts.VariableNames(3) = {'Height'} changes the name of the third variable to Height.

Data Types: char | string | cell

VariableNamingRule — Flag to preserve variable names

"modify" (default) | "preserve"

Flag to preserve variable names, specified as either "modify" or"preserve".

Starting in R2019b, variable names and row names can include any characters, including spaces and non-ASCII characters. Also, they can start with any characters, not just letters. Variable and row names do not have to be valid MATLAB identifiers (as determined by the isvarname function). To preserve these variable names and row names, set the value of VariableNamingRule to "preserve". Variable names are not refreshed when the value of VariableNamingRule is changed from "modify" to "preserve".

Data Types: char | string

Data type of variable, specified as a cell array of character vectors, or string array containing a set of valid data type names. The VariableTypes property designates the data types to use when importing variables.

To update the VariableTypes property, use the setvartype function.

Example: opts.VariableTypes returns the current variable data types.

Example: opts = setvartype(opts,'Height',{'double'}) changes the data type of the variable Height todouble.

Subset of variables to import, specified as a character vector, string scalar, cell array of character vectors, string array or an array of numeric indices.

SelectedVariableNames must be a subset of names contained in the VariableNames property. By default, SelectedVariableNames contains all the variable names from the VariableNames property, which means that all variables are imported.

Use the SelectedVariableNames property to import only the variables of interest. Specify a subset of variables using the SelectedVariableNames property and use readtable to import only that subset.

To support invalid MATLAB identifiers as variable names, such as variable names containing spaces and non-ASCII characters, set the value ofVariableNamingRule to'preserve'.

Example: opts.SelectedVariableNames = {'Height','LastName'} selects only two variables, Height and LastName, for the import operation.

Example: opts.SelectedVariableNames = [1 5] selects only two variables, the first variable and the fifth variable, for the import operation.

Example: T = readtable(filename,opts) returns a table containing only the variables specified in the SelectedVariableNames property of the opts object.

Data Types: uint16 | uint32 | uint64 | char | string | cell

Type specific variable import options, returned as an array of variable import options objects. The array contains an object corresponding to each variable specified in the VariableNames property. Each object in the array contains properties that support the importing of data with a specific data type.

Variable options support these data types: numeric, text, logical, datetime, or categorical.

To query the current (or detected) options for a variable, use the getvaropts function.

To set and customize options for a variable, use the setvaropts function.

Example: opts.VariableOptions returns a collection of VariableImportOptions objects, one corresponding to each variable in the data.

Example: getvaropts(opts,'Height') returns the VariableImportOptions object for the Height variable.

Example: opts = setvaropts(opts,'Height','FillValue',0) sets the FillValue property for the variable Height to 0.

Location Properties

Sheet to read from, specified as an empty character array, a character vector or string scalar containing the sheet name, or a positive scalar integer denoting the sheet index. Based on the value specified for the Sheet property, the import function behaves as described in the table.

Specification Behavior
'' (default) Import data from the first sheet.
Name Import data from the matching sheet name, regardless of order of sheets in the spreadsheet file.
Integer Import data from sheet in the position denoted by the integer, regardless of the sheet names in the spreadsheet file.

Data Types: char | string | single | double

DataRange — Location of data

character vector | string scalar | positive scalar integer | array of positive scalar integers | cell array of character vector | string array

Location of data to be imported, specified as a character vector, string scalar, cell array of character vectors, string array, positive scalar integer or an N-by-2 array of positive scalar integers. Specify DataRange using one of these forms.

Specified by Behavior
'Cell' or _n_Starting Cell or Starting Row Specify the starting cell for the data, using Excel® A1 notation. For example, A5 is the identifier for the cell at the intersection of column A and row 5.Using the starting cell, the importing function automatically detects the extent of the data, by beginning the import at the start cell and ending at the last empty row or footer range.Alternatively, specify the first row containing the data using the positive scalar row index. Using the specified row index, the importing function automatically detects the extent of the data by reading from the specified first row to the end of the data or the footer range.Example: 'A5' or 5
'Corner1:Corner2'Rectangular Range Specify the exact range to read using the rectangular range form, where Corner1 and Corner2 are two opposing corners that define the region to read.The importing function only reads the data contained in the specified range. Any empty fields within the specified range are imported as missing cells.The number of columns must match the number specified in the NumVariables property.Example: 'A5:K50'
'Row1:Row2' or 'Column1:Column2'Row Range or Column Range Specify the range by identifying the beginning and ending rows using Excel row numbers. Using the specified row range, the importing function automatically detects the column extent by reading from the first nonempty column to the end of the data, and creates one variable per column.Example: '5:500'Alternatively, specify the range by identifying the beginning and ending columns using Excel column letters or numbers.Using the specified column range, the import function automatically detects the row extent by reading from the first nonempty row to the end of the data or the footer range.The number of columns in the specified range must match the number specified in the NumVariables property.Example: 'A:K'
[n1 n2; n3 n4;...] Multiple Row Ranges Specify multiple row ranges to read with an N-by-2 array containing N different row ranges.A valid array of multiple row ranges must: Specify line ranges in an increasing order, that is the first row range specified in the array appears in the file before the other row ranges.Contain only non-overlapping row ranges.Use of Inf is only supported to indicate the last range in the numeric array specifying multiple row ranges. For example, [1 3; 5 6; 8 Inf].Example: [1 3; 5 6; 8 Inf]
''Unspecified or Empty Do not fetch any data.Example: ''

Data Types: char | string | cell | single | double

RowNamesRange — Location of row names

character vector | string scalar | positive scalar integer | '' empty character array

Location of row names, specified as a character vector, string scalar, positive scalar integer, or an empty character array. Specify RowNamesRange as one of the values in this table.

Specified by Behavior
'Cell' Specify the starting cell for the data, using Excel A1 notation. For example, A5 is the identifier for the cell at the intersection of column A and row 5.The importing function identifies a name for each variable in the data.Example: 'A5'
'Corner1:Corner2'Rectangular Range Specify the exact range to read using the rectangular range form, where Corner1 and Corner2 are two opposing corners that define the region to read.The number of rows contained in RowNamesRange must match the number of data rows, and the range indicated by RowNamesRange must span only one column.Example: 'A5:A50'
'Row1:Row2'Row Range Specify range by identifying the beginning and ending rows using Excel row numbers.Row names must be in a single column.Example: '5:50'
_n_Number Index Specify the column containing the row names using a positive scalar column index. Example: 5
''Unspecified or Empty Indicate that there are no row names.Example: ''

Data Types: char | single | double

VariableNamesRange — Location of variable names

character vector | string scalar | positive scalar integer | '' empty character array

Location of variable names, specified as a character vector, string scalar, positive scalar integer, or an empty character array. Specify VariableNamesRange as one of the values in this table.

Specified by Behavior
'Cell' Specify the starting cell for the data, using Excel A1 notation. For example, A5 is the identifier for the cell at the intersection of column A and row 5.The importing function reads a name for each variable in the data.Example: 'A5'
'Corner1:Corner2'Rectangular Range Specify the exact range to read using the rectangular range form, where Corner1 and Corner2 are two opposing corners that define the region to read.The number of columns must match the number specified in the NumVariables property, and the range must span only one row.Example: 'A5:K5'
'Row1:Row2'Row Range Specify range by identifying the beginning and ending rows using Excel row numbers. Must be a single row.Example: '5:5'
_n_Number Index Specify the row containing the variable names using a positive scalar row index. Example: 5
''Unspecified or Empty Indicate that there are no variable names.Example: ''

Data Types: char | single | double

VariableDescriptionsRange — Location of variable descriptions

character vector | string scalar | '' empty character array

Location of variable descriptions, specified as a character vector, string scalar, positive scalar integer, or an empty character array. Specify VariableDescriptionRange as one of the values in this table.

Specified by Behavior
'Cell' Specify the starting cell for the data, using Excel A1 notation. For example, A5 is the identifier for the cell at the intersection of column A and row 5.The importing function reads a description for each variable in the data.Example: 'A5'
'Corner1:Corner2'Rectangular Range Specify the exact range to read using the rectangular range form, where Corner1 and Corner2 are two opposing corners that define the region to read.The number of columns must match the number specified in the NumVariables property, and the range must span only one row.Example: 'A5:K5'
'Row1:Row2'Row Range Specify range by identifying the beginning and ending rows using Excel row numbers. Must be a single row.Example: '5:5'
_n_Number Index Specify the row containing the descriptions using a positive scalar row index. Example: 5
''Unspecified or Empty Indicate that there are no variable descriptions.Example: ''

Data Types: char | string | single | double

VariableUnitsRange — Location of variable units

character vector | string scalar | positive scalar integer | '' empty character array

Location of variable units, specified as a character vector, string scalar, positive scalar integer, or an empty character array. Specify VariableUnitsRange as one of the values in this table.

Specified by Behavior
'Cell' Specify the starting cell for the data, using Excel A1 notation. For example, A5 is the identifier for the cell at the intersection of column A and row 5.The importing function reads a unit for each variable in the data.Example: 'A5'
'Corner1:Corner2'Rectangular Range Specify the exact range to read using the rectangular range form, where Corner1 and Corner2 are two opposing corners that define the region to read.The number of columns must match the number specified in the NumVariables property, and the range must span only one row.Example: 'A5:K5'
'Row1:Row2'Row Range Specify range by identifying the beginning and ending rows using Excel row numbers. Must be a single row.Example: '5:5'
_n_Number Index Specify the row containing the data units using a positive scalar row index. Example: 5
''Unspecified or Empty Indicate that there are no variable units.Example: ''

Data Types: char | string | single | double

Replacement Rules

Procedure to manage missing data, specified as one of the values in this table.

Missing Rule Behavior
'fill' Replace missing data with the contents of the FillValue property.The FillValue property is specified in the VariableImportOptions object of the variable being imported. For more information on accessing theFillValue property, see setvaropts.
'error' Stop importing and display an error message showing the missing record and field.
'omitrow' Omit rows that contain missing data.
'omitvar' Omit variables that contain missing data.

Example: opts.MissingRule = 'omitrow';

Data Types: char | string

Procedure to handle import errors, specified as one of the values in this table.

Import Error Rule Behavior
'fill' Replace the data where the error occurred with the contents of theFillValue property.TheFillValue property is specified in theVariableImportOptions object of the variable being imported. For more information on accessing theFillValue property, see setvaropts.
'error' Stop importing and display an error message showing the error-causing record and field.
'omitrow' Omit rows where errors occur.
'omitvar' Omit variables where errors occur.

Example: opts.ImportErrorRule = 'omitvar';

Data Types: char | string

Object Functions

Examples

collapse all

Define Import Options for Variable in Spreadsheet File

Create an import options object to read one variable from the file patients.xls.

Initialize a SpreadsheetImportOptions object, specify the variable name, variable type, and the data starting cell. Then, use the preview function with the import options object to preview the data.

opts = spreadsheetImportOptions; opts.VariableNames = 'LastName'; opts.VariableTypes = 'categorical'; opts.DataRange = 'A2'; preview('patients.xls',opts)

ans=8×1 table LastName ________

Smith   
Johnson 
Williams
Jones   
Brown   
Davis   
Miller  
Wilson  

Import the variable from the file.

oneVar = readtable('patients.xls',opts); whos oneVar

Name Size Bytes Class Attributes

oneVar 100x1 14309 table

Define Import Options for Multiple Variables in Spreadsheet File

Create an import options object to read multiple variables from patients.xls.

Based on the contents of your file, define the number of variables, variable names, variable types, and the data starting location.

numVars = 7; varNames = {'LastName','Gender','Age','Location','Height','Weight','Smoker'} ; varTypes = {'char','categorical','int32','char','double','double','logical'} ; dataStartLoc = 'A2';

Initialize the import options object opts using the spreadsheetImportOptions function and your variable information.

opts = spreadsheetImportOptions('NumVariables',numVars,... 'VariableNames',varNames,... 'VariableTypes',varTypes,... 'DataRange', dataStartLoc);

Preview the data using the preview function with the import options.

preview('patients.xls',opts)

ans=8×7 table LastName Gender Age Location Height Weight Smoker ____________ ______ ___ _____________________________ ______ ______ ______

{'Smith'   }    Male      38     {'County General Hospital'  }      71       176      true  
{'Johnson' }    Male      43     {'VA Hospital'              }      69       163      false 
{'Williams'}    Female    38     {'St. Mary's Medical Center'}      64       131      false 
{'Jones'   }    Female    40     {'VA Hospital'              }      67       133      false 
{'Brown'   }    Female    49     {'County General Hospital'  }      64       119      false 
{'Davis'   }    Female    46     {'St. Mary's Medical Center'}      68       142      false 
{'Miller'  }    Female    33     {'VA Hospital'              }      64       142      true  
{'Wilson'  }    Male      40     {'VA Hospital'              }      68       180      false 

Import the data using readtable.

T = readtable('patients.xls',opts); whos T

Name Size Bytes Class Attributes

T 100x7 33987 table

Version History

Introduced in R2016b

expand all

R2018b: Create options object using spreadsheetImportOptions function

Use the spreadsheetImportOptions function to create aSpreadsheetImportOptions object. Previously, you could create this object only by using the detectImportOptions function.