fixedWidthImportOptions
Import options object for fixed-width text files
Description
A FixedWidthImportOptions object enables you to specify how MATLAB® imports fixed-width tabular data from text files. The object contains properties that control the data import process, including the handling of errors and missing data.
Creation
You can create a FixedWidthImportOptions object using either the fixedWidthImportOptions function (described here) or the detectImportOptions function:
- Use fixedWidthImportOptions to define the import properties based on your import requirements.
- Use detectImportOptions to detect and populate the import properties based on the contents of the fixed-width text file specified in filename.
opts = detectImportOptions(filename)
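As a rough illustration of the detection route, the sketch below writes a small fixed-width file and lets detectImportOptions infer the layout. The sample data and the temporary file name are made up for this illustration. The 'FileType','fixedwidth' name-value pair is added on the assumption that the file should not be left to automatic format detection. The detected names, widths, and types depend on the file contents, so inspect them before importing.

```matlab
% Hypothetical sample data written to a temporary file.
filename = [tempname '.txt'];
fid = fopen(filename,'w');
fprintf(fid,'%-10s%-7s%-4s\n','LastName','Gender','Age');
fprintf(fid,'%-10s%-7s%-4d\n','Smith','Male',38);
fprintf(fid,'%-10s%-7s%-4d\n','Johnson','Female',43);
fclose(fid);

% Ask MATLAB to treat the file as fixed-width and detect its layout.
opts = detectImportOptions(filename,'FileType','fixedwidth');

% Inspect what was detected before passing opts to readtable.
disp(class(opts))
disp(opts.VariableNames)
disp(opts.VariableWidths)

delete(filename)
```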
Syntax
Description
opts = fixedWidthImportOptions creates a FixedWidthImportOptions object with one variable.
opts = fixedWidthImportOptions('NumVariables',numVars) creates the object with the number of variables specified in numVars.
opts = fixedWidthImportOptions(___,Name,Value) specifies additional properties for the FixedWidthImportOptions object using one or more name-value pair arguments.
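The three syntaxes can be sketched as follows. The variable names 'Id', 'Name', and 'Score' and the widths are hypothetical values chosen only for illustration.

```matlab
% Syntax 1: default object with a single variable.
opts1 = fixedWidthImportOptions;

% Syntax 2: object with five variables.
opts2 = fixedWidthImportOptions('NumVariables',5);

% Syntax 3: set further properties with name-value pairs
% (hypothetical names and widths).
opts3 = fixedWidthImportOptions('NumVariables',3, ...
    'VariableNames',{'Id','Name','Score'}, ...
    'VariableWidths',[4 12 6]);

% Compare the number of variables each object describes.
fprintf('%d, %d, %d variables\n', ...
    numel(opts1.VariableNames), ...
    numel(opts2.VariableNames), ...
    numel(opts3.VariableNames));
```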
Input Arguments
Number of variables, specified as a positive scalar integer.
Properties
Variable Properties
Data Types: char | string | cell
Flag to preserve variable names, specified as either "modify" or "preserve".
- "modify": Convert invalid variable names (as determined by the isvarname function) to valid MATLAB identifiers.
- "preserve": Preserve variable names that are not valid MATLAB identifiers, such as variable names that include spaces and non-ASCII characters.
Starting in R2019b, variable names and row names can include any characters, including spaces and non-ASCII characters. Also, they can start with any characters, not just letters. Variable and row names do not have to be valid MATLAB identifiers (as determined by the isvarname function). To preserve these variable names and row names, set the value of VariableNamingRule to "preserve". Variable names are not refreshed when the value of VariableNamingRule is changed from "modify" to "preserve".
Data Types: char | string
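A minimal sketch of the naming rule, assuming the name 'Last Name' as a hypothetical example of an invalid identifier. Because names are not refreshed when the rule changes, the sketch sets the rule first and assigns the names afterwards.

```matlab
% isvarname reports whether a name is a valid MATLAB identifier.
disp(isvarname('LastName'))    % valid identifier
disp(isvarname('Last Name'))   % contains a space, so not valid

opts = fixedWidthImportOptions('NumVariables',2);

% Set the rule before assigning names that are not valid identifiers.
opts.VariableNamingRule = 'preserve';
opts.VariableNames = {'Last Name','Age'};   % hypothetical names

disp(opts.VariableNames)
```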
Field widths of variables in a fixed-width text file, specified as a vector of positive integer values. Each positive integer in the vector corresponds to the number of characters in a field that makes up the variable. The VariableWidths property contains an entry corresponding to each variable specified in the VariableNames property.
Data Types: uint16 | uint32 | uint64 | char | string | cell
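To make the meaning of field widths concrete, the sketch below slices one line of text by hand. It only illustrates how widths map to character positions and is not a description of how MATLAB implements the import. The widths and the sample line are hypothetical.

```matlab
widths = [10 7 4];   % hypothetical field widths, in characters

% Build one fixed-width line: each field is padded to its width.
textLine = sprintf('%-10s%-7s%-4s','Smith','Male','38');

% Field k occupies characters starts(k) through stops(k).
stops  = cumsum(widths);
starts = stops - widths + 1;

fields = cell(1,numel(widths));
for k = 1:numel(widths)
    fields{k} = strtrim(textLine(starts(k):stops(k)));
end

disp(starts)
disp(fields)
```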
Location Properties
Data Types: single | double | uint8 | uint16 | uint32 | uint64
Data Types: single | double | uint8 | uint16 | uint32 | uint64
Data Types: single | double | uint8 | uint16 | uint32 | uint64
Data Types: single | double | uint8 | uint16 | uint32 | uint64
Data Types: single | double | uint8 | uint16 | uint32 | uint64
Delimited Text Properties
Data Types: char | string
Replacement Rules
Data Types: char | string
Data Types: char | string
Data Types: char | string
Data Types: char | string
Data Types: char | string
Object Functions
Examples
Examine a fixed-width formatted text file, initialize an import options object, and use the object to import the table from the text file.
Load and Preview Fixed-Width Text File
Load the file fixed_width_patients_subset_perfect.txt and preview its contents in a text editor. The preview shows that the file contains fixed-width formatted data.
filename = 'fixed_width_patients_subset_perfect.txt';
Examine and Extract Properties of Fixed-Width File
The fixed-width text file has tabular data organized by starting location, number of variables, variable names, and variable widths. Capture these properties and the desired data type for the variables.
DataStartLine = 2;
NumVariables = 7;
VariableNames = {'LastName','Gender','Age','Location','Height',...
'Weight','Smoker'};
VariableWidths = [ 10, 7, 4, 26, 7, ...
7, 7 ] ;
DataType = {'char','categorical','double','char','double',...
'double','logical'};
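Before building the options object, it can help to confirm that the captured settings agree with each other. The following is a small sketch of such a check using the values above; the assertion messages and the lineWidth variable are additions for illustration.

```matlab
NumVariables   = 7;
VariableNames  = {'LastName','Gender','Age','Location','Height', ...
                  'Weight','Smoker'};
VariableWidths = [10 7 4 26 7 7 7];
DataType       = {'char','categorical','double','char','double', ...
                  'double','logical'};

% Each per-variable setting should have one entry per variable.
assert(numel(VariableNames)  == NumVariables,'Expected one name per variable.');
assert(numel(VariableWidths) == NumVariables,'Expected one width per variable.');
assert(numel(DataType)       == NumVariables,'Expected one type per variable.');

% Number of characters that a complete data line occupies.
lineWidth = sum(VariableWidths);
fprintf('Each complete line spans %d characters.\n',lineWidth);
```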
Initialize and Configure FixedWidthImportOptions Object
Initialize a FixedWidthImportOptions object and configure its properties to match the properties of the data in fixed_width_patients_subset_perfect.txt.
opts = fixedWidthImportOptions('NumVariables',NumVariables,...
    'DataLines',DataStartLine,...
    'VariableNames',VariableNames,...
    'VariableWidths',VariableWidths,...
    'VariableTypes',DataType);
Import Table
Use readtable with the FixedWidthImportOptions object to import the table.
T = readtable(filename,opts)
T=10×7 table
      LastName      Gender    Age              Location               Height    Weight    Smoker
    ____________    ______    ___    _____________________________    ______    ______    ______
    {'Smith'   }    Male      38     {'County General Hospital'  }    71        176       true
    {'Johnson' }    Male      43     {'VA Hospital'              }    69        163       false
    {'Williams'}    Female    38     {'St. Mary's Medical Center'}    64        131       false
    {'Brown'   }    Female    49     {'County General Hospital'  }    64        119       false
    {'Miller'  }    Female    33     {'VA Hospital'              }    64        142       true
    {'Wilson'  }    Male      40     {'VA Hospital'              }    68        180       false
    {'Taylor'  }    Female    31     {'County General Hospital'  }    66        132       false
    {'Thomas'  }    Female    42     {'St. Mary's Medical Center'}    66        137       false
    {'Jackson' }    Male      25     {'VA Hospital'              }    71        174       false
    {'Clark'   }    Female    48     {'VA Hospital'              }    65        133       false
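If the example file is not available, the same workflow can be tried on a file you generate yourself. The sketch below is a reduced version with three variables; the temporary file and its contents are hypothetical stand-ins for the patients file.

```matlab
% Write a tiny fixed-width file (hypothetical data) to a temporary location.
filename = [tempname '.txt'];
fid = fopen(filename,'w');
fprintf(fid,'%-10s%-7s%-4s\n','LastName','Gender','Age');   % header line
fprintf(fid,'%-10s%-7s%-4d\n','Smith','Male',38);
fprintf(fid,'%-10s%-7s%-4d\n','Johnson','Female',43);
fclose(fid);

% Describe the layout: data starts on line 2, three fields per line.
opts = fixedWidthImportOptions('NumVariables',3, ...
    'DataLines',2, ...
    'VariableNames',{'LastName','Gender','Age'}, ...
    'VariableWidths',[10 7 4], ...
    'VariableTypes',{'char','categorical','double'});

T = readtable(filename,opts);
disp(T)

delete(filename)
```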
Define an import options object to import messy data from a fixed-width formatted text file. Configure the object to handle the messy data and use it to import the table.
Load and Preview Fixed-Width Text File
Load the file fixed_width_patients_subset_messy.txt and preview its contents in a text editor. The preview shows that the file contains:
- Empty lines – Lines 7, 12, and 13
- An extra column – Column 8
- Missing data – Lines 1, 4, 9 and 11
- Partial fields – Last 3 rows
filename = 'fixed_width_patients_subset_messy.txt';
Examine and Capture Properties of Fixed-Width File
The fixed-width text file has tabular data organized by the starting location, number of variables, variable names, and variable widths. Capture these properties and the data type you want to use for the variables.
DataStartLine = 2;
NumVariables = 7;
VariableNames = {'LastName','Gender','Age','Location','Height',...
'Weight','Smoker'};
VariableWidths = [ 10, 7, 4, 26, 7, ...
7, 7 ] ;
DataType = {'char','categorical','double','char','double',...
'double','logical'};
Initialize FixedWidthImportOptions Object and Set Up Variable Properties
Initialize a FixedWidthImportOptions object and configure its properties to match the properties of the data.
opts = fixedWidthImportOptions('NumVariables',NumVariables,...
    'DataLines',DataStartLine,...
    'VariableNames',VariableNames,...
    'VariableWidths',VariableWidths,...
    'VariableTypes',DataType);
Set Up EmptyLineRule, MissingRule, and ExtraColumnsRule
Read the empty lines in the data by setting the EmptyLineRule to 'read'. Next, fill the missing instances with predefined values by setting the MissingRule to 'fill'. Finally, to ignore the extra column during the import, set the ExtraColumnsRule to 'ignore'. For more information on the properties and their values, see the documentation for FixedWidthImportOptions.
opts.EmptyLineRule = 'read';
opts.MissingRule = 'fill';
opts.ExtraColumnsRule = 'ignore';
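With MissingRule set to 'fill', the value written into a missing field comes from the options of the individual variable. As a sketch, the setvaropts and getvaropts functions (not used elsewhere in this example) can change and inspect that value. The three-variable layout and the fill value of -1 are hypothetical.

```matlab
% Hypothetical three-variable layout.
opts = fixedWidthImportOptions('NumVariables',3, ...
    'VariableNames',{'LastName','Age','Weight'}, ...
    'VariableWidths',[10 4 7], ...
    'VariableTypes',{'char','double','double'});

opts.EmptyLineRule = 'read';       % import empty lines as rows
opts.MissingRule = 'fill';         % replace missing fields with a fill value
opts.ExtraColumnsRule = 'ignore';  % drop columns beyond the defined ones

% Use -1 instead of the default fill value for the Age variable.
opts = setvaropts(opts,'Age','FillValue',-1);

% Confirm the per-variable setting.
ageOpts = getvaropts(opts,'Age');
disp(ageOpts.FillValue)
```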
Set Up PartialFieldRule
Partial fields occur when the importing function reaches the end-of-line character before the full variable width is traversed. For example, consider the last three rows of the file fixed_width_patients_subset_messy.txt. In the last row of the last column, the end-of-line character appears two characters after the start of the field, before the full variable width of three is reached.
A partial field can sometimes indicate an error. Therefore, use the PartialFieldRule to decide how to handle this data. To keep the partial field data and convert it to the appropriate data type, set the PartialFieldRule to 'keep'. For more information on the PartialFieldRule, see the documentation for FixedWidthImportOptions.
opts.PartialFieldRule = 'keep';
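The idea of a partial field can be sketched with plain string arithmetic: a line is partial when it ends before the combined field widths are used up. The widths and the sample line below are hypothetical, and the check is only an illustration, not the logic MATLAB uses internally.

```matlab
widths = [10 4 7];   % hypothetical field widths

% A line whose last field holds only 2 of its 7 characters.
textLine = sprintf('%-10s%-4s%s','White','39','12');

% The line is partial if it is shorter than the combined field widths.
isPartial = numel(textLine) < sum(widths);

% Index of the first field whose end lies beyond the end of the line.
cutField = find(cumsum(widths) > numel(textLine),1);

fprintf('Partial line: %d, truncated field: %d\n',isPartial,cutField);

% Keep such fields and convert whatever characters are present.
opts = fixedWidthImportOptions('NumVariables',3,'VariableWidths',widths);
opts.PartialFieldRule = 'keep';
```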
Import Table
Import the table by using the readtable function with the FixedWidthImportOptions object, and preview the data.
T = readtable(filename,opts)
T=15×7 table
      LastName        Gender       Age              Location               Height    Weight    Smoker
    ____________    ___________    ___    _____________________________    ______    ______    ______
    {'Smith'   }    Male           38     {'County General Hospital'  }    71        176       true
    {'Johnson' }    Male           43     {'VA Hospital'              }    69        163       false
    {'Williams'}    Female         38     {'St. Mary's Medical Center'}    NaN       NaN       false
    {'Jones'   }    Female         40     {'VA Hospital'              }    67        133       false
    {'Brown'   }    Female         49     {'County General Hospital'  }    64        119       false
    {0×0 char  }    <undefined>    NaN    {0×0 char                   }    NaN       NaN       false
    {'Wilson'  }    Male           40     {'VA Hospital'              }    68        180       false
    {'Moore'   }    Male           28     {'St. Mary's Medical Center'}    NaN       183       false
    {'Taylor'  }    Female         31     {'County General Hospital'  }    66        132       false
    {'Anderson'}    Female         45     {'County General Hospital'  }    68        NaN       false
    {0×0 char  }    <undefined>    NaN    {0×0 char                   }    NaN       NaN       false
    {0×0 char  }    <undefined>    NaN    {0×0 char                   }    NaN       NaN       false
    {'White'   }    Male           39     {'VA Hospital'              }    72        2         false
    {'Harris'  }    Female         36     {'St. Mary's Medical Center'}    65        12        false
    {'Martin'  }    Male           48     {'VA Hospital'              }    71        181       true
Version History
Introduced in R2016b
Use the fixedWidthImportOptions function to create a FixedWidthImportOptions object. Previously, you could create this object only by using the detectImportOptions function.