readtable - Create table from file - MATLAB (original) (raw)

Syntax

Description

[T](#btx%5F238-1%5Fsep%5Fshared-T) = readtable([filename](#mw%5Fb0826a11-ea6e-45f6-92ce-54960958581f)) creates a table by reading column-oriented data from a text file, spreadsheet (includingMicrosoft® Excel®) file, XML file, HTML file, or a Microsoft Word document. readtable detects elements of your data, such as delimiter and data types, to determine how to import your data.

example

[T](#btx%5F238-1%5Fsep%5Fshared-T) = readtable([filename](#mw%5Fb0826a11-ea6e-45f6-92ce-54960958581f),[Name=Value](#namevaluepairarguments)) specifies options using one or more name-value arguments. For example, you can read the first row of the file as variable names or as data by using the ReadVariableNames name-value argument.

example

[T](#btx%5F238-1%5Fsep%5Fshared-T) = readtable([filename](#mw%5Fb0826a11-ea6e-45f6-92ce-54960958581f),[opts](#mw%5Fda8f54b7-5bd3-4640-9eb5-715f76a3a7f0)) creates a table using the options specified by the import options objectopts. Use an import options object to configure howreadtable interprets your file. Compared to name-value arguments, an import options object provides more control, better performance, and reusability of the file import configuration.

example

[T](#btx%5F238-1%5Fsep%5Fshared-T) = readtable([filename](#mw%5Fb0826a11-ea6e-45f6-92ce-54960958581f),[opts](#mw%5Fda8f54b7-5bd3-4640-9eb5-715f76a3a7f0),[Name=Value](#namevaluepairarguments)) creates a table using both an import options object and name-value arguments. If you specify name-value arguments in addition to an import options object, then readtable supports only the ReadVariableNames, ReadRowNames,DateLocale, and Encoding name-value arguments for text files, and the ReadVariableNames, ReadRowNames,Sheet, and UseExcel name-value arguments for spreadsheet files.

Examples

collapse all

Import the contents of a text file into a table. The resulting table contains one variable for each column in the file and uses the entries in the first line of the file as variable names.

File contents of myCsvTable.dat

T = readtable("myCsvTable.dat")

T=5×6 table LastName Gender Age Height Weight Smoker ____________ ______ ___ ______ ______ ______

{'Smith'   }    {'M'}     38       71       176        1   
{'Johnson' }    {'M'}     43       69       163        0   
{'Williams'}    {'F'}     38       64       131        0   
{'Jones'   }    {'F'}     40       67       133        0   
{'Brown'   }    {'F'}     49       64       119        0   

Create a table from a text file that contains data gaps. By default, readtable fills the gaps with the appropriate missing values.

T = readtable("headersAndMissing.txt")

T=5×6 table LastName Gender Age Height Weight Smoker ___________ __________ ___ ______ ______ ______

{'Wu'     }    {'M'     }     38      71       176        1   
{'Johnson'}    {'M'     }     43      69       163        0   
{'Sanchez'}    {'F'     }     38      64       131        0   
{'Brown'  }    {'F'     }    NaN      67       133        0   
{'Picard' }    {0×0 char}    NaN      64       119        0   

To omit the rows with the data gaps, specify the MissingRule name-value argument.

T = readtable("headersAndMissing.txt",MissingRule="omitrow")

T=3×6 table LastName Gender Age Height Weight Smoker ___________ ______ ___ ______ ______ ______

{'Wu'     }    {'M'}     38       71       176        1   
{'Johnson'}    {'M'}     43       69       163        0   
{'Sanchez'}    {'F'}     38       64       131        0   

Configure how readtable interprets your file using an import options object. For example, use an import options object to read only a subset of a text file.

First, create an import options object by using detectImportOptions to detect aspects of your text file, including variable names and types, delimiters, and white-space characters. In this case, detectImportOptions creates a DelimitedTextImportOptions object.

opts = detectImportOptions("airlinesmall.csv")

opts = DelimitedTextImportOptions with properties:

Format Properties: Delimiter: {','} Whitespace: '\b\t ' LineEnding: {'\n' '\r' '\r\n'} CommentStyle: {} ConsecutiveDelimitersRule: 'split' LeadingDelimitersRule: 'keep' TrailingDelimitersRule: 'ignore' EmptyLineRule: 'skip' Encoding: 'ISO-8859-1'

Replacement Properties: MissingRule: 'fill' ImportErrorRule: 'fill' ExtraColumnsRule: 'addvars'

Variable Import Properties: Set types by name using setvartype VariableNames: {'Year', 'Month', 'DayofMonth' ... and 26 more} VariableTypes: {'double', 'double', 'double' ... and 26 more} SelectedVariableNames: {'Year', 'Month', 'DayofMonth' ... and 26 more} VariableOptions: Show all 29 VariableOptions Access VariableOptions sub-properties using setvaropts/getvaropts VariableNamingRule: 'modify'

Location Properties: DataLines: [2 Inf] VariableNamesLine: 1 RowNamesColumn: 0 VariableUnitsLine: 0 VariableDescriptionsLine: 0 To display a preview of the table, use preview

Specify the subset of variables to import by modifying the import options object. Then, import the subset of data using readtable with the import options object.

opts.SelectedVariableNames = ["TaxiIn","TaxiOut"]; T = readtable("airlinesmall.csv",opts);

Create a table from a spreadsheet that contains variable names in the first row and row names in the first column. Display the first five rows and first four variables of the table.

T = readtable("patients.xls",ReadRowNames=true); T(1:5,1:4)

ans=5×4 table Gender Age Location Height __________ ___ _____________________________ ______

Smith       {'Male'  }    38     {'County General Hospital'  }      71  
Johnson     {'Male'  }    43     {'VA Hospital'              }      69  
Williams    {'Female'}    38     {'St. Mary's Medical Center'}      64  
Jones       {'Female'}    40     {'VA Hospital'              }      67  
Brown       {'Female'}    49     {'County General Hospital'  }      64  

Create a table using data from a specified region of a spreadsheet. Use the data from the 5-by-3 rectangular region between the corners C2 and E6. Do not use the first row of this region as variable names. The resulting table uses the default variable names instead.

T = readtable("patients.xls",Range="C2:E6",ReadVariableNames=false)

T=5×3 table Var1 Var2 Var3 ____ _____________________________ ____

 38     {'County General Hospital'  }     71 
 43     {'VA Hospital'              }     69 
 38     {'St. Mary's Medical Center'}     64 
 40     {'VA Hospital'              }     67 
 49     {'County General Hospital'  }     64 

Configure how readtable interprets your file using an import options object. For example, use an import options object to read only specified variables from a spreadsheet file.

First, create an import options object from a file by using detectImportOptions to detect aspects of your spreadsheet file, including variable names and types. In this case, detectImportOptions creates a SpreadsheetImportOptions object.

opts = detectImportOptions("patients.xls")

opts = SpreadsheetImportOptions with properties:

Sheet Properties: Sheet: ''

Replacement Properties: MissingRule: 'fill' ImportErrorRule: 'fill' MergedCellColumnRule: 'placeleft' MergedCellRowRule: 'placetop'

Variable Import Properties: Set types by name using setvartype VariableNames: {'LastName', 'Gender', 'Age' ... and 7 more} VariableTypes: {'char', 'char', 'double' ... and 7 more} SelectedVariableNames: {'LastName', 'Gender', 'Age' ... and 7 more} VariableOptions: Show all 10 VariableOptions Access VariableOptions sub-properties using setvaropts/getvaropts VariableNamingRule: 'modify'

Range Properties: DataRange: 'A2' (Start Cell) VariableNamesRange: 'A1' RowNamesRange: '' VariableUnitsRange: '' VariableDescriptionsRange: '' To display a preview of the table, use preview

Specify which variables to import by modifying the import options object. Then, import the specified variables using readtable with the import options object. Display the first 5 rows of the table.

opts.SelectedVariableNames = ["Systolic","Diastolic"]; T = readtable("patients.xls",opts); T(1:5,:)

ans=5×2 table Systolic Diastolic ________ _________

  124          93    
  109          77    
  125          83    
  117          75    
  122          80    

Import the contents of an XML file into a table.

The students.xml file has seven sibling nodes named Student, which each contain the same child nodes and attributes.

18 Freshman
591 Spring Lane Natick MA
Computer Science English Literature 18 Freshman
4641 Pearl Street San Francisco CA
Microbiology Public Health 21 Senior
30 Highland Road Detriot MI
Political Science 19 Sophomore
3388 Moore Avenue Fort Worth TX
Business Japanese Language 20 Sophomore
3388 Moore Avenue Fort Worth TX
Business Japanese Language 22 Senior
22 Angie Drive Los Angeles CA
Mechanical Engineering Architecture 21 Junior
65 Decatur Lane Trenton ME
Economics Art History

First, create an XMLImportOptions object by using detectImportOptions to detect aspects of your XML file. Read just the street names into a table by specifying the VariableSelectors name-value argument as the XPath expression of the Street element node. Register a custom namespace prefix to the existing namespace URL by setting the RegisteredNamespaces name-value argument.

opts = detectImportOptions("students.xml",RegisteredNamespaces=["myPrefix","https://www.mathworks.com"], ... VariableSelectors="//myPrefix:Street");

Then, import the specified variable using readtable with the import options object.

T = readtable("students.xml",opts)

T=7×1 table Street
___________________

"591 Spring Lane"  
"4641 Pearl Street"
"30 Highland Road" 
"3388 Moore Avenue"
"3388 Moore Avenue"
"22 Angie Drive"   
"65 Decatur Lane"  

Import a table from a Microsoft Word document into a table in MATLAB. In this case, the document contains two tables, and the second table contains merged cells. Read the second table by setting the TableIndex name-value argument. Skip rows that have cells with merged columns by setting the MergedCellColumnRule name-value argument.

filename = "MaintenanceReport.docx"; T = readtable(filename,TableIndex=2,MergedCellColumnRule="omitrow")

T=3×5 table Description Category Urgency Resolution Cost
_____________________________________________________________________ ____________________ ________ __________________ ________

"Items are occasionally getting stuck in the scanner spools."            "Mechanical Failure"    "Medium"    "Readjust Machine"    "$45"   
"Loud rattling and banging sounds are coming from assembler pistons."    "Mechanical Failure"    "Medium"    "Readjust Machine"    "$35"   
"There are cuts to the power when starting the plant."                   "Electronic Failure"    "High"      "Full Replacement"    "$16200"

Alternatively, you can select a table with an XPath selector by using the TableSelector name-value argument. To select the Microsoft Word document table that contains the text "Description", use the XPath selector "//w:tbl[contains(.,'Description')]".

T = readtable(filename, ... TableSelector="//w:tbl[contains(.,'Description')]", ... MergedCellColumnRule="omitrow")

T=3×5 table Description Category Urgency Resolution Cost
_____________________________________________________________________ ____________________ ________ __________________ ________

"Items are occasionally getting stuck in the scanner spools."            "Mechanical Failure"    "Medium"    "Readjust Machine"    "$45"   
"Loud rattling and banging sounds are coming from assembler pistons."    "Mechanical Failure"    "Medium"    "Readjust Machine"    "$35"   
"There are cuts to the power when starting the plant."                   "Electronic Failure"    "High"      "Full Replacement"    "$16200"

Import the first table from the URL https://www.mathworks.com/help/matlab/text-files.html that contains the text "readtable" using the XPath selector "//TABLE[contains(.,'readtable')]". The table does not have a header row, so set the ReadVariableNames name-value argument to false.

url = "https://www.mathworks.com/help/matlab/text-files.html"; T = readtable(url,TableSelector="//TABLE[contains(.,'readtable')]", ... ReadVariableNames=false)

T=4×2 table Var1 Var2
________________ ____________________________

"readtable"         "Create table from file"    
"writetable"        "Write table to file"       
"readtimetable"     "Create timetable from file"
"writetimetable"    "Write timetable to file"   

Input Arguments

collapse all

Name of the file to read, specified as a string scalar or character vector.readtable supports reading data from text, spreadsheet, XML, Microsoft Word, and HTML files.

If filename does not include an extension, use theFileType name-value argument to indicate the file format. By default,readtable creates variables that have data types that are appropriate for the data values detected in each column of the input file.

Depending on the location of your file, filename can take one of these forms.

Location Form
Current folder or folder on the MATLAB® path Specify the name of the file infilename.Example: "myFile.txt"
File in a folder If the file is not in the current folder or in a folder on the MATLAB path, then specify the full or relative pathname infilename.Example: "C:\myFolder\myFile.xlsx"Example: "dataDir\myFile.txt"
Internet URL If the file is specified as an internet uniform resource locator (URL), thenfilename must contain the protocol type "http://" or "https://".Example: "http://hostname/path\_to\_file/my\_data.csv"
Remote location If the file is stored at a remote location, then filename must contain the full path of the file specified with the form:scheme_name://path_to_file/_my_file.ext_Based on the remote location,scheme_name can be one of the values in this table. Remote Locationscheme_nameAmazon S3™s3Windows Azure® Blob Storagewasb, wasbsHDFS™hdfsFor more information, see Work with Remote Data.Example: "s3://bucketname/path_to_file/my_file.csv"Example: "wasbs://path_to_file/my_file.csv"Example: "hdfs:///path_to_file/my_file.csv"

Text Files

For commonly used text file workflows, see Import Data from Text File to Table.

Spreadsheet Files

For commonly used spreadsheet file workflows, see Read Spreadsheet Data into Table.

XML Files

(since R2021a)

Microsoft Word Documents

(since R2021b)

HTML Files

(since R2021b)

Compressed and Archived Files

(since R2025a)

File import options, specified as one of the import options objects in the table, created by either the detectImportOptions function or the associated import options function. The import options object contains properties that configure the data import process. readtable uses only the relevant properties of each import options object.

For more information on how to control your import, see Control How MATLAB Imports Your Data.

Name-Value Arguments

expand all

Specify optional pairs of arguments asName1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Example: readtable(filename,ReadVariableNames=false) specifies to read the first row of the file as data instead of variable names.

Data and Header Location

expand all

Number of header lines to skip at the beginning of the file, specified as a nonnegative integer. If you do not specify this name-value argument, readtable automatically detects the number of lines to skip.

Reading of variable names and data begins with the first nonheader line.

Data Types: single | double

Range to read from the file, specified as a string scalar, character vector, or numeric vector in one of these forms.

Ways to Specify Range Description
"Cell" or_[row col]_Starting element Specify the starting element for the data as one of these values: String scalar or character vector containing a column letter and row number using spreadsheet A1 notation. For example, A5 is the identifier for the element at the intersection of column A and row5. Two-element numeric vector of the form [row col] indicating the starting row and column.Using the starting element, readtable automatically detects the extent of the data by beginning the import at the starting element and ending at the last empty row.Example: "A5"Example: [5 1]
"Corner1:Corner2" or_[r1 c1 r2 c2]_Rectangular range Specify the rectangular range for the data as one of these values: String scalar or character vector of the form"Corner1:Corner2", where_Corner1_ and Corner2 are two opposing corners that define the region. For example, "D2:H4" represents the 3-by-5 rectangular region between the two cornersD2 and H4 in the file. TheRange name-value argument, which uses spreadsheet A1 notation, is not case sensitive.Four-element numeric vector of the form [r1 c1 r2 c2] indicating the start row, start column, end row, and end column. For example, [2 3 15 13] represents the 14-by-11 rectangular region between the 2nd and 15th rows and the 3rd and 13th columns in the file.readtable reads only the data contained in the specified range. Any empty fields within the specified range are imported as missing values.The number of columns must match the number specified in theExpectedNumVariables name-value argument.Example: "D2:H4"Example: [2 3 15 13]
"Row1:Row2"Row range Specify the beginning and ending rows using row numbers in a string scalar or character vector of the form "Row1:Row2".Using the specified row range, readtable automatically detects the column extent by reading from the first nonempty column to the end of the data, and creates one variable per column.Example: "1:7" reads all columns in rows 1 through 7 (inclusive).
"Column1:Column2"Column range Specify the beginning and ending columns using A1 notation column letters in a string scalar or character vector of the form"Column1:Column2". Using the specified column range, readtable automatically detects the row extent by reading from the first nonempty row to the end of the data.The number of columns must match the number specified in theExpectedNumVariables name-value argument.Example: "A:F" reads all rows in columns A through F (inclusive).
"NamedRange"Named range (spreadsheet only) You can create names to identify ranges in a spreadsheet. For instance, you can select a rectangular portion of the spreadsheet and call it"myTable". If a spreadsheet has a named range, thenreadtable can read that range using its name.
""Unspecified or empty If you do not specify this name-value argument, readtable automatically detects the used range.Note: Used range refers to the rectangular portion of the file that actually contains data. readtable automatically detects the used range by trimming any leading and trailing rows and columns that do not contain data. Text that is only white space is considered data and is captured within the used range.

Location of the data, specified as a string scalar, character vector, positive integer, or N-by-2 array of positive integers in one of these forms.

Ways to Specify DataRange Description
_"Cell"_Starting cell Specify the starting cell for the data as a string scalar or character vector containing a column letter and row number, using A1 notation. For example,A5 is the identifier for the cell at the intersection of columnA and row 5.Using the starting cell, readtable automatically detects the extent of the data by beginning the import at the starting cell and ending at the last empty row or footer range.Example: "A5"
_n_Starting row Specify the starting row containing the data using the positive row index.Using the specified row index, readtable automatically detects the extent of the data by reading from the specified first row to the end of the data or the footer range.Example: 5
"Corner1:Corner2"Rectangular range Specify the range using the form"Corner1:Corner2", where_Corner1_ and Corner2 are two opposing corners that define the region.readtable reads only the data contained in the specified range. Any empty fields within the specified range are imported as missing values.Example: "A5:K50"
"Row1:Row2" Row range Specify the beginning and ending rows using row numbers in a string scalar or character vector of the form"Row1:Row2".Using the specified row range, readtable automatically detects the column extent by reading from the first nonempty column to the end of the data, and creates one variable per column.Example: "5:500"
"Column1:Column2"Column range Specify the beginning and ending columns using A1 notation column letters in a string scalar or character vector of the form"Column1:Column2".Using the specified column range, readtable automatically detects the row extent by reading from the first nonempty row to the end of the data.Example: "A:K"
[n1 n2; n3 n4; ...] Multiple row ranges Specify multiple row ranges using anN-by-2 array containing N different row ranges.A valid array of multiple row ranges must: Specify line ranges in an increasing order.Contain only non-overlapping row ranges.Use of Inf is supported only for the last row range in the numeric array.Example: [1 3; 5 6; 8 Inf]
""Empty Do not read any data.Example: ""

Worksheet to read, specified as a positive integer indicating the worksheet index or a string scalar or character vector containing the worksheet name. By default,readtable reads the first sheet.

If you specify a string scalar or character vector, the worksheet name cannot contain a colon (:). To determine the names of sheets in a spreadsheet file, usesheets = sheetnames(filename). For more information, see sheetnames.

If you specify the Sheet argument in addition to [opts](readtable.html#mw%5Fda8f54b7-5bd3-4640-9eb5-715f76a3a7f0), then thereadtable function uses the specified value for Sheet, overriding the sheet name defined in the import options.

Example: 2

Example: "MySheetName"

Index of the table to read from a file containing multiple tables, specified as a positive integer. By default, readtable reads the first table.

If you specify TableIndex, the readtable function automatically sets [TableSelector](readtable.html#mw%5Ffba2c992-4db3-49ab-9727-9356cbb8ade5) to the equivalent XPath expression.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

XPath expression for the table to read, specified as a string scalar or character vector. You must specify TableSelector as a valid XPath version 1.0 expression.

Selection Operation Syntax
Select every node whose name matches the node you want to select, regardless of its location in the document. Prefix the name with two forward slashes (//).
Select the value of an attribute belonging to an element node. Prefix the attribute with an at sign (@).
Select a specific node in a set of nodes. Provide the index of the node you want to select in square brackets ([]).
Specify precedence of operations. Add parentheses around the expression you want to evaluate first.

If you do not specify this name-value argument, readtable detects the table location.

Example: "//table[1]"

XML node name for the table data to read, specified as a string scalar or character vector.

Rule for cells merged across columns, specified as one of the values in this table.

Import Rule Behavior
"placeleft" Place the data in the leftmost cell and fill the remaining cells with the contents of the FillValue property based on the value ofMissingRule. If MissingRule is"fill" (default), then fill cells usingFillValue.You can set the FillValue property in the VariableImportOptions object of the variable being imported. For more information on setting the FillValue property, see setvaropts.
"placeright" Place the data in the rightmost cell and fill the remaining cells with the contents of the FillValue property based on the value ofMissingRule. If MissingRule is"fill" (default), then fill cells usingFillValue.You can set the FillValue property in the VariableImportOptions object of the variable being imported. For more information on setting the FillValue property, see setvaropts.
"duplicate" Duplicate the data in all cells.
"omitrow" Omit rows where merged cells occur.
"error" Display an error message and cancel the import operation.

Rule for cells merged across rows, specified as one of the values in this table.

Import Rule Behavior
"placetop" Place the data in the top cell and fill the remaining cells with the contents of the FillValue property based on the value ofMissingRule. If MissingRule is"fill" (default), then fill cells usingFillValue.You can set the FillValue property in the VariableImportOptions object of the variable being imported. For more information on setting the FillValue property, see setvaropts.
"placebottom" Place the data in the bottom cell and fill the remaining cells with the contents of the FillValue property based on the value ofMissingRule. If MissingRule is"fill" (default), then fill cells usingFillValue. You can set theFillValue property in the VariableImportOptions object of the variable being imported. For more information on setting theFillValue property, see setvaropts.
"duplicate" Duplicate the data in all cells.
"omitvar" Omit variables where merged cells occur.
"error" Display an error message and cancel the import operation.

Variables

expand all

Whether to read variable names, specified as a numeric or logical 1 (true) or 0 (false). If you do not specify this name-value argument, readtable automatically detects the presence of variable names after header rows.

Indicator Description
true Read variable names.If you also specify an import options object opts, readtable uses theVariableNamesRange or VariableNamesLine property of opts to identify the location of variable names.
false Do not read variable names. Create default variable names of the form"Var1",...,"VarN", where N is the number of variables.If you also specify an import options objectopts, readtable uses theVariableNames property of opts to create variable names.
Unspecified Automatically detect whether the region contains variable names.

If both ReadVariableNames and ReadRowNames aretrue, then readtable saves the name in the first column of the first row of the region to read as the first dimension name in the property,T.Properties.DimensionNames.

Rule for variable names, specified as one of these values:

Variable and row names do not have to be valid MATLAB identifiers. They can include any characters, including spaces and non-ASCII characters. Also, they can start with any character, not just letters.

Expected number of variables, specified as a nonnegative integer. If you do not specify this name-value argument, readtable automatically detects the number of variables.

Location of variable names, specified as a nonnegative integer.

If VariableNamesLine is 0, thenreadtable does not import variable names. Otherwise,readtable imports the variable names from the specified line.

If variable names exist, and both VariableNamesLine andReadVariableNames are unspecified, readtable detects which line contains variable names and imports them.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

Location of variable names, specified as a string scalar, character vector, or positive integer in one of these forms.

Ways to Specify VariableNamesRange Description
_"Cell"_Starting cell Specify the starting cell for the variable names as a string scalar or character vector containing a column letter and row number, using A1 notation.Example: "A5" identifies the cell at the intersection of columnA and row 5.
"Corner1:Corner2"Rectangular range Specify the range using the form"Corner1:Corner2", where_Corner1_ and Corner2 are two opposing corners that define the region for variable names.The range must span only one row.Example: "A5:K5"
_n_Number index Specify the row containing the variable names using a positive row index.Example: 5
"Row1:Row2"Row range Specify the range using the form"Row1:Row2", where_Row1_ and_Row2_ are the same row index.Variable names must be in a single row.Example: "5:5"
""Unspecified or empty Indicate that there are no variable names.Example: ""

Data Types: string | char | single | double

Location of variable names, specified as a nonnegative integer.

If VariableNamesRow is 0, thenreadtable does not import variable names. Otherwise,readtable imports the variable names from the specified row.

If you do not specify VariableNamesRow, andReadVariableNames is true (default), thenreadtable imports variable names. If both are unspecified, readtable detects if a row contains variable names to import.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

Names of XML nodes to read as table variables, specified as a string array or cell array of character vectors. If nested nodes have the same name,VariableNodeNames selects the nodes at the top level.

Example: ["XMLNodeName1","XMLNodeName2"]

XPath expressions for variables to read, specified as a string array or cell array of character vectors. You must specify VariableSelectors as valid XPath version 1.0 expressions.

Use XPath expressions to specify which elements of the XML input file to import. For example, suppose you want to import the XML filemyFile.xml, which has this structure:

12
1020
Selection Operation Syntax Example
Select every node whose name matches the node you want to select, regardless of its location in the document. Prefix the name with two forward slashes (//). Select every node namedvar.T = readtable("myFile.xml",VariableSelectors="//var")
Select the value of an attribute belonging to an element node. Prefix the attribute with an at sign (@). Select the value of the category attribute of thetable node.T = readtable("myFile.xml",VariableSelectors="//table/@category")
Select a specific node in a set of nodes. Provide the index of the node you want to select in square brackets ([]). Select the first var node of eachtable node.T = readtable("myFile.xml",VariableSelectors="//table/var[1]")
Specify precedence of operations. Add parentheses around the expression you want to evaluate first. Select the first var node of eachtable node.T = readtable("myFile.xml",VariableSelectors="//table/var[1]")
Select the first var node of the firsttable node.T = readtable("myFile.xml",VariableSelectors="(//table/var)[1]")

If you do not specify this name-value argument, readtable detects the location of variables.

Whether to concatenate data, specified as a numeric or logical 1 (true) or 0 (false). IfCollectOutput is true, thenreadtable concatenates consecutive output cells of the same fundamental MATLAB class into a single array.

Rows

expand all

Whether to read the first column as row names, specified as a numeric or logical1 (true) or 0 (false).

Value Description
true Read row names from the first column of the region to read. If you also specify an import options object opts,readtable uses the RowNamesColumn,RowNamesRange, or RowNamesSelector property ofopts to identify the location of row names.
false Read data from the first column of the region and do not create row names.

If both ReadVariableNames and ReadRowNames aretrue, then readtable saves the name in the first column of the first row of the region to read as the first dimension name in the property,T.Properties.DimensionNames.

Location of row names, specified as a nonnegative integer.

If RowNamesColumn is 0, thenreadtable does not import row names. Otherwise,readtable imports row names from the specified column.

If you do not specify RowNamesColumn, andReadRowNames is true, readtable imports the first column as the row names.

Location of row names, specified as a string scalar, character vector, or positive integer in one of these forms.

Ways to Specify RowNamesRange Description
_"Cell"_Starting cell Specify the starting cell for the row names as a string scalar or character vector containing a column letter and row number, using A1 notation.From the starting cell, readtable identifies a name for each row in the data.Example: "A5" identifies the cell at the intersection of columnA and row 5.
"Corner1:Corner2"Rectangular range Specify the range using the form"Corner1:Corner2", where_Corner1_ and Corner2 are two opposing corners that define the region for row names.The number of rows must match the number of data rows, and the range must span only one column.Example: "A5:A50"
"Column1:Column1"Column range Specify the range using the form"Column1:Column1", where_Column1_ and_Column2_ are the same column letter.Row names must be in a single column.Example: "A:A"
_n_Number index Specify the column containing the row names using a positive column index.Example: 5
""Unspecified or empty Indicate that there are no row names.Example: ""

Data Types: string | char | single | double

XPath expression for row names, specified as a string scalar or character vector. You must specify RowNamesSelector as a valid XPath version 1.0 expression.

If you do not specify this name-value argument, readtable does not import row names.

Example: "/RootNode/ChildNode"

XPath expression for selecting individual rows from a table, specified as a string scalar or character vector. You must specify RowSelector as a valid XPath version 1.0 expression.

If you do not specify this name-value argument, readtable detects the location of rows.

Example: "/RootNode/ChildNode"

XML nodes specifying rows, specified as a string scalar or character vector.

Data Types

expand all

Type for imported text data, specified as one of these values:

Type for imported date and time data, specified as one of the values in this table.

Value Resulting Data Type
"datetime" MATLABdatetime data type
"text" The data type depends on the value of TextType. If TextType is "char", then dates are a cell array of character vectors.If TextType is "string", then dates are a string array.
"exceldatenum" Excel serial date numbersThis value is valid only for spreadsheet files.A serial date number is a single number equal to the number of days from a given reference date. Excel serial date numbers use a different reference date than MATLAB serial date numbers. For more information on Excel dates, see Differences between the 1900 and the 1904 date system in Excel.

Type for imported duration data, specified as one of the values in this table.

Value Resulting Data Type
"duration" MATLABduration data type
"text" The data type depends on the value of TextType. If TextType is "char", then duration data is a cell array of character vectors. If TextType is "string", then duration data is a string array.

Type for imported hexadecimal data, specified as one of the values in this table.

Value Resulting Data Type
"auto" Detected data type; readtable determines the smallest integer type that can represent all variable values.
"text" Unaltered input text
"int8" 8-bit integer, signed
"int16" 16-bit integer, signed
"int32" 32-bit integer, signed
"int64" 64-bit integer, signed
"uint8" 8-bit integer, unsigned
"uint16" 16-bit integer, unsigned
"uint32" 32-bit integer, unsigned
"uint64" 64-bit integer, unsigned

The input file can represent hexadecimal values as text, using either0x or 0X as a prefix and the characters0-9, a-f, andA-F as digits. Uppercase and lowercase letters represent the same digits—for example, "0xf" and "0xF" both represent 15.

Type for imported binary data, specified as one of the values in this table.

Value Resulting Data Type
"auto" Detected data type; readtable determines the smallest integer type that can represent all variable values.
"text" Unaltered input text
"int8" 8-bit integer, signed
"int16" 16-bit integer, signed
"int32" 32-bit integer, signed
"int64" 64-bit integer, signed
"uint8" 8-bit integer, unsigned
"uint16" 16-bit integer, unsigned
"uint32" 32-bit integer, unsigned
"uint64" 64-bit integer, unsigned

The input file can represent binary values as text, using either 0b or 0B as a prefix and the characters 0 and1 as digits. For example, 0b11111111 represents 255.

Locale for reading dates, specified as a string scalar or character vector of the form_`xx`__ _`YY`_, where:

Use DateLocale to specify the locale in whichreadtable interprets month and day-of-week names and abbreviations.

This table lists some common values for the locale.

Locale Language Country
"de_DE" German Germany
"en_GB" English United Kingdom
"en_US" English United States
"es_ES" Spanish Spain
"fr_FR" French France
"it_IT" Italian Italy
"ja_JP" Japanese Japan
"ko_KR" Korean Korea
"nl_NL" Dutch Netherlands
"zh_CN" Chinese (simplified) China

If you specify the DateLocale argument in addition to an import options object opts, the argument value overrides the locale defined in the import options.

Whether to remove nonnumeric characters from a numeric variable, specified as a numeric or logical 1 (true) or 0 (false). For example, if TrimNonNumeric istrue, then readtable reads "$500/-" as 500.

Decimal separator character in numeric variables, specified as a string scalar or single-character character vector. The separator character distinguishes the integer part of a number from the decimal part. For example, if the separator is ",", thenreadtable imports the text "3,14159" as the number3.14159.

When converting to integer data types, readtable rounds numbers with a decimal part to the nearest integer. DecimalSeparator does not accept numeric digits as values.

Thousands grouping character in numeric variables, specified as a string scalar or character vector. The grouping character acts as a visual separator, grouping a number at every three place values. For example, if the grouping character is ",", then readtable imports the text "1,234,000" as1234000.

Exponent characters, specified as a string scalar or character vector. The default exponent characters are e, E, d, andD.

Example: "eE"

Data Cleaning

expand all

Placeholder text to treat as missing value, specified as a string array, character vector, or cell array of character vectors. readtable imports table elements corresponding to this placeholder text as the missing value associated with the data type of the element.

Example: "N/A"

Example: [".","NA","N/A"]

Rule for import errors, specified as one of the values in this table. An import error occurs when readtable is unable to convert a text element to the expected data type.

Import Error Rule Behavior
"fill" Replace the data where the error occurred with the contents of theFillValue property.You can set theFillValue property in the VariableImportOptions object of the variable being imported. For more information on setting theFillValue property, see setvaropts.
"error" Display an error message and cancel the import operation.
"omitrow" Omit rows where errors occur.
"omitvar" Omit variables where errors occur.

Rule for missing data, specified as one of the values in this table.

Missing Rule Behavior
"fill" Replace missing data with the contents of the FillValue property.You can set the FillValue property in theVariableImportOptions object of the variable being imported. For more information on setting the FillValue property, see setvaropts.
"error" Display an error message and cancel the import operation.
"omitrow" Omit rows that contain missing data.
"omitvar" Omit variables that contain missing data.

For text files, data is missing if an expected field in a row does not exist. Because missing fields cause subsequent elements of a row to shift fields, the missing fields are interpreted at the end of the row.

For spreadsheet files, data is missing if the expected field in a row has no data and the field type is blank or empty.

For XML files, data is missing if an expected node does not exist.

For Microsoft Word files, data is missing if an expected field in a row does not exist.

For HTML files, data is missing if an expected field in a row does not exist.

Rule for extra columns in the data, specified as one of the values in this table.readtable considers columns to be extra if a row has more columns than expected.

Extra Columns Rule Behavior
"addvars" To import extra columns, create new variables. If there areN extra columns, then import new variables as"ExtraVar1","ExtraVar2",...,"ExtraVarN".readtable imports the extra columns as text of data type char.
"ignore" Ignore the extra columns of data.
"wrap" Wrap the extra columns of data to new records. This action does not change the number of variables.
"error" Display an error message and cancel the import operation.

Returned value for empty numeric fields in delimited text files, specified asNaN or a numeric scalar.

Example: 0

Rule for empty lines in the data, specified as one of the values in this table.readtable considers a line to be empty if it contains only white-space characters.

Empty Line Rule Behavior
"skip" Skip the empty lines.
"read" Import the empty lines. readtable parses an empty line using the values specified in VariableWidths,VariableOptions, MissingRule, and other relevant arguments, such as Whitespace.
"error" Display an error message and cancel the import operation.

Rule for empty rows in the data, specified as one of the values in this table.

Empty Line Rule Behavior
"skip" Skip the empty rows.
"read" Import the empty rows. readtable parses an empty row using the values specified in VariableWidths,VariableOptions, MissingRule, and other relevant arguments, such as Whitespace.
"error" Display an error message and cancel the import operation.

Rule for empty columns in the data, specified as one of the values in this table.

Empty Column Rule Behavior
"skip" Skip the empty columns.
"read" Import the empty columns. readtable parses an empty column using the values specified in VariableWidths,VariableOptions, MissingRule, and other relevant arguments, such as Whitespace.
"error" Display an error message and cancel the import operation.

Rule for partial fields in the data, specified as one of the values in this table.readtable considers a field to be partially filled if it reaches the end of a line in fewer characters than the expected width. This name-value argument applies only to fields with fixed widths.

Partial Field Rule Behavior
"keep" Keep the partial field data and convert the text to the appropriate data type.If readtable is unable to interpret the partial data, a conversion error can occur.
"fill" Replace missing data with the contents of the FillValue property.You can set the FillValue property in theVariableImportOptions object of the variable being imported. For more information on setting the FillValue property, see setvaropts.
"omitrow" Omit rows that contain partial data.
"omitvar" Omit variables that contain partial data.
"wrap" Begin reading the next line of characters.
"error" Display an error message and cancel the import operation.

File Information

expand all

Type of file, specified as one of the values in this table.

Value File Type
"spreadsheet" Spreadsheet files
"text" Text files
"delimitedtext" Delimited text files
"fixedwidth" Fixed-width text files
"xml" XML files
"worddocument" Microsoft Word documents
"html" HTML files

Specify this name-value argument when [filename](readtable.html#mw%5Fb0826a11-ea6e-45f6-92ce-54960958581f) does not include the file extension or when its extension is not in this list:

Character encoding scheme associated with the file, specified as"system" or a standard character encoding scheme name. When you do not specify any encoding, the readtable function uses automatic character set detection to determine the encoding when reading the file.

If you specify the Encoding argument in addition to an import options object opts, the argument value overrides the encoding defined in the import options.

Whether to start an instance of Microsoft Excel for Windows when reading spreadsheet data, specified as a numeric or logical1 (true) or 0 (false).

Based on this argument, readtable supports different file formats and interactive features, such as formulas and macros.

Support If UseExcel istrue If UseExcel isfalse
Supported file formats .xls, .xlsb, .xlsm,.xlsx, .xltm, .xltx,.ods .xls, .xlsm, .xlsx,.xltm, .xltx
Support for interactive features, such as formulas and macros Yes No

UseExcel is not supported in noninteractive, automated environments.

Since R2022a

HTTP or HTTPS request options, specified as aweboptions object. The weboptions object determines how to import data when the specified filename is an internet URL containing the protocol type "http://" or "https://".

Text Parsing

expand all

Field delimiter character, specified as a string array, character vector, or cell array of character vectors. Specify Delimiter as any valid character such as a comma "," or a period ".".

This table lists some commonly used field delimiter characters.

Specifier Field Delimiter
",""comma" Comma
" ""space" Space
"\t""tab" Tab
";""semi" Semicolon
"|""bar" Vertical bar
Unspecified If you do not specify this name-value argument, readtable automatically detects the delimiter.

To treat multiple characters as a single delimiter, specify Delimiter as a string array or cell array of character vectors. If you want to treat an unknown number of consecutive delimiters as one, specifyConsecutiveDelimitersRule="join".

Delimiter is valid only with delimited text files and is not valid with fixed-width text files.

End-of-line characters, specified as a string array, character vector, or cell array of character vectors. Common end-of-line characters include the newline character ("\n") and the carriage return ("\r"). If you specify"\r\n", then readtable treats the combination of the two (\r\n) as end-of-line characters. If you specify {"\r\n", "\r", "\n"}, then \r, \n, and\r\n are all treated as end-of-line characters.

The default end-of-line sequence is \n, \r, or\r\n, depending on the contents of your file.

Characters to treat as white space, specified as a string scalar or character vector containing one or more characters.

This table shows how to represent special characters that you cannot enter using ordinary text.

Special Character Representation
Percent %%
Backslash \\
Alarm \a
Backspace \b
Form feed \f
New line \n
Carriage return \r
Horizontal tab \t
Vertical tab \v
Character whose Unicode® numeric value can be represented by the hexadecimal number,N \xN
Character whose Unicode numeric value can be represented by the octal number,N \N

Example: " _"

Example: "?!.,"

Comment indicators for text to ignore, specified as a string array, character vector, or cell array of character vectors.

For example, specify a character, such as "%", to ignore text following that character on the same line. Specify a string array, such as["/*","*/"], to ignore any text between sequences.

readtable checks for comments only at the start of each line, not within lines.

Example: ["/*","*/"]

Rule for leading delimiters in a delimited text file, specified as one of the values in this table.

Rule Behavior
"keep" Keep the delimiter.
"ignore" Ignore the delimiter.
"error" Display an error message and cancel the import operation.

Rule for trailing delimiters in a delimited text file, specified as one of the values in this table.

Rule Behavior
"keep" Keep the delimiter.
"ignore" Ignore the delimiter.
"error" Display an error message and cancel the import operation.

Rule for consecutive delimiters in a delimited text file, specified as one of the values in this table.

Rule Behavior
"split" Split the consecutive delimiters into multiple fields.
"join" Join the delimiters into one delimiter.
"error" Display an error message and cancel the import operation.

Whether to treat multiple delimiters as one, specified as a numeric or logical1 (true) or 0 (false).

Field widths of variables in a fixed-width text file, specified as a vector of positive integers. Each integer corresponds to the number of characters in a field that make up the variable.

Example: [10,7,4,26,7]

Column format of the file, specified as a string scalar or character vector containing one or more conversion specifiers, or "auto". The conversion specifiers are the same as the specifiers accepted by the textscan function.

Specifying the format can significantly improve the import speed for some large files.

If you do not specify Format, then readtable behaves as though you have used the results of the detectImportOptions function to import the data. For more information on the consequences of this behavior, see the R2020a Version History note.

If you set Format to "auto", then the variables created are double arrays, cell arrays of character vectors, ordatetime arrays, depending on the data. If an entire column is numeric, the resulting variable is of type double. If any element in a column is not numeric, the resulting variable is a cell array of character vectors or adatetime array if the values represent dates and times.

XML Parsing

expand all

Whether to import XML attributes as variables in the output table, specified as a numeric or logical 1 (true) or 0 (false). By default, readtable imports XML attributes as variables in the output table.

Suffix used to distinguish attributes from elements in the output table, specified as a string scalar or character vector. This argument specifies the suffixreadtable appends to all table variables that correspond to attributes in the input XML file. If you do not specify AttributeSuffix, thenreadtable appends the suffix "Attribute" to all variable names corresponding to attributes in the input XML file.

Example: "_att"

Rule for repeated XML nodes in a given row of a table, specified as one of the values in this table.

Rule Behavior
"addcol" Add columns for each repeated node in a variable to create a matrix in the associated variable. "addcol" does not create a separate variable in the table for the repeated node. For example:Input data 1 2 3 11 111 4 5 6 7 8 9
Output table Var1 Var2 Var3 _______________ ____ ____ 1 11 111 2 3 4 NaN NaN 5 6 7 NaN NaN 8 9
"ignore" Skip the repeated nodes.
"error" Display an error message and cancel the import operation.

Set of registered XML namespace prefixes, specified as an N-by-2 string array of prefixes and their associated URLs. readtable uses these prefixes when evaluating XPath expressions on an XML file.

You can use RegisteredNamespaces when you also evaluate an XPath expression specified by a selector name-value argument, such asVariableSelectors.

By default, readtable automatically detects namespace prefixes to use in XPath evaluation. To select an XML node with an undeclared namespace prefix, register a custom namespace URL for the namespace prefix using theRegisteredNamespaces name-value argument. For example, assign the prefixmyprefix to the URL https://www.mathworks.com in an XML file that does not contain a namespace prefix.

T = readtable(filename,VariableSelectors="/myprefix:Data", ... RegisteredNamespaces=["myprefix","https://www.mathworks.com"])

Variable Metadata

expand all

Location of variable units, specified as a nonnegative integer.

If VariableUnitsLine is 0, thenreadtable does not import variable units. Otherwise,readtable imports the variable units from the specified line.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

Location of variable units, specified as a string scalar, character vector, or positive integer in one of these forms.

Ways to Specify VariableUnitsRange Description
_"Cell"_Starting cell Specify the starting cell for the variable units as a string scalar or character vector containing a column letter and row number, using A1 notation.From the starting cell, readtable identifies a unit for each variable in the data.Example: "A5" identifies the cell at the intersection of columnA and row 5.
"Corner1:Corner2"Rectangular range Specify the range using the form"Corner1:Corner2", where_Corner1_ and Corner2 are two opposing corners that define the region for variable units.The range must span only one row.Example: "A5:K5"
_n_Number index Specify the row containing the variable units using a positive row index.Example: 5
"Row1:Row2"Row range Specify the range using the form"Row1:Row2" where_Row1_ and_Row2_ are the same row index.Variable units must be in a single row.Example: "5:5"
""Unspecified or empty Indicate that there are no variable units.Example: ""

Data Types: string | char | single | double

Location of variable units, specified as a nonnegative integer.

If VariableUnitsRow is 0, thenreadtable does not import variable units. Otherwise,readtable imports the variable units from the specified row.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

XPath expression for variable units, specified as a string scalar or character vector that readtable uses to select the table variable units. You must specifyVariableUnitsSelector as a valid XPath version 1.0 expression.

If you do not specify this name-value argument, readtable does not import variable units.

Example: "/RootNode/ChildNode"

Example: "//table[1]/units/"

Location of variable descriptions, specified as a nonnegative integer.

If VariableDescriptionsLine is 0, thenreadtable does not import variable descriptions. Otherwise,readtable imports the variable descriptions from the specified line.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

Location of variable descriptions, specified as a string scalar, character vector, or positive integer in one of these forms.

Ways to Specify VariableDescriptionsRange Description
_"Cell"_Starting cell Specify the starting cell for the variable descriptions as a string scalar or character vector containing a column letter and row number, using A1 notation.From the starting cell, readtable identifies a description for each variable in the data.Example: "A5" identifies the cell at the intersection of columnA and row 5.
"Corner1:Corner2"Rectangular range Specify the range using the form"Corner1:Corner2", where_Corner1_ and Corner2 are two opposing corners that define the region for variable descriptions.The range must span only one row.Example: "A5:K5"
"Row1:Row2"Row range Specify the range using the form"Row1:Row2" where_Row1_ and_Row2_ are the same row index.Variable descriptions must be in a single row.Example: "5:5"
_n_Number index Specify the row containing the descriptions using a positive row index.Example: 5
""Unspecified or empty Indicate that there are no variable descriptions.Example: ""

Data Types: string | char | single | double

Location of variable descriptions, specified as a nonnegative integer.

If VariableDescriptionsRow is 0, thenreadtable does not import variable descriptions. Otherwise,readtable imports the variable descriptions from the specified row.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

XPath expression for variable descriptions, specified as a string scalar or character vector. You must specify VariableDescriptionsSelector as a valid XPath version 1.0 expression.

If you do not specify this name-value argument, readtable does not import variable descriptions.

Example: "/RootNode/RowNode/@Name"

Example: "//table[1]/descriptions/*"

Output Arguments

collapse all

Output table, returned as a table. The table can store metadata such as descriptions, variable units, variable names, and row names. For more information, see the Properties section of table.

Extended Capabilities

expand all

Usage notes and limitations:

Version History

Introduced in R2013b

expand all

You can read data from compressed and archived files as a table.

When importing data from spreadsheets, you can specify how readtable imports cells that are merged across rows and columns by using theMergedCellRowRule and MergedCellColumnRule name-value arguments.

Read XML data as a table with optional name-value arguments to control the import behavior. For example, set ImportAttributes to false to ignore attribute nodes.

Additionally, you can read all supported file types from an internet URL. Specify the filename as a string that includes the protocol type http:// orhttps://.

The PreserveVariableNames argument is not recommended. UseVariableNamingRule instead. Specify one of the following values:

There are no plans to remove support for PreserveVariableNames.

By default, the readtable function uses the results of thedetectImportOptions function to import tabular data. In essence, these two function calls behave identically.

T = readtable(filename) T = readtable(filename,detectImportOptions(filename))

This change results in several behavior differences.

Description of Input New Default Behavior Previous Default Behavior
First row does not have text to assign as names of output table variables. Assign the names Var1,...,VarN as the names of output table variables. Convert the values in the first row of data values to the names of output table variables.
Multiple rows of text as header lines Ignore additional header lines.Import values in remaining rows as detected data types. Import additional header lines as text in the first rows of output table.Import values in remaining rows as text.
Empty fields Treat as missing values for detected data type. Treat as empty character vectors or strings.
Values in quotes Treat as detected data type. Treat as text.
Text that cannot be converted Treat as missing values for detected data type. Treat as text.
Nonnumeric character trails numeric character without delimiter between them. Treat characters as nonnumeric. Treat numeric and nonnumeric characters as though delimiter separated them.
Input text file has lines with different number of delimiters. Return output table with extra variables. Display error message.

To restore the previous behavior, set Format to"auto".

T = readtable(filename,Format="auto")

Text that is prefixed with the characters '0x' is now treated as hexadecimal data and text with the prefix '0b' is treated as binary data.

Additionally, when reading text data, readtable now uses automatic character set detection.

The default setting for UseExcel on Windows systems with Excel installed isfalse. To preserve the previous behavior, set UseExcel totrue.

T = readtable(filename,UseExcel=true)