strsplit - Split string or character vector at specified delimiter - MATLAB
Split string or character vector at specified delimiter
Syntax
Description
Note
split is recommended over strsplit because it provides greater flexibility and allows vectorization. For additional information, see Alternative Functionality.
C = strsplit(str) splits str at whitespace into C. A whitespace character is equivalent to any sequence in the set {' ','\f','\n','\r','\t','\v'}.
If str has consecutive whitespace characters, then strsplit treats them as one whitespace.
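A minimal sketch of the default whitespace behavior. The input text is made up for illustration, and sprintf is used only to embed a real tab character, because '\t' inside a character vector literal is not a tab:

```matlab
% Two spaces and one tab separate the words.
str = sprintf('alpha  beta\tgamma');

% With no delimiter argument, strsplit splits at whitespace and
% treats a run of whitespace characters as a single delimiter.
C = strsplit(str);

disp(C)
```

Under these assumptions C holds three character vectors, with no empty elements for the doubled space.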
C = strsplit(str,delimiter) splits str at the delimiters specified by delimiter.
If str has consecutive delimiters, with no other characters between them, then strsplit treats them as one delimiter. For example, both strsplit('Hello,world',',') and strsplit('Hello,,,world',',') return the same output.
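The equivalence of the two calls can be checked directly. This is only a sketch; the variable names A, B, and sameOutput are illustrative:

```matlab
% Single comma versus three consecutive commas.
A = strsplit('Hello,world',',');
B = strsplit('Hello,,,world',',');

% By default consecutive delimiters collapse into one,
% so both calls produce the same 1-by-2 cell array.
sameOutput = isequal(A,B);
disp(sameOutput)
```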
C = strsplit(str,delimiter,Name,Value) specifies additional delimiter options using one or more name-value pair arguments. For example, to treat consecutive delimiters as separate delimiters, you can specify 'CollapseDelimiters',false.
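As a brief sketch of the 'CollapseDelimiters' option, using a made-up input with three consecutive commas:

```matlab
str = 'Hello,,,world';

% Each comma is now treated as its own delimiter, so empty character
% vectors appear between the consecutive commas.
C = strsplit(str,',','CollapseDelimiters',false);

numParts = numel(C);              % number of pieces returned
emptyMask = cellfun(@isempty,C);  % true where a piece has no characters
disp(numParts)
disp(emptyMask)
```

Here numParts is 4, and the two middle elements of C are empty.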
[C,matches] = strsplit(___) additionally returns the array, matches. The matches output argument contains all occurrences of delimiters upon which strsplit splits str. You can use this syntax with any of the input arguments of the previous syntaxes.
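One way to see how C and matches relate is to interleave them and rebuild the input. The following is a sketch, not part of the strsplit interface; pieces and rebuilt are illustrative names:

```matlab
str = 'bacon, lettuce, and tomato';
[C,matches] = strsplit(str,{', and ',', '});

% matches has one element fewer than C, so pad it with an empty
% character vector before stacking the two rows.
pieces = [C; [matches, {''}]];

% Reading the 2-by-n cell array column by column alternates
% text, delimiter, text, delimiter, ...
rebuilt = [pieces{:}];
disp(isequal(rebuilt,str))
```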
Examples
Split Character Vector on Whitespace
str = 'The rain in Spain.';
C = strsplit(str)
C = 1x4 cell
    {'The'}    {'rain'}    {'in'}    {'Spain.'}
C is a cell array containing four character vectors.
Split Character Vector of Values on Specific Delimiter
Split a character vector that contains comma-separated values.
data = '1.21, 1.985, 1.955, 2.015, 1.885';
C = strsplit(data,', ')
C = 1x5 cell
    {'1.21'}    {'1.985'}    {'1.955'}    {'2.015'}    {'1.885'}
Split a character vector, data, which contains the units m/s with an arbitrary number of whitespace characters on either side of the text. The regular expression, \s*, matches any whitespace character appearing zero or more times.
data = '1.21m/s1.985m/s 1.955 m/s2.015 m/s 1.885m/s';
[C,matches] = strsplit(data,'\s*m/s\s*',...
    'DelimiterType','RegularExpression')
C = 1x6 cell
    {'1.21'}    {'1.985'}    {'1.955'}    {'2.015'}    {'1.885'}    {0x0 char}
matches = 1x5 cell
    {'m/s'}    {'m/s '}    {' m/s'}    {' m/s '}    {'m/s'}
In this case, the last character vector in C is empty. This empty character vector follows the last matched delimiter.
Split Path on File Separator
myPath = 'C:\work\matlab';
C = strsplit(myPath,'\\')
C = 1x3 cell
    {'C:'}    {'work'}    {'matlab'}
Split Character Vector with Multiple Delimiters
Split a character vector on ' ' and 'ain', treating multiple delimiters as one. Specify multiple delimiters in a cell array of character vectors.
str = 'The rain in Spain stays mainly in the plain.';
[C,matches] = strsplit(str,{' ','ain'},'CollapseDelimiters',true)
C = 1x11 cell
    {'The'}    {'r'}    {'in'}    {'Sp'}    {'stays'}    {'m'}    {'ly'}    {'in'}    {'the'}    {'pl'}    {'.'}
matches = 1x10 cell
    {' '}    {'ain '}    {' '}    {'ain '}    {' '}    {'ain'}    {' '}    {' '}    {' '}    {'ain'}
Split the same character vector on whitespace and on 'ain', using regular expressions and treating multiple delimiters separately.
[C,matches] = strsplit(str,{'\s','ain'},'CollapseDelimiters',...
    false,'DelimiterType','RegularExpression')
C = 1x13 cell
    {'The'}    {'r'}    {0x0 char}    {'in'}    {'Sp'}    {0x0 char}    {'stays'}    {'m'}    {'ly'}    {'in'}    {'the'}    {'pl'}    {'.'}
matches = 1x12 cell
    {' '}    {'ain'}    {' '}    {' '}    {'ain'}    {' '}    {' '}    {'ain'}    {' '}    {' '}    {' '}    {'ain'}
In this case, strsplit treats the two delimiters separately, so empty character vectors appear in output C between the consecutively matched delimiters.
Split Text with Multiple, Overlapping Delimiters
Split text on the character vectors ', ' and ', and '.
str = 'bacon, lettuce, and tomato';
[C,matches] = strsplit(str,{', ',', and '})
C = 1x3 cell
    {'bacon'}    {'lettuce'}    {'and tomato'}
matches = 1x2 cell
    {', '}    {', '}
Because the command lists ', ' first and ', and ' contains ', ', the strsplit function splits str on the first delimiter and never proceeds to the second delimiter.
If you reverse the order of delimiters, ', and ' takes priority.
str = 'bacon, lettuce, and tomato';
[C,matches] = strsplit(str,{', and ',', '})
C = 1x3 cell
    {'bacon'}    {'lettuce'}    {'tomato'}
matches = 1x2 cell
    {', '}    {', and '}
Input Arguments
str — Input text
character vector | string scalar
Input text, specified as a character vector or a string scalar.
Data Types: char | string
delimiter — Delimiting characters
character vector | 1-by-n cell array of character vectors | 1-by-n string array
Delimiting characters, specified as a character vector, a 1-by-n cell array of character vectors, or a 1-by-n string array. Text specified in delimiter does not appear in the output C.
Specify multiple delimiters in a cell array or a string array. The strsplit function splits str on the elements of delimiter. The order in which delimiters appear in delimiter does not matter unless multiple delimiters begin a match at the same character in str. In that case, strsplit splits on the first matching delimiter in delimiter.
delimiter can include the following escape sequences:
Escape sequence | Meaning |
---|---|
\\ | Backslash |
\0 | Null |
\a | Alarm |
\b | Backspace |
\f | Form feed |
\n | New line |
\r | Carriage return |
\t | Horizontal tab |
\v | Vertical tab |
Example: ','
Example: {'-',','}
Data Types: char | cell | string
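The escape sequences listed above are interpreted by strsplit itself, so they can be written as plain two-character text in delimiter. A minimal sketch, with input text built by sprintf purely for illustration:

```matlab
% Build text that contains a real tab and a real newline.
str = sprintf('a\tb\nc');

% The delimiters are given as the literal text '\t' and '\n';
% strsplit translates them to tab and newline before matching.
C = strsplit(str,{'\t','\n'});
disp(C)
```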
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Before R2021a, use commas to separate each name and value, and enclose Name in quotes.
Example: 'DelimiterType','RegularExpression' instructs strsplit to treat delimiter as a regular expression.
CollapseDelimiters — Multiple delimiter handling
1 (true) (default) | 0 (false)
Multiple delimiter handling, specified as the comma-separated pair consisting of 'CollapseDelimiters' and either true or false. If true, then consecutive delimiters in str are treated as one. If false, then consecutive delimiters are treated as separate delimiters, resulting in empty character vectors ('') between matched delimiters.
Example: 'CollapseDelimiters',true
DelimiterType — Delimiter type
'Simple' (default) | 'RegularExpression'
Delimiter type, specified as the comma-separated pair consisting of 'DelimiterType' and one of the following character vectors.
Value | Description |
---|---|
'Simple' | Except for escape sequences, strsplit treats delimiter as literal text. |
'RegularExpression' | strsplit treats delimiter as a regular expression. |
In both cases, delimiter can include escape sequences.
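A short sketch contrasting the two delimiter types. The input texts are invented for illustration:

```matlab
% 'Simple' (default): the period is literal text, not a regex wildcard.
Csimple = strsplit('a.b.c','.');

% 'RegularExpression': \d+ matches a run of one or more digits.
[Cregex,matches] = strsplit('a1b22c333d','\d+', ...
    'DelimiterType','RegularExpression');

disp(Csimple)
disp(Cregex)
disp(matches)
```

In the second call, matches records the digit runs that were actually matched, which differ in length from one split point to the next.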
Output Arguments
C — Parts of original text
cell array of character vectors | string array
Parts of the original character vector, returned as a cell array of character vectors or as a string array. C always contains one more element than matches contains. Therefore, if str begins with a delimiter, then the first element of C contains no characters. If str ends with a delimiter, then the last cell in C contains no characters.
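The leading and trailing cases can be sketched with a made-up input that both begins and ends with the delimiter:

```matlab
str = ',a,b,';
[C,matches] = strsplit(str,',');

% C has one more element than matches; its first and last
% elements are empty because str starts and ends with a comma.
firstIsEmpty = isempty(C{1});
lastIsEmpty  = isempty(C{end});
fprintf('%d parts, %d delimiters\n',numel(C),numel(matches));
```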
matches — Identified delimiters
cell array of character vectors | string array
Identified delimiters, returned as a cell array of character vectors or as a string array. matches always contains one less element than output C contains. If str is a character vector, then matches is a cell array of character vectors. If str is a string scalar, then matches is a string array.
Alternative Functionality
Update code that makes use of strsplit to use split instead. The default orientation for split is by column. For example:
Not Recommended | Recommended |
---|---|
str = strsplit("1 2 3") | str = split("1 2 3") |
str = 1×3 string array "1" "2" "3" | str = 3×1 string array "1" "2" "3" |
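When migrating, code that relies on the row orientation of strsplit may need a transpose. The following sketch assumes a MATLAB release that supports double-quoted strings; rowParts and colParts are illustrative names:

```matlab
rowParts = strsplit("1 2 3");   % 1-by-3 string array
colParts = split("1 2 3");      % 3-by-1 string array

% Transposing the split result recovers the strsplit orientation.
sameAfterTranspose = isequal(rowParts,colParts.');
disp(size(rowParts))
disp(size(colParts))
```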
Extended Capabilities
Thread-Based Environment
Run code in the background using MATLAB® backgroundPool or accelerate code with Parallel Computing Toolbox™ ThreadPool.
This function fully supports thread-based environments. For more information, see Run MATLAB Functions in Thread-Based Environment.
Version History
Introduced in R2013a