extract - Extract substrings from strings - MATLAB
Extract substrings from strings
Since R2020b
Syntax
newStr = extract(str,pat)
newStr = extract(str,pos)
Description
newStr = extract(str,pat) returns any substrings in str that match the pattern specified by pat.
If str is a string array or a cell array of character vectors, then the function extracts substrings from each element of str. If pat is an array, then the function matches against multiple patterns.
newStr = extract(str,pos) returns the character in str at the position specified by pos.
Examples
Extract ZIP Codes from Addresses
Create a string array that contains addresses. Each address ends with a US ZIP code.
str = ["73 Beacon St., Boston, MA, 02116"; "1640 Riverside Dr., Hill Valley, CA, 92530"; "138 Main St., Cambridge, MA, 02138"]
str = 3x1 string
    "73 Beacon St., Boston, MA, 02116"
    "1640 Riverside Dr., Hill Valley, CA, 92530"
    "138 Main St., Cambridge, MA, 02138"
Create a pattern that matches any sequence of digits.
pat = digitsPattern
pat = pattern Matching:
digitsPattern
Use it to extract all sequences of digits from the addresses.
newStr = extract(str,pat)
newStr = 3x2 string
    "73"      "02116"
    "1640"    "92530"
    "138"     "02138"
The digitsPattern pattern matches street numbers, apartment numbers, and ZIP codes. To match only ZIP codes, create a pattern that matches a sequence of digits at the end of an address.
pat = digitsPattern + textBoundary
pat = pattern Matching:
digitsPattern + textBoundary
Extract the ZIP codes.
newStr = extract(str,pat)
newStr = 3x1 string
    "02116"
    "92530"
    "02138"
For a list of functions that create pattern objects, see pattern.
Extract Character at Numeric Position
Create a string.
str = "All's well that ends well"
str = "All's well that ends well"
Extract the first character in the string.
extract(str,1)
Extract the last character.
extract(str,strlength(str))
Input Arguments
str - Input text
string array | character vector | cell array of character vectors
Input text, specified as a string array, character vector, or cell array of character vectors.
pat - Search pattern
string array | character vector | cell array of character vectors | pattern array
Search pattern, specified as one of the following:
- String array
- Character vector
- Cell array of character vectors
- pattern array
pos - Position
numeric array
Position, specified as a numeric array.
If str is a string array or cell array of character vectors, then pos can be a numeric scalar or a numeric array of the same size as str.
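As a minimal sketch of the two forms of pos described above, the following uses a made-up string array (not from the original page) to contrast a scalar position with a position array of the same size as str.

```matlab
% Hypothetical input, chosen only for illustration.
str = ["apple";"berry";"cherry"];

% A scalar pos is applied to every element of str.
first = extract(str,1);      % first character of each element

% An array pos of the same size as str pairs pos(k) with str(k).
pos = [1;2;3];
mixed = extract(str,pos);    % 1st of "apple", 2nd of "berry", 3rd of "cherry"
```

The array form is useful when the position of interest differs per element, for example when it comes from an earlier search on each string.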
Output Arguments
newStr - Output text
string array | cell array of character vectors
Output text, returned as a string array or cell array of character vectors.
If str is a string array, then newStr is also a string array. Otherwise, newStr is a cell array of character vectors.
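The output-type rule above can be checked with a short sketch. The input text here is an illustrative assumption; the only difference between the two calls is whether the input is a string (double quotes) or a character vector (single quotes).

```matlab
% Same text, two input types (illustrative values only).
s = extract("abc123",digitsPattern);   % string input
c = extract('abc123',digitsPattern);   % character vector input

tfString = isstring(s);    % string input should give a string array
tfCell = iscellstr(c);     % other input should give a cell array of character vectors
```

If downstream code expects one type, convert explicitly, for example with string(c) or cellstr(s).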
Extended Capabilities
Tall Arrays
Calculate with arrays that have more rows than fit in memory.
The extract function supports tall arrays with the following usage notes and limitations:
- If pat is an array of pattern objects, the size of the first dimension of the array must be 1.
For more information, see Tall Arrays.
Distributed Arrays
Partition large arrays across the combined memory of your cluster using Parallel Computing Toolbox™. (since R2024b)
This function fully supports distributed arrays. For more information, see Run MATLAB Functions with Distributed Arrays (Parallel Computing Toolbox).
Version History
Introduced in R2020b