extract - Extract substrings from strings - MATLAB

Extract substrings from strings

Since R2020b

Syntax

newStr = extract(str,pat)

newStr = extract(str,pos)

Description

newStr = extract(str,pat) returns any substrings in str that match the pattern specified by pat.

If str is a string array or a cell array of character vectors, then the function extracts substrings from each element of str. If pat is an array, then the function matches against multiple patterns.
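As a minimal sketch of matching against multiple patterns, the following passes a string array as pat. The input text and variable names are illustrative assumptions, not part of the original example set.

```matlab
% Illustrative input (hypothetical); any text containing the alternatives works.
str = "red green blue";

% A string array used as pat: a substring matching either element is extracted.
pat = ["red","blue"];

newStr = extract(str,pat)
```

Here newStr should contain "red" and "blue", in the order in which they occur in str.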


newStr = extract(str,pos) returns the character in str at the position specified by pos.


Examples


Extract ZIP Codes from Addresses

Create a string array that contains addresses. Each address ends with a US ZIP code.

str = ["73 Beacon St., Boston, MA, 02116"; "1640 Riverside Dr., Hill Valley, CA, 92530"; "138 Main St., Cambridge, MA, 02138"]

str = 3x1 string
    "73 Beacon St., Boston, MA, 02116"
    "1640 Riverside Dr., Hill Valley, CA, 92530"
    "138 Main St., Cambridge, MA, 02138"

Create a pattern that matches any sequence of digits.

pat = digitsPattern

pat = pattern
  Matching:
    digitsPattern

Use it to extract all sequences of digits from the addresses.

newStr = extract(str,pat)

newStr = 3x2 string
    "73"      "02116"
    "1640"    "92530"
    "138"     "02138"

The digitsPattern pattern matches street numbers, apartment numbers, and ZIP codes. To match only ZIP codes, create a pattern that matches a sequence of digits at the end of an address.

pat = digitsPattern + textBoundary

pat = pattern
  Matching:
    digitsPattern + textBoundary

Extract the ZIP codes.

newStr = extract(str,pat)

newStr = 3x1 string
    "02116"
    "92530"
    "02138"

For a list of functions that create pattern objects, see pattern.

Extract Character at Numeric Position

Create a string.

str = "All's well that ends well"

str = "All's well that ends well"

Extract the first character in the string.

extract(str,1)

ans = "A"

Extract the last character.

extract(str,strlength(str))

ans = "l"

Input Arguments


str — Input text

string array | character vector | cell array of character vectors

Input text, specified as a string array, character vector, or cell array of character vectors.

pat — Search pattern

string array | character vector | cell array of character vectors | pattern array

Search pattern, specified as a string array, character vector, cell array of character vectors, or pattern array.

pos — Position

numeric array

Position, specified as a numeric array.

If str is a string array or cell array of character vectors, then pos can be a numeric scalar or a numeric array of the same size as str.
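The two forms of pos can be sketched as follows. The fruit names and variable names are illustrative assumptions chosen only to make the positions easy to check by eye.

```matlab
% Illustrative 3x1 string array (hypothetical data).
str = ["apple";"banana";"cherry"];

% Scalar pos: the same position is used for every element of str.
firstChars = extract(str,1)

% Array pos with the same size as str: one position per element.
mixedChars = extract(str,[1;2;3])
```

With this input, firstChars should hold "a", "b", and "c", and mixedChars should hold "a", "a", and "e" (position 1 of "apple", position 2 of "banana", position 3 of "cherry").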

Output Arguments


newStr — Output text

string array | cell array of character vectors

Output text, returned as a string array or cell array of character vectors.

If str is a string array, then newStr is also a string array. Otherwise, newStr is a cell array of character vectors.
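A short sketch of how the output type follows the input type. The text "Order 66" and the variable names are assumptions made for illustration.

```matlab
% String input: the result is a string array.
s = extract("Order 66",digitsPattern);

% Character vector input: the result is a cell array of character vectors.
c = extract('Order 66',digitsPattern);

tf = [isstring(s), iscellstr(c)]
```

Both elements of tf should be true. Checking the type this way can be useful before indexing the result, since string arrays use parentheses and cell arrays use braces to reach the text.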

Extended Capabilities

Tall Arrays

Calculate with arrays that have more rows than fit in memory.

The extract function supports tall arrays with the following usage notes and limitations:

For more information, see Tall Arrays.

Distributed Arrays

Partition large arrays across the combined memory of your cluster using Parallel Computing Toolbox™. (since R2024b)

This function fully supports distributed arrays. For more information, see Run MATLAB Functions with Distributed Arrays (Parallel Computing Toolbox).

Version History

Introduced in R2020b