contains - Check if pattern is substring in documents - MATLAB
Check if pattern is substring in documents
Since R2022b
Syntax
tf = contains(documents,pat)
tf = contains(documents,pat,IgnoreCase=flag)
Description
tf = contains(documents,pat) returns 1 where any token of documents contains pat, and returns 0 otherwise.
tf = contains(documents,pat,IgnoreCase=flag) also specifies whether to ignore letter case when checking substrings.
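The effect of the IgnoreCase option can be sketched as follows. This is a minimal illustration, not taken from the original page: the sample sentences and the variable names tfSensitive and tfInsensitive are made up for the example, and it assumes that matching is case-sensitive when IgnoreCase is not specified.

```matlab
% Requires Text Analytics Toolbox (R2022b or later).
% Hypothetical sample data: note the capitalized "Example".
documents = tokenizedDocument([
    "An Example of a short sentence"
    "a second short sentence"]);

% Without IgnoreCase (assumed case-sensitive), "example" is not
% expected to match the token "Example".
tfSensitive = contains(documents,"example")

% With IgnoreCase=true, matches that differ only by letter case count.
tfInsensitive = contains(documents,"example",IgnoreCase=true)
```

Under that assumption, only the second call should report a match for the first document.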
Tip
Use the contains function to check substrings of the words in documents by specifying substrings or patterns. To check entire words and n-grams in documents, use the containsWords and containsNgrams functions, respectively.
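The difference between a substring check and a whole-word check can be sketched like this. The snippet is illustrative only: it assumes containsWords accepts the documents and a string of words in the same way contains accepts a substring, and the variable names tfSubstring and tfWord are hypothetical.

```matlab
% Requires Text Analytics Toolbox (R2022b or later).
documents = tokenizedDocument([
    "an example of a short sentence"
    "a second short sentence"]);

% Substring check: "ex" occurs inside the token "example".
tfSubstring = contains(documents,"ex")

% Whole-word check (assumed call form): no token is exactly "ex",
% so no document is expected to match.
tfWord = containsWords(documents,"ex")
```

If the goal is to find documents that mention a complete word, the whole-word form avoids accidental hits inside longer tokens.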
Examples
Create an array of tokenized documents.
documents = tokenizedDocument([
    "an example of a short sentence"
    "a second short sentence"]);
Check for matches of the string "short".
tf = contains(documents,"short")
tf = 2×1 logical array
   1
   1
Check for matches of the string "ex".
tf = contains(documents,"ex")
tf = 2×1 logical array
   1
   0
Input Arguments
pat - Substring or pattern to check, specified as one of these values:
- String array
- Character vector
- Cell array of character vectors
- pattern array
If pat contains multiple substrings or patterns, then the function returns 1 if any matching substrings or patterns appear in the corresponding document.
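As a rough sketch of how pat behaves with several substrings and with a pattern object, consider the following. The third sample sentence and the variable names tfAny and tfDigits are invented for illustration; digitsPattern is the standard MATLAB pattern function for runs of digits.

```matlab
% Requires Text Analytics Toolbox (R2022b or later).
documents = tokenizedDocument([
    "an example of a short sentence"
    "a second short sentence"
    "order 66 shipped"]);

% Multiple substrings: true where a document contains any of them.
tfAny = contains(documents,["ex" "sec"])

% Pattern object: true where some token contains a run of digits.
tfDigits = contains(documents,digitsPattern)
```

Here the first two documents should match one of the substrings ("example" and "second"), while only the third document contains digits.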
flag - Option to ignore case, specified as one of these values:
- 0 (false) – Treat candidate matches that differ only by letter case as nonmatching.
- 1 (true) – Treat candidate matches that differ only by letter case as matching.
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | logical
Version History
Introduced in R2022b