contains - Check if pattern is substring in documents - MATLAB

Check if pattern is substring in documents

Since R2022b

Syntax

tf = contains(documents,pat)

tf = contains(documents,pat,IgnoreCase=flag)

Description

tf = contains(documents,pat) returns 1 (true) for each document in which any token contains pat, and returns 0 (false) otherwise.

tf = contains(documents,pat,IgnoreCase=flag) also specifies whether to ignore letter case when checking substrings.

Tip

Use the contains function to check substrings of the words in documents by specifying substrings or patterns. To check entire words and n-grams in documents, use the containsWords and containsNgrams functions, respectively.
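The difference between substring and whole-word checks can be sketched as follows. This is a minimal illustration, not part of the original page: the sample sentences and the variable names tfSubstring and tfWord are assumptions chosen for the example, and it assumes that containsWords accepts a tokenizedDocument array and a string of words.

```matlab
% Two short sample documents (illustrative data).
documents = tokenizedDocument([
    "an example of a short sentence"
    "a second short sentence"]);

% Substring check: "sent" is part of the token "sentence" in both documents.
tfSubstring = contains(documents,"sent")

% Whole-word check: no token is exactly "sent", so no document matches.
tfWord = containsWords(documents,"sent")
```

Here contains should report a match for both documents, while containsWords should report none, because it compares entire tokens rather than parts of them.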

Examples

Create an array of tokenized documents.

documents = tokenizedDocument([
    "an example of a short sentence"
    "a second short sentence"]);

Check for matches of the string "short".

tf = contains(documents,"short")

tf = 2×1 logical array

   1
   1

Check for matches of the string "ex".

tf = contains(documents,"ex")

tf = 2×1 logical array

   1
   0
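The IgnoreCase option from the second syntax can be tried in the same way. The following is a minimal sketch: the mixed-case sample sentences and the variable names tfSensitive and tfInsensitive are assumptions made for illustration, and it assumes that matching is case sensitive when IgnoreCase is not specified.

```matlab
% Sample documents with mixed letter case (illustrative data).
documents = tokenizedDocument([
    "An Example of a Short sentence"
    "a second short sentence"]);

% Without IgnoreCase, "Short" only matches the capitalized token
% in the first document.
tfSensitive = contains(documents,"Short")

% With IgnoreCase=true, "Short" also matches "short" in the second document.
tfInsensitive = contains(documents,"Short",IgnoreCase=true)
```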

Input Arguments

pat: Substring or pattern to check, specified as a string array, character vector, cell array of character vectors, or pattern array.

If pat contains multiple substrings or patterns, then the function returns 1 if any matching substrings or patterns appear in the corresponding document.
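As a rough sketch of this behavior, the example below passes several substrings at once and then a pattern object. The third sample sentence, the placeholder substring "zzz", and the variable names tfAny and tfDigits are assumptions made for illustration; digitsPattern is the standard MATLAB pattern function for runs of digits.

```matlab
% Three sample documents (illustrative data).
documents = tokenizedDocument([
    "an example of a short sentence"
    "a second short sentence"
    "version 2 is out"]);

% Several substrings: a document matches if any one of them appears
% in any of its tokens. Only the first document contains "ex".
tfAny = contains(documents,["ex" "zzz"])

% A pattern object: matches documents with a token containing digits.
% Only the third document has such a token ("2").
tfDigits = contains(documents,digitsPattern)
```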

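flag: Option to ignore case, specified as a numeric or logical 1 (true) or 0 (false). If flag is 1 (true), then the function ignores letter case when checking substrings.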

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | logical

Version History

Introduced in R2022b