bertDocumentClassifier
BERT document classifier
Since R2023b
Description
A Bidirectional Encoder Representations from Transformers (BERT) model is a transformer neural network that can be fine-tuned for natural language processing tasks such as document classification and sentiment analysis. The network uses attention layers to analyze text in context and capture long-range dependencies between words.
Creation
Description
`mdl` = bertDocumentClassifier creates a bertDocumentClassifier object.
`mdl` = bertDocumentClassifier(net,tokenizer) creates a bertDocumentClassifier object from the specified BERT neural network and tokenizer.
`mdl` = bertDocumentClassifier(___,Name=Value) sets the ClassNames property and additional options using one or more name-value arguments.
Input Arguments
net — BERT neural network
dlnetwork object
BERT neural network, specified as a dlnetwork (Deep Learning Toolbox) object.
If you specify the net argument, then you must not specify the Model argument. The network must have three sequence input layers with input sizes of one. The output size of the network must match the number of classes in the ClassNames property. The inputs in net.InputNames(1), net.InputNames(2), and net.InputNames(3) must be the inputs for the input data, the attention mask, and the segments, respectively.
tokenizer — BERT tokenizer
bertTokenizer object
BERT tokenizer, specified as a bertTokenizer object.
If you specify the tokenizer argument, then you must not specify the Model argument.
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Example: bertDocumentClassifier(Model="tiny") creates a BERT-Tiny document classifier.
Model — BERT model
"base" (default) | "tiny" | "mini" | "small" | "large" | "multilingual"
BERT model, specified as one of these options:
- "base" — BERT-Base model. This option requires the Text Analytics Toolbox™ Model for BERT-Base Network support package. This model has 108.8 million learnable parameters.
- "tiny" — BERT-Tiny model. This option requires the Text Analytics Toolbox Model for BERT-Tiny Network support package. This model has 4.3 million learnable parameters.
- "mini" — BERT-Mini model. This option requires the Text Analytics Toolbox Model for BERT-Mini Network support package. This model has 11.1 million learnable parameters.
- "small" — BERT-Small model. This option requires the Text Analytics Toolbox Model for BERT-Small Network support package. This model has 28.5 million learnable parameters.
- "large" — BERT-Large model. This option requires the Text Analytics Toolbox Model for BERT-Large Network support package. This model has 334 million learnable parameters.
- "multilingual" — BERT-Base multilingual model. This option requires the Text Analytics Toolbox Model for BERT-Base Multilingual Cased Network support package. This model has 177.2 million learnable parameters.
If you specify the Model argument, then you must not specify the net and tokenizer arguments.
Tip
To customize the BERT neural network architecture, modify the dlnetwork (Deep Learning Toolbox) object output of the bert function and use the net and tokenizer arguments, as in the sketch that follows.
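For example, this sketch appends a classification head to a pretrained BERT-Tiny encoder and wraps the result in a classifier. It assumes a release in which addLayers and connectLayers accept dlnetwork objects; the layer names, the three-class head, and the class names are illustrative, and a complete head typically also pools the [CLS] token embedding before the fully connected layer.

% Minimal sketch, assuming the BERT-Tiny support package is installed.
[net,tokenizer] = bert(Model="tiny");

numClasses = 3;
outName = net.OutputNames{1};     % encoder output to attach the head to

head = [
    fullyConnectedLayer(numClasses,Name="fc_head")    % illustrative names
    softmaxLayer(Name="softmax_head")];

net = addLayers(net,head);
net = connectLayers(net,outName,"fc_head");

mdl = bertDocumentClassifier(net,tokenizer, ...
    ClassNames=["low" "medium" "high"]);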
DropoutProbability — Probability of dropping out input elements in dropout layers
0.1 (default) | scalar in the range [0, 1)
Probability of dropping out input elements in dropout layers, specified as a scalar in the range [0, 1).
When you train a neural network with dropout layers, the layer randomly sets input elements to zero using the dropout mask rand(size(X)) < p, where X is the layer input and p is the layer dropout probability. The layer then scales the remaining elements by 1/(1-p).
This operation helps to prevent the network from overfitting [2], [3]. A higher number results in the network dropping more elements during training. At prediction time, the output of the layer is equal to its input.
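As an illustration of this masking-and-rescaling rule (a standalone sketch, not code taken from the classifier):

% Drop elements with probability p, then rescale the survivors by
% 1/(1-p) so the expected value of the output matches the input.
p = 0.1;
X = rand(4,3,"single");        % stand-in for a layer input
mask = rand(size(X)) < p;      % true where an element is dropped
Y = X .* ~mask ./ (1 - p);     % zero dropped elements, rescale the rest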
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
AttentionDropoutProbability — Probability of dropping out input elements in attention layers
0.1 (default) | scalar in the range [0, 1)
Probability of dropping out input elements in attention layers, specified as a scalar in the range [0, 1).
When you train a neural network with attention layers, the layer randomly sets attention scores to zero using the dropout mask rand(size(scores)) < p, where scores contains the attention scores and p is the layer dropout probability. The layer then scales the remaining elements by 1/(1-p).
This operation helps to prevent the network from overfitting [2], [3]. A higher number results in the network dropping more elements during training. At prediction time, the output of the layer is equal to its input.
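For example, to raise both dropout probabilities when creating a classifier (the value 0.2 is an arbitrary illustrative choice, and Model="tiny" requires its support package):

mdl = bertDocumentClassifier(Model="tiny", ...
    DropoutProbability=0.2,AttentionDropoutProbability=0.2);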
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
Properties
Network — Pretrained BERT model
dlnetwork object
This property is read-only.
Pretrained BERT model, specified as a dlnetwork (Deep Learning Toolbox) object corresponding to the net or Model argument.
Tokenizer — BERT tokenizer
bertTokenizer object
This property is read-only.
BERT tokenizer, specified as a bertTokenizer object corresponding to the tokenizer or Model argument.
ClassNames — Class names
["positive" "negative"] (default) | categorical vector | string array | cell array of character vectors
Class names, specified as a categorical vector, a string array, or a cell array of character vectors.
If you specify the net argument, then the output size of the network must match the number of classes.
To set this property, use the corresponding name-value argument when you create the bertDocumentClassifier object. After you create a bertDocumentClassifier object, this property is read-only.
Data Types: string | cell | categorical
Object Functions
classify — Classify document using BERT document classifier
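A minimal sketch of the classify workflow, assuming string-array input is accepted; the example text and class names are illustrative, and a classifier that has not been fine-tuned produces arbitrary labels:

% Classify raw text with a (fine-tuned) classifier.
mdl = bertDocumentClassifier(ClassNames=["Leak" "Electrical Failure"]);
str = [
    "Coolant is pooling underneath the assembly."
    "The fuse blew as soon as the unit powered on."];
labels = classify(mdl,str)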
Examples
Create BERT Document Classifier for Training
Create a BERT document classifier that is ready for training.
mdl = bertDocumentClassifier
mdl = bertDocumentClassifier with properties:
Network: [1x1 dlnetwork]
Tokenizer: [1x1 bertTokenizer]
ClassNames: ["positive" "negative"]
View the class names.
mdl.ClassNames
ans = 1x2 string
"positive" "negative"
Specify BERT Document Classifier Classes
Create a BERT document classifier for the classes "Electrical Failure", "Leak", "Mechanical Failure", and "Software Failure".
classNames = ["Electrical Failure" "Leak" "Mechanical Failure" "Software Failure"];
mdl = bertDocumentClassifier(ClassNames=classNames)
mdl = bertDocumentClassifier with properties:
Network: [1x1 dlnetwork]
Tokenizer: [1x1 bertTokenizer]
ClassNames: ["Electrical Failure" "Leak" "Mechanical Failure" "Software Failure"]
View the class names.
mdl.ClassNames
ans = 1x4 string
"Electrical Failure" "Leak" "Mechanical Failure" "Software Failure"
References
[1] Devlin, Jacob, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. "BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding." Preprint, submitted May 24, 2019. https://doi.org/10.48550/arXiv.1810.04805.
[2] Srivastava, Nitish, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. "Dropout: A Simple Way to Prevent Neural Networks from Overfitting." The Journal of Machine Learning Research 15, no. 1 (January 1, 2014): 1929–58.
[3] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "ImageNet Classification with Deep Convolutional Neural Networks." Communications of the ACM 60, no. 6 (May 24, 2017): 84–90. https://doi.org/10.1145/3065386.
Version History
Introduced in R2023b