LUKE

PyTorch

Overview

The LUKE model was proposed in LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention by Ikuya Yamada, Akari Asai, Hiroyuki Shindo, Hideaki Takeda and Yuji Matsumoto. It is based on RoBERTa and adds entity embeddings as well as an entity-aware self-attention mechanism, which helps improve performance on various downstream tasks involving reasoning about entities such as named entity recognition, extractive and cloze-style question answering, entity typing, and relation classification.

The abstract from the paper is the following:

Entity representations are useful in natural language tasks involving entities. In this paper, we propose new pretrained contextualized representations of words and entities based on the bidirectional transformer. The proposed model treats words and entities in a given text as independent tokens, and outputs contextualized representations of them. Our model is trained using a new pretraining task based on the masked language model of BERT. The task involves predicting randomly masked words and entities in a large entity-annotated corpus retrieved from Wikipedia. We also propose an entity-aware self-attention mechanism that is an extension of the self-attention mechanism of the transformer, and considers the types of tokens (words or entities) when computing attention scores. The proposed model achieves impressive empirical performance on a wide range of entity-related tasks. In particular, it obtains state-of-the-art results on five well-known datasets: Open Entity (entity typing), TACRED (relation classification), CoNLL-2003 (named entity recognition), ReCoRD (cloze-style question answering), and SQuAD 1.1 (extractive question answering).
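The entity-aware self-attention mentioned in the abstract can be illustrated with a small NumPy sketch. It assumes, following the paper, that the key projection is shared while a separate query projection is used for each (attending token type, attended token type) pair. The function name, shapes, and random weights are illustrative only and are not the library's implementation.

```python
import numpy as np


def entity_aware_attention_scores(x, is_entity, q_mats, k_mat):
    """Return scaled attention scores of shape (n, n) for a single head.

    x: (n, d) token representations (words and entities together).
    is_entity: (n,) boolean array, True where the token is an entity.
    q_mats: dict mapping (query_is_entity, key_is_entity) -> (d, d_head) query matrix.
    k_mat: (d, d_head) key matrix shared by all token types.
    """
    keys = x @ k_mat  # (n, d_head)
    n, d_head = keys.shape
    scores = np.empty((n, n))
    for q_type in (False, True):
        for k_type in (False, True):
            # Queries computed with the matrix for this (query type, key type) pair.
            queries = x @ q_mats[(q_type, k_type)]
            block = queries @ keys.T / np.sqrt(d_head)
            # Keep only the entries whose query/key token types match this pair.
            mask = (is_entity == q_type)[:, None] & (is_entity == k_type)[None, :]
            scores[mask] = block[mask]
    return scores


rng = np.random.default_rng(0)
d, d_head = 8, 4
x = rng.normal(size=(5, d))  # 3 word tokens followed by 2 entity tokens
is_entity = np.array([False, False, False, True, True])
q_mats = {(a, b): rng.normal(size=(d, d_head)) for a in (False, True) for b in (False, True)}
k_mat = rng.normal(size=(d, d_head))

scores = entity_aware_attention_scores(x, is_entity, q_mats, k_mat)
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
print(weights.shape)
```

With use_entity_aware_attention disabled, the same computation would use a single query matrix for every pair, which reduces to ordinary self-attention.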

This model was contributed by ikuyamada and nielsr. The original code can be found here.

Usage tips

Usage example:

from transformers import LukeTokenizer, LukeModel, LukeForEntityPairClassification

model = LukeModel.from_pretrained("studio-ousia/luke-base")
tokenizer = LukeTokenizer.from_pretrained("studio-ousia/luke-base")

# Example 1: compute the contextualized entity representation for the mention "Beyoncé"
text = "Beyoncé lives in Los Angeles."
entity_spans = [(0, 7)]  # character-based span of "Beyoncé"
inputs = tokenizer(text, entity_spans=entity_spans, add_prefix_space=True, return_tensors="pt")
outputs = model(**inputs)
word_last_hidden_state = outputs.last_hidden_state
entity_last_hidden_state = outputs.entity_last_hidden_state

# Example 2: also pass entity names for the two mentions
entities = ["Beyoncé", "Los Angeles"]
entity_spans = [(0, 7), (17, 28)]  # character-based spans of "Beyoncé" and "Los Angeles"
inputs = tokenizer(text, entities=entities, entity_spans=entity_spans, add_prefix_space=True, return_tensors="pt")
outputs = model(**inputs)
word_last_hidden_state = outputs.last_hidden_state
entity_last_hidden_state = outputs.entity_last_hidden_state

# Example 3: classify the relation between two entities with the TACRED fine-tuned head
model = LukeForEntityPairClassification.from_pretrained("studio-ousia/luke-large-finetuned-tacred")
tokenizer = LukeTokenizer.from_pretrained("studio-ousia/luke-large-finetuned-tacred")
entity_spans = [(0, 7), (17, 28)]
inputs = tokenizer(text, entity_spans=entity_spans, return_tensors="pt")
outputs = model(**inputs)
logits = outputs.logits
predicted_class_idx = int(logits[0].argmax())
print("Predicted class:", model.config.id2label[predicted_class_idx])
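The entity_spans passed to the tokenizer are character-level (start, end) offsets into text, with an exclusive end. As a small sketch, they can be derived from mention strings with plain Python. Note that find_entity_spans is a hypothetical helper, not part of transformers, and it only locates the first occurrence of each mention.

```python
def find_entity_spans(text, mentions):
    """Return (start, end) character offsets for the first occurrence of each mention."""
    spans = []
    for mention in mentions:
        start = text.find(mention)
        if start == -1:
            raise ValueError(f"mention not found in text: {mention!r}")
        spans.append((start, start + len(mention)))
    return spans


text = "Beyoncé lives in Los Angeles."
entity_spans = find_entity_spans(text, ["Beyoncé", "Los Angeles"])
print(entity_spans)  # [(0, 7), (17, 28)]
```

Repeated mentions would need a different strategy, for example scanning with re.finditer and keeping every match.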

Resources

LukeConfig

class transformers.LukeConfig

( vocab_size = 50267 entity_vocab_size = 500000 hidden_size = 768 entity_emb_size = 256 num_hidden_layers = 12 num_attention_heads = 12 intermediate_size = 3072 hidden_act = 'gelu' hidden_dropout_prob = 0.1 attention_probs_dropout_prob = 0.1 max_position_embeddings = 512 type_vocab_size = 2 initializer_range = 0.02 layer_norm_eps = 1e-12 use_entity_aware_attention = True classifier_dropout = None pad_token_id = 1 bos_token_id = 0 eos_token_id = 2 **kwargs )

Parameters

This is the configuration class to store the configuration of a LukeModel. It is used to instantiate a LUKE model according to the specified arguments, defining the model architecture. Instantiating a configuration with the defaults will yield a configuration similar to that of the LUKE studio-ousia/luke-base architecture.

Configuration objects inherit from PretrainedConfig and can be used to control the model outputs. Read the documentation from PretrainedConfig for more information.

Examples:

from transformers import LukeConfig, LukeModel

configuration = LukeConfig()

model = LukeModel(configuration)

configuration = model.config

LukeTokenizer

class transformers.LukeTokenizer

( vocab_file merges_file entity_vocab_file task = None max_entity_length = 32 max_mention_length = 30 entity_token_1 = '<ent>' entity_token_2 = '<ent2>' entity_unk_token = '[UNK]' entity_pad_token = '[PAD]' entity_mask_token = '[MASK]' entity_mask2_token = '[MASK2]' errors = 'replace' bos_token = '<s>' eos_token = '</s>' sep_token = '</s>' cls_token = '<s>' unk_token = '<unk>' pad_token = '<pad>' mask_token = '<mask>' add_prefix_space = False **kwargs )

Parameters

Constructs a LUKE tokenizer, derived from the GPT-2 tokenizer, using byte-level Byte-Pair-Encoding.

This tokenizer has been trained to treat spaces like parts of the tokens (a bit like sentencepiece) so a word will be encoded differently whether it is at the beginning of the sentence (without space) or not:

from transformers import LukeTokenizer

tokenizer = LukeTokenizer.from_pretrained("studio-ousia/luke-base")
tokenizer("Hello world")["input_ids"]
# [0, 31414, 232, 2]

tokenizer(" Hello world")["input_ids"]
# [0, 20920, 232, 2]

You can get around that behavior by passing add_prefix_space=True when instantiating this tokenizer or when you call it on some text, but since the model was not pretrained this way, it might yield a decrease in performance.

When used with is_split_into_words=True, this tokenizer will add a space before each word (even the first one).

This tokenizer inherits from PreTrainedTokenizer, which contains most of the main methods. Users should refer to this superclass for more information regarding those methods. It also creates entity sequences, namely entity_ids, entity_attention_mask, entity_token_type_ids, and entity_position_ids, to be used by the LUKE model.
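To make the entity sequences more concrete, the sketch below builds one entity_position_ids-style row by hand. It assumes that each entity is described by the positions of the word tokens in its mention, padded with -1 up to max_mention_length (30 by default in the signature above). The helper name and the token indices are made up for illustration; the real tokenizer derives them from entity_spans.

```python
def build_entity_position_ids(token_start, token_end, max_mention_length=30):
    """Positions of the mention's tokens (end exclusive), truncated and padded with -1."""
    positions = list(range(token_start, token_end))[:max_mention_length]
    return positions + [-1] * (max_mention_length - len(positions))


# Hypothetical mention covering tokens 1, 2 and 3 of the word sequence.
row = build_entity_position_ids(1, 4, max_mention_length=6)
print(row)  # [1, 2, 3, -1, -1, -1]
```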

__call__

( text: typing.Union[str, typing.List[str]] text_pair: typing.Union[str, typing.List[str], NoneType] = None entity_spans: typing.Union[typing.List[typing.Tuple[int, int]], typing.List[typing.List[typing.Tuple[int, int]]], NoneType] = None entity_spans_pair: typing.Union[typing.List[typing.Tuple[int, int]], typing.List[typing.List[typing.Tuple[int, int]]], NoneType] = None entities: typing.Union[typing.List[str], typing.List[typing.List[str]], NoneType] = None entities_pair: typing.Union[typing.List[str], typing.List[typing.List[str]], NoneType] = None add_special_tokens: bool = True padding: typing.Union[bool, str, transformers.utils.generic.PaddingStrategy] = False truncation: typing.Union[bool, str, transformers.tokenization_utils_base.TruncationStrategy] = None max_length: typing.Optional[int] = None max_entity_length: typing.Optional[int] = None stride: int = 0 is_split_into_words: typing.Optional[bool] = False pad_to_multiple_of: typing.Optional[int] = None padding_side: typing.Optional[str] = None return_tensors: typing.Union[str, transformers.utils.generic.TensorType, NoneType] = None return_token_type_ids: typing.Optional[bool] = None return_attention_mask: typing.Optional[bool] = None return_overflowing_tokens: bool = False return_special_tokens_mask: bool = False return_offsets_mapping: bool = False return_length: bool = False verbose: bool = True **kwargs ) → BatchEncoding

Parameters

A BatchEncoding with the following fields:

Main method to tokenize and prepare for the model one or several sequence(s) or one or several pair(s) of sequences, depending on the task you want to prepare them for.

save_vocabulary

( save_directory: str filename_prefix: typing.Optional[str] = None )

LukeModel

class transformers.LukeModel

( config: LukeConfig add_pooling_layer: bool = True )

Parameters

The bare LUKE model transformer outputting raw hidden-states for both word tokens and entities without any specific head on top.

This model inherits from PreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).

This model is also a PyTorch torch.nn.Module subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.

forward

( input_ids: typing.Optional[torch.LongTensor] = None attention_mask: typing.Optional[torch.FloatTensor] = None token_type_ids: typing.Optional[torch.LongTensor] = None position_ids: typing.Optional[torch.LongTensor] = None entity_ids: typing.Optional[torch.LongTensor] = None entity_attention_mask: typing.Optional[torch.FloatTensor] = None entity_token_type_ids: typing.Optional[torch.LongTensor] = None entity_position_ids: typing.Optional[torch.LongTensor] = None head_mask: typing.Optional[torch.FloatTensor] = None inputs_embeds: typing.Optional[torch.FloatTensor] = None output_attentions: typing.Optional[bool] = None output_hidden_states: typing.Optional[bool] = None return_dict: typing.Optional[bool] = None ) → transformers.models.luke.modeling_luke.BaseLukeModelOutputWithPooling or tuple(torch.FloatTensor)

Parameters

Returns

transformers.models.luke.modeling_luke.BaseLukeModelOutputWithPooling or tuple(torch.FloatTensor)

A transformers.models.luke.modeling_luke.BaseLukeModelOutputWithPooling or a tuple of torch.FloatTensor (if return_dict=False is passed or when config.return_dict=False) comprising various elements depending on the configuration (LukeConfig) and inputs.

The LukeModel forward method overrides the __call__ special method.

Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this function, since the former takes care of running the pre- and post-processing steps while the latter silently ignores them.

Examples:

from transformers import AutoTokenizer, LukeModel

tokenizer = AutoTokenizer.from_pretrained("studio-ousia/luke-base")
model = LukeModel.from_pretrained("studio-ousia/luke-base")

# Compute the contextualized entity representation for the mention "Beyoncé"
text = "Beyoncé lives in Los Angeles."
entity_spans = [(0, 7)]  # character-based span of "Beyoncé"

encoding = tokenizer(text, entity_spans=entity_spans, add_prefix_space=True, return_tensors="pt")
outputs = model(**encoding)
word_last_hidden_state = outputs.last_hidden_state
entity_last_hidden_state = outputs.entity_last_hidden_state

# Also pass entity names for the two mentions
text = "Beyoncé lives in Los Angeles."
entities = ["Beyoncé", "Los Angeles"]
entity_spans = [(0, 7), (17, 28)]  # character-based spans of "Beyoncé" and "Los Angeles"

encoding = tokenizer(
    text, entities=entities, entity_spans=entity_spans, add_prefix_space=True, return_tensors="pt"
)
outputs = model(**encoding)
word_last_hidden_state = outputs.last_hidden_state
entity_last_hidden_state = outputs.entity_last_hidden_state

LukeForMaskedLM

class transformers.LukeForMaskedLM

( config )

Parameters

The LUKE model with a language modeling head and entity prediction head on top for masked language modeling and masked entity prediction.

This model inherits from PreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).

This model is also a PyTorch torch.nn.Module subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.

forward

( input_ids: typing.Optional[torch.LongTensor] = None attention_mask: typing.Optional[torch.FloatTensor] = None token_type_ids: typing.Optional[torch.LongTensor] = None position_ids: typing.Optional[torch.LongTensor] = None entity_ids: typing.Optional[torch.LongTensor] = None entity_attention_mask: typing.Optional[torch.LongTensor] = None entity_token_type_ids: typing.Optional[torch.LongTensor] = None entity_position_ids: typing.Optional[torch.LongTensor] = None labels: typing.Optional[torch.LongTensor] = None entity_labels: typing.Optional[torch.LongTensor] = None head_mask: typing.Optional[torch.FloatTensor] = None inputs_embeds: typing.Optional[torch.FloatTensor] = None output_attentions: typing.Optional[bool] = None output_hidden_states: typing.Optional[bool] = None return_dict: typing.Optional[bool] = None ) → transformers.models.luke.modeling_luke.LukeMaskedLMOutput or tuple(torch.FloatTensor)

Parameters

Returns

transformers.models.luke.modeling_luke.LukeMaskedLMOutput or tuple(torch.FloatTensor)

A transformers.models.luke.modeling_luke.LukeMaskedLMOutput or a tuple of torch.FloatTensor (if return_dict=False is passed or when config.return_dict=False) comprising various elements depending on the configuration (LukeConfig) and inputs.

The LukeForMaskedLM forward method overrides the __call__ special method.

Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this function, since the former takes care of running the pre- and post-processing steps while the latter silently ignores them.

LukeForEntityClassification

class transformers.LukeForEntityClassification

( config )

Parameters

The LUKE model with a classification head on top (a linear layer on top of the hidden state of the first entity token) for entity classification tasks, such as Open Entity.

This model inherits from PreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).

This model is also a PyTorch torch.nn.Module subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.

forward

( input_ids: typing.Optional[torch.LongTensor] = None attention_mask: typing.Optional[torch.FloatTensor] = None token_type_ids: typing.Optional[torch.LongTensor] = None position_ids: typing.Optional[torch.LongTensor] = None entity_ids: typing.Optional[torch.LongTensor] = None entity_attention_mask: typing.Optional[torch.FloatTensor] = None entity_token_type_ids: typing.Optional[torch.LongTensor] = None entity_position_ids: typing.Optional[torch.LongTensor] = None head_mask: typing.Optional[torch.FloatTensor] = None inputs_embeds: typing.Optional[torch.FloatTensor] = None labels: typing.Optional[torch.FloatTensor] = None output_attentions: typing.Optional[bool] = None output_hidden_states: typing.Optional[bool] = None return_dict: typing.Optional[bool] = None ) → transformers.models.luke.modeling_luke.EntityClassificationOutput or tuple(torch.FloatTensor)

Parameters

Returns

transformers.models.luke.modeling_luke.EntityClassificationOutput or tuple(torch.FloatTensor)

A transformers.models.luke.modeling_luke.EntityClassificationOutput or a tuple of torch.FloatTensor (if return_dict=False is passed or when config.return_dict=False) comprising various elements depending on the configuration (LukeConfig) and inputs.

The LukeForEntityClassification forward method overrides the __call__ special method.

Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this function, since the former takes care of running the pre- and post-processing steps while the latter silently ignores them.

Examples:

from transformers import AutoTokenizer, LukeForEntityClassification

tokenizer = AutoTokenizer.from_pretrained("studio-ousia/luke-large-finetuned-open-entity")
model = LukeForEntityClassification.from_pretrained("studio-ousia/luke-large-finetuned-open-entity")

text = "Beyoncé lives in Los Angeles."
entity_spans = [(0, 7)]
inputs = tokenizer(text, entity_spans=entity_spans, return_tensors="pt")
outputs = model(**inputs)
logits = outputs.logits
predicted_class_idx = logits.argmax(-1).item()
print("Predicted class:", model.config.id2label[predicted_class_idx])
# Predicted class: person

LukeForEntityPairClassification

class transformers.LukeForEntityPairClassification

( config )

Parameters

The LUKE model with a classification head on top (a linear layer on top of the hidden states of the two entity tokens) for entity pair classification tasks, such as TACRED.

This model inherits from PreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).

This model is also a PyTorch torch.nn.Module subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.

forward

( input_ids: typing.Optional[torch.LongTensor] = None attention_mask: typing.Optional[torch.FloatTensor] = None token_type_ids: typing.Optional[torch.LongTensor] = None position_ids: typing.Optional[torch.LongTensor] = None entity_ids: typing.Optional[torch.LongTensor] = None entity_attention_mask: typing.Optional[torch.FloatTensor] = None entity_token_type_ids: typing.Optional[torch.LongTensor] = None entity_position_ids: typing.Optional[torch.LongTensor] = None head_mask: typing.Optional[torch.FloatTensor] = None inputs_embeds: typing.Optional[torch.FloatTensor] = None labels: typing.Optional[torch.LongTensor] = None output_attentions: typing.Optional[bool] = None output_hidden_states: typing.Optional[bool] = None return_dict: typing.Optional[bool] = None ) → transformers.models.luke.modeling_luke.EntityPairClassificationOutput or tuple(torch.FloatTensor)

Parameters

Returns

transformers.models.luke.modeling_luke.EntityPairClassificationOutput or tuple(torch.FloatTensor)

A transformers.models.luke.modeling_luke.EntityPairClassificationOutput or a tuple of torch.FloatTensor (if return_dict=False is passed or when config.return_dict=False) comprising various elements depending on the configuration (LukeConfig) and inputs.

The LukeForEntityPairClassification forward method overrides the __call__ special method.

Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this function, since the former takes care of running the pre- and post-processing steps while the latter silently ignores them.

Examples:

from transformers import AutoTokenizer, LukeForEntityPairClassification

tokenizer = AutoTokenizer.from_pretrained("studio-ousia/luke-large-finetuned-tacred")
model = LukeForEntityPairClassification.from_pretrained("studio-ousia/luke-large-finetuned-tacred")

text = "Beyoncé lives in Los Angeles."
entity_spans = [
    (0, 7),
    (17, 28),
]
inputs = tokenizer(text, entity_spans=entity_spans, return_tensors="pt")
outputs = model(**inputs)
logits = outputs.logits
predicted_class_idx = logits.argmax(-1).item()
print("Predicted class:", model.config.id2label[predicted_class_idx])
# Predicted class: per:cities_of_residence

LukeForEntitySpanClassification

class transformers.LukeForEntitySpanClassification

( config )

Parameters

The LUKE model with a span classification head on top (a linear layer on top of the hidden states output) for tasks such as named entity recognition.

This model inherits from PreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).

This model is also a PyTorch torch.nn.Module subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.

forward

( input_ids: typing.Optional[torch.LongTensor] = None attention_mask: typing.Optional[torch.FloatTensor] = None token_type_ids: typing.Optional[torch.LongTensor] = None position_ids: typing.Optional[torch.LongTensor] = None entity_ids: typing.Optional[torch.LongTensor] = None entity_attention_mask: typing.Optional[torch.LongTensor] = None entity_token_type_ids: typing.Optional[torch.LongTensor] = None entity_position_ids: typing.Optional[torch.LongTensor] = None entity_start_positions: typing.Optional[torch.LongTensor] = None entity_end_positions: typing.Optional[torch.LongTensor] = None head_mask: typing.Optional[torch.FloatTensor] = None inputs_embeds: typing.Optional[torch.FloatTensor] = None labels: typing.Optional[torch.LongTensor] = None output_attentions: typing.Optional[bool] = None output_hidden_states: typing.Optional[bool] = None return_dict: typing.Optional[bool] = None ) → transformers.models.luke.modeling_luke.EntitySpanClassificationOutput or tuple(torch.FloatTensor)

Parameters

Returns

transformers.models.luke.modeling_luke.EntitySpanClassificationOutput or tuple(torch.FloatTensor)

A transformers.models.luke.modeling_luke.EntitySpanClassificationOutput or a tuple of torch.FloatTensor (if return_dict=False is passed or when config.return_dict=False) comprising various elements depending on the configuration (LukeConfig) and inputs.

The LukeForEntitySpanClassification forward method overrides the __call__ special method.

Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this function, since the former takes care of running the pre- and post-processing steps while the latter silently ignores them.

Examples:

from transformers import AutoTokenizer, LukeForEntitySpanClassification

tokenizer = AutoTokenizer.from_pretrained("studio-ousia/luke-large-finetuned-conll-2003")
model = LukeForEntitySpanClassification.from_pretrained("studio-ousia/luke-large-finetuned-conll-2003")

text = "Beyoncé lives in Los Angeles"

# List all possible entity spans in the text
word_start_positions = [0, 8, 14, 17, 21]  # character-based start positions of the words
word_end_positions = [7, 13, 16, 20, 28]  # character-based end positions of the words
entity_spans = []
for i, start_pos in enumerate(word_start_positions):
    for end_pos in word_end_positions[i:]:
        entity_spans.append((start_pos, end_pos))

inputs = tokenizer(text, entity_spans=entity_spans, return_tensors="pt")
outputs = model(**inputs)
logits = outputs.logits
predicted_class_indices = logits.argmax(-1).squeeze().tolist()
for span, predicted_class_idx in zip(entity_spans, predicted_class_indices):
    if predicted_class_idx != 0:
        print(text[span[0] : span[1]], model.config.id2label[predicted_class_idx])
# Beyoncé PER
# Los Angeles LOC
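The word boundaries in the example above are hard-coded. A minimal sketch of deriving them from whitespace and enumerating every candidate span follows. Here enumerate_word_spans and its max_words cap are hypothetical names introduced for illustration, and whitespace splitting is an assumption that will not suit every language or tokenization.

```python
import re


def enumerate_word_spans(text, max_words=None):
    """Return character spans covering every run of consecutive whitespace-separated words."""
    words = [(m.start(), m.end()) for m in re.finditer(r"\S+", text)]
    spans = []
    for i, (start, _) in enumerate(words):
        # Optionally cap the number of words per span to limit the candidate count.
        last = len(words) if max_words is None else min(len(words), i + max_words)
        for _, end in words[i:last]:
            spans.append((start, end))
    return spans


text = "Beyoncé lives in Los Angeles"
entity_spans = enumerate_word_spans(text)
print(len(entity_spans))  # 15
```

The number of candidates grows quadratically with the number of words, so a cap such as max_words is one way to keep the entity sequence short for long inputs.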

LukeForSequenceClassification

class transformers.LukeForSequenceClassification

( config )

Parameters

The LUKE Model transformer with a sequence classification/regression head on top (a linear layer on top of the pooled output) e.g. for GLUE tasks.

This model inherits from PreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).

This model is also a PyTorch torch.nn.Module subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.

forward

( input_ids: typing.Optional[torch.LongTensor] = None attention_mask: typing.Optional[torch.FloatTensor] = None token_type_ids: typing.Optional[torch.LongTensor] = None position_ids: typing.Optional[torch.LongTensor] = None entity_ids: typing.Optional[torch.LongTensor] = None entity_attention_mask: typing.Optional[torch.FloatTensor] = None entity_token_type_ids: typing.Optional[torch.LongTensor] = None entity_position_ids: typing.Optional[torch.LongTensor] = None head_mask: typing.Optional[torch.FloatTensor] = None inputs_embeds: typing.Optional[torch.FloatTensor] = None labels: typing.Optional[torch.FloatTensor] = None output_attentions: typing.Optional[bool] = None output_hidden_states: typing.Optional[bool] = None return_dict: typing.Optional[bool] = None ) → transformers.models.luke.modeling_luke.LukeSequenceClassifierOutput or tuple(torch.FloatTensor)

Parameters

Returns

transformers.models.luke.modeling_luke.LukeSequenceClassifierOutput or tuple(torch.FloatTensor)

A transformers.models.luke.modeling_luke.LukeSequenceClassifierOutput or a tuple of torch.FloatTensor (if return_dict=False is passed or when config.return_dict=False) comprising various elements depending on the configuration (LukeConfig) and inputs.

The LukeForSequenceClassification forward method overrides the __call__ special method.

Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this function, since the former takes care of running the pre- and post-processing steps while the latter silently ignores them.

Example of single-label classification:

import torch
from transformers import AutoTokenizer, LukeForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("studio-ousia/luke-base")
model = LukeForSequenceClassification.from_pretrained("studio-ousia/luke-base")

inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

predicted_class_id = logits.argmax().item()

# To train on num_labels classes, pass num_labels to from_pretrained
num_labels = len(model.config.id2label)
model = LukeForSequenceClassification.from_pretrained("studio-ousia/luke-base", num_labels=num_labels)

labels = torch.tensor([1])
loss = model(**inputs, labels=labels).loss

Example of multi-label classification:

import torch
from transformers import AutoTokenizer, LukeForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("studio-ousia/luke-base")
model = LukeForSequenceClassification.from_pretrained("studio-ousia/luke-base", problem_type="multi_label_classification")

inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Keep every label whose sigmoid probability exceeds 0.5
predicted_class_ids = torch.arange(0, logits.shape[-1])[torch.sigmoid(logits).squeeze(dim=0) > 0.5]

# To train on num_labels classes, pass num_labels to from_pretrained
num_labels = len(model.config.id2label)
model = LukeForSequenceClassification.from_pretrained(
    "studio-ousia/luke-base", num_labels=num_labels, problem_type="multi_label_classification"
)

# Build a multi-hot float label vector from the predicted class ids
labels = torch.sum(
    torch.nn.functional.one_hot(predicted_class_ids[None, :].clone(), num_classes=num_labels), dim=1
).to(torch.float)
loss = model(**inputs, labels=labels).loss
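The thresholding step in the multi-label example can be reproduced without a model. The sketch below uses made-up logits and plain Python; the function name is hypothetical. Since sigmoid(x) > 0.5 exactly when x > 0, the selected labels are those with positive logits.

```python
import math


def multi_hot_from_logits(logits, threshold=0.5):
    """Return (predicted label ids, multi-hot float vector) from raw logits."""
    probs = [1.0 / (1.0 + math.exp(-x)) for x in logits]
    ids = [i for i, p in enumerate(probs) if p > threshold]
    multi_hot = [1.0 if p > threshold else 0.0 for p in probs]
    return ids, multi_hot


ids, labels = multi_hot_from_logits([2.0, -1.0, 0.3])  # made-up logits for 3 labels
print(ids, labels)  # [0, 2] [1.0, 0.0, 1.0]
```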

LukeForMultipleChoice

class transformers.LukeForMultipleChoice

( config )

Parameters

The LUKE Model with a multiple choice classification head on top (a linear layer on top of the pooled output and a softmax) e.g. for RocStories/SWAG tasks.

This model inherits from PreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).

This model is also a PyTorch torch.nn.Module subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.

forward

( input_ids: typing.Optional[torch.LongTensor] = None attention_mask: typing.Optional[torch.FloatTensor] = None token_type_ids: typing.Optional[torch.LongTensor] = None position_ids: typing.Optional[torch.LongTensor] = None entity_ids: typing.Optional[torch.LongTensor] = None entity_attention_mask: typing.Optional[torch.FloatTensor] = None entity_token_type_ids: typing.Optional[torch.LongTensor] = None entity_position_ids: typing.Optional[torch.LongTensor] = None head_mask: typing.Optional[torch.FloatTensor] = None inputs_embeds: typing.Optional[torch.FloatTensor] = None labels: typing.Optional[torch.FloatTensor] = None output_attentions: typing.Optional[bool] = None output_hidden_states: typing.Optional[bool] = None return_dict: typing.Optional[bool] = None ) → transformers.models.luke.modeling_luke.LukeMultipleChoiceModelOutput or tuple(torch.FloatTensor)

Parameters

Returns

transformers.models.luke.modeling_luke.LukeMultipleChoiceModelOutput or tuple(torch.FloatTensor)

A transformers.models.luke.modeling_luke.LukeMultipleChoiceModelOutput or a tuple of torch.FloatTensor (if return_dict=False is passed or when config.return_dict=False) comprising various elements depending on the configuration (LukeConfig) and inputs.

The LukeForMultipleChoice forward method overrides the __call__ special method.

Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this function, since the former takes care of running the pre- and post-processing steps while the latter silently ignores them.

Example:

import torch
from transformers import AutoTokenizer, LukeForMultipleChoice

tokenizer = AutoTokenizer.from_pretrained("studio-ousia/luke-base")
model = LukeForMultipleChoice.from_pretrained("studio-ousia/luke-base")

prompt = "In Italy, pizza served in formal settings, such as at a restaurant, is presented unsliced."
choice0 = "It is eaten with a fork and a knife."
choice1 = "It is eaten while held in the hand."
labels = torch.tensor(0).unsqueeze(0)  # choice0 is the correct answer; batch size 1

encoding = tokenizer([prompt, prompt], [choice0, choice1], return_tensors="pt", padding=True)
# Add a batch dimension so each tensor has shape (1, num_choices, sequence_length)
outputs = model(**{k: v.unsqueeze(0) for k, v in encoding.items()}, labels=labels)

loss = outputs.loss
logits = outputs.logits
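The unsqueeze(0) in the example adds a batch dimension so that the inputs have shape (batch_size, num_choices, sequence_length). The following is a shape-only sketch in NumPy. It assumes the usual Transformers multiple-choice pattern of flattening the choices into the batch before encoding and reshaping one score per choice afterwards; the token ids and scores are placeholders, not model output.

```python
import numpy as np

num_choices, seq_len = 2, 6
encoded = np.arange(num_choices * seq_len).reshape(num_choices, seq_len)  # placeholder ids

batched = encoded[None, ...]  # (1, num_choices, seq_len), like unsqueeze(0)
flat = batched.reshape(-1, batched.shape[-1])  # (batch_size * num_choices, seq_len)

per_sequence_score = flat.sum(axis=-1).astype(float)  # stand-in for one logit per sequence
logits = per_sequence_score.reshape(-1, num_choices)  # (batch_size, num_choices)
predicted_choice = logits.argmax(axis=-1)
print(batched.shape, flat.shape, logits.shape)  # (1, 2, 6) (2, 6) (1, 2)
```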

LukeForTokenClassification

class transformers.LukeForTokenClassification

( config )

Parameters

The LUKE Model with a token classification head on top (a linear layer on top of the hidden-states output). For named-entity recognition (NER) with LUKE, LukeForEntitySpanClassification is more suitable than this class.

This model inherits from PreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).

This model is also a PyTorch torch.nn.Module subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.

forward

( input_ids: typing.Optional[torch.LongTensor] = None attention_mask: typing.Optional[torch.FloatTensor] = None token_type_ids: typing.Optional[torch.LongTensor] = None position_ids: typing.Optional[torch.LongTensor] = None entity_ids: typing.Optional[torch.LongTensor] = None entity_attention_mask: typing.Optional[torch.FloatTensor] = None entity_token_type_ids: typing.Optional[torch.LongTensor] = None entity_position_ids: typing.Optional[torch.LongTensor] = None head_mask: typing.Optional[torch.FloatTensor] = None inputs_embeds: typing.Optional[torch.FloatTensor] = None labels: typing.Optional[torch.FloatTensor] = None output_attentions: typing.Optional[bool] = None output_hidden_states: typing.Optional[bool] = None return_dict: typing.Optional[bool] = None ) → transformers.models.luke.modeling_luke.LukeTokenClassifierOutput or tuple(torch.FloatTensor)

Parameters

Returns

transformers.models.luke.modeling_luke.LukeTokenClassifierOutput or tuple(torch.FloatTensor)

A transformers.models.luke.modeling_luke.LukeTokenClassifierOutput or a tuple of torch.FloatTensor (if return_dict=False is passed or when config.return_dict=False) comprising various elements depending on the configuration (LukeConfig) and inputs.

The LukeForTokenClassification forward method overrides the __call__ special method.

Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this function, since the former takes care of running the pre- and post-processing steps while the latter silently ignores them.

Example:

import torch
from transformers import AutoTokenizer, LukeForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("studio-ousia/luke-base")
model = LukeForTokenClassification.from_pretrained("studio-ousia/luke-base")

inputs = tokenizer(
    "HuggingFace is a company based in Paris and New York", add_special_tokens=False, return_tensors="pt"
)

with torch.no_grad():
    logits = model(**inputs).logits

predicted_token_class_ids = logits.argmax(-1)

# Tokens are classified rather than input words, so there may be more
# predicted classes than words.
predicted_tokens_classes = [model.config.id2label[t.item()] for t in predicted_token_class_ids[0]]

labels = predicted_token_class_ids
loss = model(**inputs, labels=labels).loss

LukeForQuestionAnswering

class transformers.LukeForQuestionAnswering

( config )

Parameters

The LUKE Model with a span classification head on top for extractive question-answering tasks like SQuAD (a linear layer on top of the hidden-states output to compute span start logits and span end logits).

This model inherits from PreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).

This model is also a PyTorch torch.nn.Module subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.

forward

( input_ids: typing.Optional[torch.LongTensor] = None attention_mask: typing.Optional[torch.FloatTensor] = None token_type_ids: typing.Optional[torch.LongTensor] = None position_ids: typing.Optional[torch.FloatTensor] = None entity_ids: typing.Optional[torch.LongTensor] = None entity_attention_mask: typing.Optional[torch.FloatTensor] = None entity_token_type_ids: typing.Optional[torch.LongTensor] = None entity_position_ids: typing.Optional[torch.LongTensor] = None head_mask: typing.Optional[torch.FloatTensor] = None inputs_embeds: typing.Optional[torch.FloatTensor] = None start_positions: typing.Optional[torch.LongTensor] = None end_positions: typing.Optional[torch.LongTensor] = None output_attentions: typing.Optional[bool] = None output_hidden_states: typing.Optional[bool] = None return_dict: typing.Optional[bool] = None ) → transformers.models.luke.modeling_luke.LukeQuestionAnsweringModelOutput or tuple(torch.FloatTensor)

Parameters

Returns

transformers.models.luke.modeling_luke.LukeQuestionAnsweringModelOutput or tuple(torch.FloatTensor)

A transformers.models.luke.modeling_luke.LukeQuestionAnsweringModelOutput or a tuple of torch.FloatTensor (if return_dict=False is passed or when config.return_dict=False) comprising various elements depending on the configuration (LukeConfig) and inputs.

The LukeForQuestionAnswering forward method overrides the __call__ special method.

Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this function, since the former takes care of running the pre- and post-processing steps while the latter silently ignores them.

Example:

import torch
from transformers import AutoTokenizer, LukeForQuestionAnswering

tokenizer = AutoTokenizer.from_pretrained("studio-ousia/luke-base")
model = LukeForQuestionAnswering.from_pretrained("studio-ousia/luke-base")

question, text = "Who was Jim Henson?", "Jim Henson was a nice puppet"

inputs = tokenizer(question, text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

answer_start_index = outputs.start_logits.argmax()
answer_end_index = outputs.end_logits.argmax()

predict_answer_tokens = inputs.input_ids[0, answer_start_index : answer_end_index + 1]

# Token positions of the target answer span
target_start_index = torch.tensor([14])
target_end_index = torch.tensor([15])

outputs = model(**inputs, start_positions=target_start_index, end_positions=target_end_index)
loss = outputs.loss
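Taking the two argmax values independently can produce an end index that precedes the start index. A common post-processing step, sketched below in plain Python with made-up logits, is to pick the pair with start <= end that maximizes the summed score. Here best_span and max_answer_len are illustrative names, not part of the LUKE API.

```python
def best_span(start_logits, end_logits, max_answer_len=30):
    """Return (start, end) with start <= end maximizing start_logits[start] + end_logits[end]."""
    best, best_score = (0, 0), float("-inf")
    for start, s_score in enumerate(start_logits):
        stop = min(len(end_logits), start + max_answer_len)
        for end in range(start, stop):
            score = s_score + end_logits[end]
            if score > best_score:
                best, best_score = (start, end), score
    return best


start_logits = [0.1, 0.2, 3.0, 0.5]
end_logits = [2.5, 0.1, 0.3, 2.0]  # independent argmax would give end=0 < start=2
print(best_span(start_logits, end_logits))  # (2, 3)
```

The returned indices are inclusive token positions, so the answer tokens would be sliced as input_ids[start : end + 1], matching the slicing used in the example above.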
