T5

T5Model

class transformers.T5Model


( config: T5Config )

Parameters

config (T5Config) — Model configuration class with all the parameters of the model. Initializing with a config file does not load the weights associated with the model, only the configuration. Check out the from_pretrained() method to load the model weights.

The bare T5 Model outputting raw hidden-states without any specific head on top.

This model inherits from PreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).

This model is also a PyTorch torch.nn.Module subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.

forward


( input_ids: typing.Optional[torch.LongTensor] = None attention_mask: typing.Optional[torch.FloatTensor] = None decoder_input_ids: typing.Optional[torch.LongTensor] = None decoder_attention_mask: typing.Optional[torch.BoolTensor] = None head_mask: typing.Optional[torch.FloatTensor] = None decoder_head_mask: typing.Optional[torch.FloatTensor] = None cross_attn_head_mask: typing.Optional[torch.Tensor] = None encoder_outputs: typing.Optional[tuple[tuple[torch.FloatTensor]]] = None past_key_values: typing.Optional[tuple[tuple[torch.FloatTensor]]] = None inputs_embeds: typing.Optional[torch.Tensor] = None decoder_inputs_embeds: typing.Optional[torch.Tensor] = None use_cache: typing.Optional[bool] = None output_attentions: typing.Optional[bool] = None output_hidden_states: typing.Optional[bool] = None return_dict: typing.Optional[bool] = None cache_position: typing.Optional[torch.LongTensor] = None ) → transformers.modeling_outputs.Seq2SeqModelOutput or tuple(torch.FloatTensor)

Returns

A transformers.modeling_outputs.Seq2SeqModelOutput or a tuple of torch.FloatTensor (if return_dict=False is passed or when config.return_dict=False) comprising various elements depending on the configuration (T5Config) and inputs.

The T5Model forward method overrides the __call__ special method.

Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the pre- and post-processing steps while the latter silently ignores them.

Example:

from transformers import AutoTokenizer, T5Model

tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-small")
model = T5Model.from_pretrained("google-t5/t5-small")

input_ids = tokenizer(
    "Studies have been shown that owning a dog is good for you", return_tensors="pt"
).input_ids  # Batch size 1
decoder_input_ids = tokenizer("Studies show that", return_tensors="pt").input_ids  # Batch size 1

# preprocess: prepend decoder_start_token_id
decoder_input_ids = model._shift_right(decoder_input_ids)

# forward pass
outputs = model(input_ids=input_ids, decoder_input_ids=decoder_input_ids)
last_hidden_states = outputs.last_hidden_state
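For reference, here is a minimal sketch of what _shift_right does, assuming (as holds for the released T5 checkpoints) that the pad token id doubles as the decoder_start_token_id; the real method additionally replaces any -100 label placeholders with the pad token id:

import torch

def shift_right(input_ids: torch.Tensor, decoder_start_token_id: int) -> torch.Tensor:
    # Move every token one position to the right, dropping the last one...
    shifted = input_ids.new_zeros(input_ids.shape)
    shifted[:, 1:] = input_ids[:, :-1].clone()
    # ...then prepend the decoder start token (the pad token for T5).
    shifted[:, 0] = decoder_start_token_id
    return shifted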

T5ForConditionalGeneration

class transformers.T5ForConditionalGeneration


( config: T5Config )

Parameters

config (T5Config) — Model configuration class with all the parameters of the model. Initializing with a config file does not load the weights associated with the model, only the configuration. Check out the from_pretrained() method to load the model weights.

T5 Model with a language modeling head on top.

This model inherits from PreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).

This model is also a PyTorch torch.nn.Module subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.

forward


( input_ids: typing.Optional[torch.LongTensor] = None attention_mask: typing.Optional[torch.FloatTensor] = None decoder_input_ids: typing.Optional[torch.LongTensor] = None decoder_attention_mask: typing.Optional[torch.BoolTensor] = None head_mask: typing.Optional[torch.FloatTensor] = None decoder_head_mask: typing.Optional[torch.FloatTensor] = None cross_attn_head_mask: typing.Optional[torch.Tensor] = None encoder_outputs: typing.Optional[tuple[tuple[torch.Tensor]]] = None past_key_values: typing.Optional[tuple[tuple[torch.Tensor]]] = None inputs_embeds: typing.Optional[torch.FloatTensor] = None decoder_inputs_embeds: typing.Optional[torch.FloatTensor] = None labels: typing.Optional[torch.LongTensor] = None use_cache: typing.Optional[bool] = None output_attentions: typing.Optional[bool] = None output_hidden_states: typing.Optional[bool] = None return_dict: typing.Optional[bool] = None cache_position: typing.Optional[torch.LongTensor] = None ) → transformers.modeling_outputs.Seq2SeqLMOutput or tuple(torch.FloatTensor)

Returns

A transformers.modeling_outputs.Seq2SeqLMOutput or a tuple of torch.FloatTensor (if return_dict=False is passed or when config.return_dict=False) comprising various elements depending on the configuration (T5Config) and inputs.

The T5ForConditionalGeneration forward method overrides the __call__ special method.

Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the pre- and post-processing steps while the latter silently ignores them.

Examples:

from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-small")
model = T5ForConditionalGeneration.from_pretrained("google-t5/t5-small")

# training
input_ids = tokenizer("The <extra_id_0> walks in <extra_id_1> park", return_tensors="pt").input_ids
labels = tokenizer("<extra_id_0> cute dog <extra_id_1> the <extra_id_2>", return_tensors="pt").input_ids
outputs = model(input_ids=input_ids, labels=labels)
loss = outputs.loss
logits = outputs.logits

# inference
input_ids = tokenizer(
    "summarize: studies have shown that owning a dog is good for you", return_tensors="pt"
).input_ids
outputs = model.generate(input_ids)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
# studies have shown that owning a dog is good for you.
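When fine-tuning on padded batches, the pad positions in labels should not contribute to the loss. A minimal sketch of the usual convention, with a made-up translation pair; -100 is the ignore index used by the loss in Transformers:

import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-small")
model = T5ForConditionalGeneration.from_pretrained("google-t5/t5-small")

# hypothetical batch, padded to a common length
enc = tokenizer(["translate English to German: Hello there"], padding=True, return_tensors="pt")
labels = tokenizer(["Hallo zusammen"], padding=True, return_tensors="pt").input_ids

# mask padding positions so the cross-entropy loss ignores them
labels[labels == tokenizer.pad_token_id] = -100

loss = model(input_ids=enc.input_ids, attention_mask=enc.attention_mask, labels=labels).loss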

T5EncoderModel

class transformers.T5EncoderModel


( config: T5Config )

Parameters

config (T5Config) — Model configuration class with all the parameters of the model. Initializing with a config file does not load the weights associated with the model, only the configuration. Check out the from_pretrained() method to load the model weights.

The bare T5 Model outputting raw hidden-states without any specific head on top.

This model inherits from PreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).

This model is also a PyTorch torch.nn.Module subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.

forward


( input_ids: typing.Optional[torch.LongTensor] = None attention_mask: typing.Optional[torch.FloatTensor] = None head_mask: typing.Optional[torch.FloatTensor] = None inputs_embeds: typing.Optional[torch.FloatTensor] = None output_attentions: typing.Optional[bool] = None output_hidden_states: typing.Optional[bool] = None return_dict: typing.Optional[bool] = None ) → transformers.modeling_outputs.BaseModelOutput or tuple(torch.FloatTensor)

Returns

A transformers.modeling_outputs.BaseModelOutput or a tuple of torch.FloatTensor (if return_dict=False is passed or when config.return_dict=False) comprising various elements depending on the configuration (T5Config) and inputs.

The T5EncoderModel forward method overrides the __call__ special method.

Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the pre- and post-processing steps while the latter silently ignores them.

Example:

from transformers import AutoTokenizer, T5EncoderModel

tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-small")
model = T5EncoderModel.from_pretrained("google-t5/t5-small")
input_ids = tokenizer(
    "Studies have been shown that owning a dog is good for you", return_tensors="pt"
).input_ids  # Batch size 1
outputs = model(input_ids=input_ids)
last_hidden_states = outputs.last_hidden_state
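The encoder returns one hidden state per token. To reduce these to a single sentence vector, one common choice (an assumption of this sketch, not something T5EncoderModel does for you) is attention-mask-aware mean pooling:

import torch
from transformers import AutoTokenizer, T5EncoderModel

tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-small")
model = T5EncoderModel.from_pretrained("google-t5/t5-small")

inputs = tokenizer(["owning a dog is good for you"], padding=True, return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state  # (batch, seq_len, d_model)

# zero out padding positions, then average over the real tokens only
mask = inputs.attention_mask.unsqueeze(-1).float()  # (batch, seq_len, 1)
sentence_embeddings = (hidden * mask).sum(dim=1) / mask.sum(dim=1)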

T5ForSequenceClassification

class transformers.T5ForSequenceClassification


( config: T5Config )

Parameters

config (T5Config) — Model configuration class with all the parameters of the model. Initializing with a config file does not load the weights associated with the model, only the configuration. Check out the from_pretrained() method to load the model weights.

T5 model with a sequence classification head on top (a linear layer on top of the pooled output), e.g. for GLUE tasks.

This model inherits from PreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).

This model is also a PyTorch torch.nn.Module subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.

forward


( input_ids: typing.Optional[torch.LongTensor] = None attention_mask: typing.Optional[torch.Tensor] = None decoder_input_ids: typing.Optional[torch.LongTensor] = None decoder_attention_mask: typing.Optional[torch.LongTensor] = None head_mask: typing.Optional[torch.Tensor] = None decoder_head_mask: typing.Optional[torch.Tensor] = None cross_attn_head_mask: typing.Optional[torch.Tensor] = None encoder_outputs: typing.Optional[list[torch.FloatTensor]] = None inputs_embeds: typing.Optional[torch.FloatTensor] = None decoder_inputs_embeds: typing.Optional[torch.FloatTensor] = None labels: typing.Optional[torch.LongTensor] = None use_cache: typing.Optional[bool] = None output_attentions: typing.Optional[bool] = None output_hidden_states: typing.Optional[bool] = None return_dict: typing.Optional[bool] = None ) → transformers.modeling_outputs.Seq2SeqSequenceClassifierOutput or tuple(torch.FloatTensor)

Returns

A transformers.modeling_outputs.Seq2SeqSequenceClassifierOutput or a tuple of torch.FloatTensor (if return_dict=False is passed or when config.return_dict=False) comprising various elements depending on the configuration (T5Config) and inputs.

The T5ForSequenceClassification forward method overrides the __call__ special method.

Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the pre- and post-processing steps while the latter silently ignores them.

Example of single-label classification:

import torch
from transformers import AutoTokenizer, T5ForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-small")
model = T5ForSequenceClassification.from_pretrained("google-t5/t5-small")

inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

predicted_class_id = logits.argmax().item()
model.config.id2label[predicted_class_id]

# To train a model on `num_labels` classes, pass `num_labels` to `from_pretrained`
num_labels = len(model.config.id2label)
model = T5ForSequenceClassification.from_pretrained("google-t5/t5-small", num_labels=num_labels)

labels = torch.tensor([1])
loss = model(**inputs, labels=labels).loss
round(loss.item(), 2)

Example of multi-label classification:

import torch
from transformers import AutoTokenizer, T5ForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-small")
model = T5ForSequenceClassification.from_pretrained(
    "google-t5/t5-small", problem_type="multi_label_classification"
)

inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

predicted_class_ids = torch.arange(0, logits.shape[-1])[torch.sigmoid(logits).squeeze(dim=0) > 0.5]

# To train a model on `num_labels` classes, pass `num_labels` to `from_pretrained`
num_labels = len(model.config.id2label)
model = T5ForSequenceClassification.from_pretrained(
    "google-t5/t5-small", num_labels=num_labels, problem_type="multi_label_classification"
)

labels = torch.sum(
    torch.nn.functional.one_hot(predicted_class_ids[None, :].clone(), num_classes=num_labels), dim=1
).to(torch.float)
loss = model(**inputs, labels=labels).loss
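Continuing from the multi-label example above, the predicted class indices can be mapped to names through the config's id2label; for a checkpoint that has not been fine-tuned these are placeholder names such as "LABEL_0":

# map predicted class indices to label names via the config
predicted_labels = [model.config.id2label[idx.item()] for idx in predicted_class_ids]
print(predicted_labels)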

T5ForTokenClassification

class transformers.T5ForTokenClassification


( config: T5Config )

Parameters

config (T5Config) — Model configuration class with all the parameters of the model. Initializing with a config file does not load the weights associated with the model, only the configuration. Check out the from_pretrained() method to load the model weights.

The T5 transformer with a token classification head on top (a linear layer on top of the hidden-states output), e.g. for Named-Entity Recognition (NER) tasks.

This model inherits from PreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).

This model is also a PyTorch torch.nn.Module subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.

forward


( input_ids: typing.Optional[torch.Tensor] = None attention_mask: typing.Optional[torch.Tensor] = None head_mask: typing.Optional[torch.Tensor] = None inputs_embeds: typing.Optional[torch.Tensor] = None labels: typing.Optional[torch.Tensor] = None output_attentions: typing.Optional[bool] = None output_hidden_states: typing.Optional[bool] = None return_dict: typing.Optional[bool] = None ) → transformers.modeling_outputs.TokenClassifierOutput or tuple(torch.FloatTensor)

Returns

A transformers.modeling_outputs.TokenClassifierOutput or a tuple of torch.FloatTensor (if return_dict=False is passed or when config.return_dict=False) comprising various elements depending on the configuration (T5Config) and inputs.

The T5ForTokenClassification forward method overrides the __call__ special method.

Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the pre- and post-processing steps while the latter silently ignores them.

Example:

import torch
from transformers import AutoTokenizer, T5ForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-small")
model = T5ForTokenClassification.from_pretrained("google-t5/t5-small")

inputs = tokenizer(
    "HuggingFace is a company based in Paris and New York", add_special_tokens=False, return_tensors="pt"
)

with torch.no_grad():
    logits = model(**inputs).logits

predicted_token_class_ids = logits.argmax(-1)

# Note that tokens are classified rather than input words, so there may be
# more predicted token classes than words.
predicted_tokens_classes = [model.config.id2label[t.item()] for t in predicted_token_class_ids[0]]

labels = predicted_token_class_ids
loss = model(**inputs, labels=labels).loss
round(loss.item(), 2)
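Because the tokenizer splits words into subword pieces, word-level labels usually need to be realigned to tokens before training. A minimal sketch using the fast tokenizer's word_ids(), with a made-up label list; labeling only each word's first piece and masking the rest with -100 is one common convention:

import torch
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-small")

words = ["HuggingFace", "is", "based", "in", "Paris"]
word_labels = [1, 0, 0, 0, 2]  # hypothetical per-word class ids

enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")

aligned = []
previous_word = None
for word_id in enc.word_ids(batch_index=0):
    if word_id is None:  # special tokens (e.g. </s>)
        aligned.append(-100)
    elif word_id != previous_word:  # first sub-token of a word gets the label
        aligned.append(word_labels[word_id])
    else:  # remaining sub-tokens are ignored by the loss
        aligned.append(-100)
    previous_word = word_id

labels = torch.tensor([aligned])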

T5ForQuestionAnswering

class transformers.T5ForQuestionAnswering


( config: T5Config )

Parameters

config (T5Config) — Model configuration class with all the parameters of the model. Initializing with a config file does not load the weights associated with the model, only the configuration. Check out the from_pretrained() method to load the model weights.

The T5 transformer with a span classification head on top for extractive question-answering tasks like SQuAD (a linear layer on top of the hidden-states output to compute span start logits and span end logits).

This model inherits from PreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).

This model is also a PyTorch torch.nn.Module subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.

forward


( input_ids: typing.Optional[torch.LongTensor] = None attention_mask: typing.Optional[torch.FloatTensor] = None decoder_input_ids: typing.Optional[torch.LongTensor] = None decoder_attention_mask: typing.Optional[torch.BoolTensor] = None head_mask: typing.Optional[torch.FloatTensor] = None decoder_head_mask: typing.Optional[torch.FloatTensor] = None cross_attn_head_mask: typing.Optional[torch.Tensor] = None encoder_outputs: typing.Optional[tuple[tuple[torch.Tensor]]] = None start_positions: typing.Optional[torch.LongTensor] = None end_positions: typing.Optional[torch.LongTensor] = None inputs_embeds: typing.Optional[torch.FloatTensor] = None decoder_inputs_embeds: typing.Optional[torch.FloatTensor] = None use_cache: typing.Optional[bool] = None output_attentions: typing.Optional[bool] = None output_hidden_states: typing.Optional[bool] = None return_dict: typing.Optional[bool] = None ) → transformers.modeling_outputs.Seq2SeqQuestionAnsweringModelOutput or tuple(torch.FloatTensor)

Returns

A transformers.modeling_outputs.Seq2SeqQuestionAnsweringModelOutput or a tuple of torch.FloatTensor (if return_dict=False is passed or when config.return_dict=False) comprising various elements depending on the configuration (T5Config) and inputs.

The T5ForQuestionAnswering forward method overrides the __call__ special method.

Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the pre- and post-processing steps while the latter silently ignores them.

Example:

import torch
from transformers import AutoTokenizer, T5ForQuestionAnswering

tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-small")
model = T5ForQuestionAnswering.from_pretrained("google-t5/t5-small")

question, text = "Who was Jim Henson?", "Jim Henson was a nice puppet"

inputs = tokenizer(question, text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

answer_start_index = outputs.start_logits.argmax()
answer_end_index = outputs.end_logits.argmax()

predict_answer_tokens = inputs.input_ids[0, answer_start_index : answer_end_index + 1]
tokenizer.decode(predict_answer_tokens, skip_special_tokens=True)

# target span is "nice puppet"
target_start_index = torch.tensor([14])
target_end_index = torch.tensor([15])

outputs = model(**inputs, start_positions=target_start_index, end_positions=target_end_index)
loss = outputs.loss
round(loss.item(), 2)
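Taking independent argmaxes of the start and end logits can yield an end index before the start index. A minimal sketch of picking the best valid span by jointly scoring (start, end) pairs, continuing from the example above; the maximum answer length is an arbitrary choice:

import torch

max_answer_len = 16  # assumption: cap span length at 16 tokens
start_logits = outputs.start_logits[0]
end_logits = outputs.end_logits[0]

# joint score for every (start, end) pair
scores = start_logits[:, None] + end_logits[None, :]  # (seq_len, seq_len)
# keep only spans with end >= start and end - start < max_answer_len
valid = torch.triu(torch.ones_like(scores, dtype=torch.bool))
valid &= ~torch.triu(torch.ones_like(scores, dtype=torch.bool), diagonal=max_answer_len)
scores = scores.masked_fill(~valid, float("-inf"))

best = scores.flatten().argmax()
start = (best // scores.shape[1]).item()
end = (best % scores.shape[1]).item()
answer = tokenizer.decode(inputs.input_ids[0, start : end + 1], skip_special_tokens=True)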