Utilities for Generation

This page lists all the utility functions used by generate().

Generate Outputs

The output of generate() is an instance of a subclass of ModelOutput. This output is a data structure containing all the information returned by generate(), but it can also be used as a tuple or a dictionary.

Here’s an example:

from transformers import GPT2Tokenizer, GPT2LMHeadModel

tokenizer = GPT2Tokenizer.from_pretrained("openai-community/gpt2")
model = GPT2LMHeadModel.from_pretrained("openai-community/gpt2")

inputs = tokenizer("Hello, my dog is cute and ", return_tensors="pt")
generation_output = model.generate(**inputs, return_dict_in_generate=True, output_scores=True)

The generation_output object is a GenerateDecoderOnlyOutput. As the documentation of that class below shows, it has the following attributes:

sequences: the generated sequences of tokens
scores (optional): the prediction scores of the language modeling head, for each generation step
hidden_states (optional): the hidden states of the model, for each generation step
attentions (optional): the attention weights of the model, for each generation step

Here we have the scores since we passed along output_scores=True, but we don’t have hidden_states and attentions because we didn’t pass output_hidden_states=True or output_attentions=True.

You can access each attribute as you normally would, and if the model did not return that attribute, you will get None. Here, for instance, generation_output.scores holds all the generated prediction scores of the language modeling head, and generation_output.attentions is None.

When using our generation_output object as a tuple, it only keeps the attributes that don’t have None values. Here, for instance, it has two elements, sequences then scores, so generation_output[:2] will return the tuple (generation_output.sequences, generation_output.scores).

When using our generation_output object as a dictionary, it only keeps the attributes that don’t have None values. Here, for instance, it has two keys, sequences and scores.
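The attribute, tuple, and dictionary behavior described above can be illustrated without loading a model. The sketch below is a simplified, hypothetical stand-in (the SketchOutput class and its methods are assumptions made for illustration, not the library's ModelOutput implementation); it only mimics the rule that attributes left as None are skipped in the tuple and dictionary views.

```python
from dataclasses import dataclass, fields
from typing import Any, Optional


@dataclass
class SketchOutput:
    """Hypothetical stand-in for a ModelOutput subclass (not the library's code)."""

    sequences: Any = None
    scores: Optional[Any] = None
    hidden_states: Optional[Any] = None
    attentions: Optional[Any] = None

    def keys(self):
        # Dictionary view: only the attributes that are not None, in declaration order.
        return [f.name for f in fields(self) if getattr(self, f.name) is not None]

    def to_tuple(self):
        # Tuple view: the values of the same non-None attributes, in the same order.
        return tuple(getattr(self, name) for name in self.keys())

    def __getitem__(self, key):
        # String keys behave like a dictionary, integers and slices like a tuple.
        if isinstance(key, str):
            if key not in self.keys():
                raise KeyError(key)
            return getattr(self, key)
        return self.to_tuple()[key]


# Placeholder values standing in for real tensors.
out = SketchOutput(sequences=[[1, 2, 3]], scores=([0.1, 0.9],))

print(out.attentions)                          # None
print(out.keys())                              # ['sequences', 'scores']
print(out[:2] == (out.sequences, out.scores))  # True
```

With a real generation output, the same access patterns apply to tensors rather than the placeholder lists used here.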

We document here all output types.

PyTorch

class transformers.generation.GenerateDecoderOnlyOutput

class transformers.generation.GenerateEncoderDecoderOutput

class transformers.generation.GenerateBeamDecoderOnlyOutput

class transformers.generation.GenerateBeamEncoderDecoderOutput

TensorFlow

class transformers.generation.TFGreedySearchEncoderDecoderOutput

class transformers.generation.TFGreedySearchDecoderOnlyOutput

class transformers.generation.TFSampleEncoderDecoderOutput

class transformers.generation.TFSampleDecoderOnlyOutput

class transformers.generation.TFBeamSearchEncoderDecoderOutput

class transformers.generation.TFBeamSearchDecoderOnlyOutput

class transformers.generation.TFBeamSampleEncoderDecoderOutput

class transformers.generation.TFBeamSampleDecoderOnlyOutput

class transformers.generation.TFContrastiveSearchEncoderDecoderOutput

class transformers.generation.TFContrastiveSearchDecoderOnlyOutput

FLAX

class transformers.generation.FlaxSampleOutput

class transformers.generation.FlaxGreedySearchOutput

class transformers.generation.FlaxBeamSearchOutput

LogitsProcessor

PyTorch

class transformers.AlternatingCodebooksLogitsProcessor

__call__

class transformers.ClassifierFreeGuidanceLogitsProcessor

__call__

class transformers.EncoderNoRepeatNGramLogitsProcessor

__call__

class transformers.EncoderRepetitionPenaltyLogitsProcessor

With the encoder_repetition_penalty argument we can trigger this logits processor in generate, which can promote the use of prompt tokens ("Bob" in this example)

__call__
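To make the effect of encoder_repetition_penalty concrete, here is a minimal pure-Python sketch of how such a penalty can promote prompt tokens. It assumes the logits of tokens that appear in the encoder input are rescaled by the inverse of the penalty, so that a value above 1 raises them. The function name and the list-based interface are hypothetical, chosen for illustration; this is not the library's implementation, which operates on batched tensors.

```python
def encoder_repetition_penalty_sketch(scores, encoder_input_ids, penalty):
    """Return a copy of `scores` in which prompt tokens are boosted when penalty > 1.

    Hypothetical helper for illustration only.
    """
    if penalty <= 0:
        raise ValueError("penalty must be strictly positive")
    inverse = 1.0 / penalty
    adjusted = list(scores)
    for token_id in set(encoder_input_ids):
        score = scores[token_id]
        # With penalty > 1 (inverse < 1), shrinking a negative logit toward zero
        # and dividing a positive logit by `inverse` both increase the score.
        adjusted[token_id] = score * inverse if score < 0 else score / inverse
    return adjusted


scores = [2.0, -2.0, 1.0, -1.0]  # one logit per vocabulary entry
boosted = encoder_repetition_penalty_sketch(scores, encoder_input_ids=[0, 1], penalty=2.0)
print(boosted)  # [4.0, -1.0, 1.0, -1.0]
```

Tokens 0 and 1 appear in the prompt, so their logits rise, while tokens 2 and 3 are left unchanged.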

class transformers.EpsilonLogitsWarper

__call__

class transformers.EtaLogitsWarper

__call__

class transformers.ExponentialDecayLengthPenalty

__call__

class transformers.ForcedBOSTokenLogitsProcessor

__call__

class transformers.ForcedEOSTokenLogitsProcessor

__call__

class transformers.HammingDiversityLogitsProcessor

__call__

class transformers.InfNanRemoveLogitsProcessor

__call__

class transformers.LogitNormalization

__call__

__call__

class transformers.LogitsProcessorList

__call__

class transformers.MinLengthLogitsProcessor

__call__

class transformers.MinNewTokensLengthLogitsProcessor

__call__

class transformers.MinPLogitsWarper

__call__

class transformers.NoBadWordsLogitsProcessor

__call__

class transformers.NoRepeatNGramLogitsProcessor

__call__

class transformers.PrefixConstrainedLogitsProcessor

__call__

class transformers.RepetitionPenaltyLogitsProcessor

__call__

class transformers.SequenceBiasLogitsProcessor

__call__

class transformers.SuppressTokensAtBeginLogitsProcessor

__call__

class transformers.SuppressTokensLogitsProcessor

__call__

class transformers.SynthIDTextWatermarkLogitsProcessor

__call__

class transformers.TemperatureLogitsWarper

__call__

class transformers.TopKLogitsWarper

__call__

class transformers.TopPLogitsWarper

__call__
