RAG

TFRagModel

class transformers.TFRagModel

< source >

( config: Optional[PretrainedConfig] = None question_encoder: Optional[TFPreTrainedModel] = None generator: Optional[TFPreTrainedModel] = None retriever: Optional[RagRetriever] = None load_weight_prefix: Optional[str] = None **kwargs )

Parameters


RAG is a sequence-to-sequence model which encapsulates two core components: a question encoder and a generator. During a forward pass, we encode the input with the question encoder and pass it to the retriever to extract relevant context documents. The documents are then prepended to the input, and these contextualized inputs are passed to the generator.

The question encoder can be any autoencoding model, preferably TFDPRQuestionEncoder, and the generator can be any seq2seq model, preferably TFBartForConditionalGeneration.

The model can be initialized with a RagRetriever for end-to-end generation or used in combination with the outputs of a retriever in multiple steps (see examples for more details). The model is compatible with any autoencoding model as the question_encoder and any seq2seq model with a language model head as the generator. It has been tested with TFDPRQuestionEncoder as the question_encoder and TFBartForConditionalGeneration as the generator.

This model inherits from TFPreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).

This model is also a TensorFlow keras.Model subclass. Use it as a regular TF 2.0 Keras Model and refer to the TF 2.0 documentation for all matters related to general usage and behavior.

The model is in a developing state: it is currently fully supported in eager mode only, and may not be exportable in SavedModel format.

call

< source >

( input_ids: TFModelInputType | None = None attention_mask: np.ndarray | tf.Tensor | None = None encoder_outputs: np.ndarray | tf.Tensor | None = None decoder_input_ids: np.ndarray | tf.Tensor | None = None decoder_attention_mask: np.ndarray | tf.Tensor | None = None past_key_values: Tuple[Tuple[Union[np.ndarray, tf.Tensor]]] | None = None doc_scores: np.ndarray | tf.Tensor | None = None context_input_ids: np.ndarray | tf.Tensor | None = None context_attention_mask: np.ndarray | tf.Tensor | None = None use_cache: bool | None = None output_attentions: bool | None = None output_hidden_states: bool | None = None output_retrieved: bool | None = None n_docs: int | None = None return_dict: bool | None = None training: bool = False **kwargs ) → transformers.models.rag.modeling_tf_rag.TFRetrievAugLMOutput or tuple(tf.Tensor)

Parameters

Returns

transformers.models.rag.modeling_tf_rag.TFRetrievAugLMOutput or tuple(tf.Tensor)

A transformers.models.rag.modeling_tf_rag.TFRetrievAugLMOutput or a tuple of tf.Tensor (if return_dict=False is passed or when config.return_dict=False) comprising various elements depending on the configuration (RagConfig) and inputs.
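A minimal sketch of consuming this return value (it assumes a model and input_ids prepared as in the example below; field names such as logits and doc_scores are taken from the output class, but check the TFRetrievAugLMOutput docstring for the authoritative list):

outputs = model(input_ids)  # return_dict defaults to True: a TFRetrievAugLMOutput
logits = outputs.logits  # attribute access on the output object
doc_scores = outputs.doc_scores  # retrieval scores, shape (batch_size, n_docs)

outputs = model(input_ids, return_dict=False)  # a plain tuple instead
logits = outputs[0]  # positional access, in field order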

The TFRagModel forward method overrides the __call__ special method.

Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the pre- and post-processing steps while the latter silently ignores them.

Example:

from transformers import AutoTokenizer, RagRetriever, TFRagModel

tokenizer = AutoTokenizer.from_pretrained("facebook/rag-token-base")
retriever = RagRetriever.from_pretrained(
    "facebook/rag-token-base", index_name="exact", use_dummy_dataset=True
)

# initialize with RagRetriever to do everything in one forward call
model = TFRagModel.from_pretrained("facebook/rag-token-base", retriever=retriever, from_pt=True)

input_dict = tokenizer.prepare_seq2seq_batch(
    "How many people live in Paris?", "In Paris, there are 10 million people.", return_tensors="tf"
)
input_ids = input_dict["input_ids"]
outputs = model(input_ids)

TFRagSequenceForGeneration

class transformers.TFRagSequenceForGeneration

< source >

( config: Optional[PretrainedConfig] = None question_encoder: Optional[TFPreTrainedModel] = None generator: Optional[TFPreTrainedModel] = None retriever: Optional[RagRetriever] = None **kwargs )

Parameters


A TF RAG-sequence model implementation. It performs RAG-sequence specific marginalization in the forward pass.
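For reference, a sketch of the sequence-level marginalization, following the RAG paper (Lewis et al., 2020): each retrieved document $z$ is used to score the whole output sequence, and the per-document sequence likelihoods are mixed with the retrieval scores,

$$p(y \mid x) \approx \sum_{z \,\in\, \text{top-}k(p_\eta(\cdot \mid x))} p_\eta(z \mid x) \prod_{i=1}^{N} p_\theta(y_i \mid x, z, y_{1:i-1})$$

where $p_\eta(z \mid x)$ is the retriever's score for document $z$ (the doc_scores below) and $p_\theta$ is the generator.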

RAG is a sequence-to-sequence model which encapsulates two core components: a question encoder and a generator. During a forward pass, we encode the input with the question encoder and pass it to the retriever to extract relevant context documents. The documents are then prepended to the input, and these contextualized inputs are passed to the generator.

The question encoder can be any autoencoding model, preferably TFDPRQuestionEncoder, and the generator can be any seq2seq model, preferably TFBartForConditionalGeneration.

The model can be initialized with a RagRetriever for end-to-end generation or used in combination with the outputs of a retriever in multiple steps (see examples for more details). The model is compatible with any autoencoding model as the question_encoder and any seq2seq model with a language model head as the generator. It has been tested with TFDPRQuestionEncoder as the question_encoder and TFBartForConditionalGeneration as the generator.

This model inherits from TFPreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).

This model is also a TensorFlow keras.Model subclass. Use it as a regular TF 2.0 Keras Model and refer to the TF 2.0 documentation for all matters related to general usage and behavior.

The model is in a developing state: it is currently fully supported in eager mode only, and may not be exportable in SavedModel format.

call

< source >

( input_ids: TFModelInputType | None = None attention_mask: np.ndarray | tf.Tensor | None = None decoder_input_ids: np.ndarray | tf.Tensor | None = None decoder_attention_mask: np.ndarray | tf.Tensor | None = None encoder_outputs: np.ndarray | tf.Tensor | None = None past_key_values: Optional[Tuple[Tuple[Union[np.ndarray, tf.Tensor]]]] = None doc_scores: np.ndarray | tf.Tensor | None = None context_input_ids: np.ndarray | tf.Tensor | None = None context_attention_mask: np.ndarray | tf.Tensor | None = None use_cache: Optional[bool] = None output_attentions: Optional[bool] = None output_hidden_states: Optional[bool] = None output_retrieved: Optional[bool] = None n_docs: Optional[int] = None exclude_bos_score: Optional[bool] = None labels: np.ndarray | tf.Tensor | None = None reduce_loss: Optional[bool] = None return_dict: Optional[bool] = None training: bool = False **kwargs ) → transformers.models.rag.modeling_tf_rag.TFRetrievAugLMMarginOutput or tuple(tf.Tensor)

Parameters

Returns

transformers.models.rag.modeling_tf_rag.TFRetrievAugLMMarginOutput or tuple(tf.Tensor)

A transformers.models.rag.modeling_tf_rag.TFRetrievAugLMMarginOutput or a tuple of tf.Tensor (if return_dict=False is passed or when config.return_dict=False) comprising various elements depending on the configuration (RagConfig) and inputs.

The TFRagSequenceForGeneration forward method overrides the __call__ special method.

Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the pre- and post-processing steps while the latter silently ignores them.

Example:

import tensorflow as tf
from transformers import AutoTokenizer, RagRetriever, TFRagSequenceForGeneration

tokenizer = AutoTokenizer.from_pretrained("facebook/rag-sequence-nq")
retriever = RagRetriever.from_pretrained(
    "facebook/rag-sequence-nq", index_name="exact", use_dummy_dataset=True
)

# initialize with RagRetriever to do everything in one forward call
model = TFRagSequenceForGeneration.from_pretrained(
    "facebook/rag-sequence-nq", retriever=retriever, from_pt=True
)

input_dict = tokenizer.prepare_seq2seq_batch(
    "How many people live in Paris?", "In Paris, there are 10 million people.", return_tensors="tf"
)
outputs = model(input_dict, output_retrieved=True)

# or use the retriever separately
# 1. Encode
input_ids = input_dict["input_ids"]
question_hidden_states = model.question_encoder(input_ids)[0]

# 2. Retrieve
docs_dict = retriever(input_ids.numpy(), question_hidden_states.numpy(), return_tensors="tf")
doc_scores = tf.squeeze(
    tf.matmul(
        tf.expand_dims(question_hidden_states, axis=1), docs_dict["retrieved_doc_embeds"], transpose_b=True
    ),
    axis=1,
)

# 3. Forward to generator
outputs = model(
    inputs=None,
    context_input_ids=docs_dict["context_input_ids"],
    context_attention_mask=docs_dict["context_attention_mask"],
    doc_scores=doc_scores,
    decoder_input_ids=input_dict["labels"],
)

# or generate directly from the retrieved documents
generated = model.generate(
    context_input_ids=docs_dict["context_input_ids"],
    context_attention_mask=docs_dict["context_attention_mask"],
    doc_scores=doc_scores,
)
generated_string = tokenizer.batch_decode(generated, skip_special_tokens=True)

generate

< source >

( input_ids: TFModelInputType | None = None attention_mask: tf.Tensor | None = None context_input_ids = None context_attention_mask = None doc_scores = None do_deduplication = None num_return_sequences = None num_beams = None n_docs = None **model_kwargs ) → tf.Tensor of shape (batch_size * num_return_sequences, sequence_length)

Parameters

Returns

tf.Tensor of shape (batch_size * num_return_sequences, sequence_length)

The generated sequences. The second dimension (sequence length) is either equal to max_length or shorter if all batches finished early due to the eos_token_id.

Implements RAG sequence "thorough" decoding. Read the generate() documentation for more information on how to set other generate input parameters.
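A short usage sketch of thorough decoding (it assumes the tokenizer and retriever-initialized model from the example above and uses only parameters from the signature; the values are illustrative, not recommendations):

input_ids = tokenizer("How many people live in Paris?", return_tensors="tf").input_ids
generated = model.generate(
    input_ids=input_ids,
    num_beams=4,  # beam search over each candidate document
    num_return_sequences=2,  # return the two best hypotheses per input
    do_deduplication=True,  # drop duplicate generations across documents
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))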

TFRagTokenForGeneration

class transformers.TFRagTokenForGeneration

< source >

( config: Optional[PretrainedConfig] = None question_encoder: Optional[TFPreTrainedModel] = None generator: Optional[TFPreTrainedModel] = None retriever: Optional[RagRetriever] = None **kwargs )

Parameters


A TF RAG-token model implementation. It performs RAG-token specific marginalization in the forward pass.
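For reference, a sketch of the token-level marginalization, following the RAG paper (Lewis et al., 2020): the retrieved documents are marginalized out at every decoding step, so different documents can contribute to different tokens,

$$p(y \mid x) \approx \prod_{i=1}^{N} \sum_{z \,\in\, \text{top-}k(p_\eta(\cdot \mid x))} p_\eta(z \mid x)\, p_\theta(y_i \mid x, z, y_{1:i-1})$$

with $p_\eta$ the retriever distribution over documents and $p_\theta$ the generator.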

RAG is a sequence-to-sequence model which encapsulates two core components: a question encoder and a generator. During a forward pass, we encode the input with the question encoder and pass it to the retriever to extract relevant context documents. The documents are then prepended to the input, and these contextualized inputs are passed to the generator.

The question encoder can be any autoencoding model, preferably TFDPRQuestionEncoder, and the generator can be any seq2seq model, preferably TFBartForConditionalGeneration.

The model can be initialized with a RagRetriever for end-to-end generation or used in combination with the outputs of a retriever in multiple steps (see examples for more details). The model is compatible with any autoencoding model as the question_encoder and any seq2seq model with a language model head as the generator. It has been tested with TFDPRQuestionEncoder as the question_encoder and TFBartForConditionalGeneration as the generator.

This model inherits from TFPreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads, etc.).

This model is also a TensorFlow keras.Model subclass. Use it as a regular TF 2.0 Keras Model and refer to the TF 2.0 documentation for all matters related to general usage and behavior.

The model is in a developing state: it is currently fully supported in eager mode only, and may not be exportable in SavedModel format.

call

< source >

( input_ids: TFModelInputType | None = None attention_mask: np.ndarray | tf.Tensor | None = None decoder_input_ids: np.ndarray | tf.Tensor | None = None decoder_attention_mask: np.ndarray | tf.Tensor | None = None encoder_outputs: np.ndarray | tf.Tensor | None = None past_key_values: Tuple[Tuple[Union[np.ndarray, tf.Tensor]]] | None = None doc_scores: np.ndarray | tf.Tensor | None = None context_input_ids: np.ndarray | tf.Tensor | None = None context_attention_mask: np.ndarray | tf.Tensor | None = None use_cache: bool | None = None output_attentions: bool | None = None output_hidden_states: bool | None = None output_retrieved: bool | None = None n_docs: int | None = None do_marginalize: bool | None = None labels: np.ndarray | tf.Tensor | None = None reduce_loss: bool | None = None return_dict: bool | None = None training: bool = False **kwargs ) → transformers.models.rag.modeling_tf_rag.TFRetrievAugLMMarginOutput or tuple(tf.Tensor)

Parameters

Returns

transformers.models.rag.modeling_tf_rag.TFRetrievAugLMMarginOutput or tuple(tf.Tensor)

A transformers.models.rag.modeling_tf_rag.TFRetrievAugLMMarginOutput or a tuple of tf.Tensor (if return_dict=False is passed or when config.return_dict=False) comprising various elements depending on the configuration (RagConfig) and inputs.

The TFRagTokenForGeneration forward method overrides the __call__ special method.

Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the pre- and post-processing steps while the latter silently ignores them.

Example:

import tensorflow as tf
from transformers import AutoTokenizer, RagRetriever, TFRagTokenForGeneration

tokenizer = AutoTokenizer.from_pretrained("facebook/rag-token-nq")
retriever = RagRetriever.from_pretrained(
    "facebook/rag-token-nq", index_name="exact", use_dummy_dataset=True
)

# initialize with RagRetriever to do everything in one forward call
model = TFRagTokenForGeneration.from_pretrained("facebook/rag-token-nq", retriever=retriever, from_pt=True)

input_dict = tokenizer.prepare_seq2seq_batch(
    "How many people live in Paris?", "In Paris, there are 10 million people.", return_tensors="tf"
)
outputs = model(input_dict, output_retrieved=True)

# or use the retriever separately
# 1. Encode
input_ids = input_dict["input_ids"]
question_hidden_states = model.question_encoder(input_ids)[0]

# 2. Retrieve
docs_dict = retriever(input_ids.numpy(), question_hidden_states.numpy(), return_tensors="tf")
doc_scores = tf.squeeze(
    tf.matmul(
        tf.expand_dims(question_hidden_states, axis=1), docs_dict["retrieved_doc_embeds"], transpose_b=True
    ),
    axis=1,
)

# 3. Forward to generator
outputs = model(
    inputs=None,
    context_input_ids=docs_dict["context_input_ids"],
    context_attention_mask=docs_dict["context_attention_mask"],
    doc_scores=doc_scores,
    decoder_input_ids=input_dict["labels"],
)

# or generate directly from the retrieved documents
generated = model.generate(
    context_input_ids=docs_dict["context_input_ids"],
    context_attention_mask=docs_dict["context_attention_mask"],
    doc_scores=doc_scores,
)
generated_string = tokenizer.batch_decode(generated, skip_special_tokens=True)

generate

< source >

( input_ids: TFModelInputType | None = None attention_mask: tf.Tensor | None = None context_input_ids = None context_attention_mask = None doc_scores = None n_docs = None generation_config = None logits_processor = [] **kwargs ) → tf.Tensor of shape (batch_size * num_return_sequences, sequence_length)

Parameters

Returns

tf.Tensor of shape (batch_size * num_return_sequences, sequence_length)

The generated sequences. The second dimension (sequence_length) is either equal to max_length or shorter if all batches finished early due to the eos_token_id.

Implements TF RAG token decoding.
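A short usage sketch (it assumes the tokenizer and retriever-initialized model from the example above): since the model holds a RagRetriever, generate() can also be driven directly from input_ids and will retrieve internally:

input_ids = tokenizer("How many people live in Paris?", return_tensors="tf").input_ids
generated = model.generate(input_ids, n_docs=5)  # n_docs documents are retrieved per query
print(tokenizer.batch_decode(generated, skip_special_tokens=True))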