Keras callbacks

When training a Transformers model with Keras, there are some library-specific callbacks available to automate common tasks:

class transformers.KerasMetricCallback

( metric_fn: typing.Callable eval_dataset: typing.Union[tensorflow.python.data.ops.dataset_ops.DatasetV2, numpy.ndarray, tensorflow.python.framework.tensor.Tensor, tuple, dict] output_cols: typing.Optional[list[str]] = None label_cols: typing.Optional[list[str]] = None batch_size: typing.Optional[int] = None predict_with_generate: bool = False use_xla_generation: bool = False generate_kwargs: typing.Optional[dict] = None )

Parameters

Callback to compute metrics at the end of every epoch. Unlike normal Keras metrics, these do not need to be compilable by TensorFlow. This makes the callback particularly useful for common NLP metrics like BLEU and ROUGE, which require string operations or generation loops that cannot be compiled. Predictions (or generations) are computed on the eval_dataset and then passed to the metric_fn in np.ndarray format. The metric_fn should compute metrics and return a dict mapping metric names to metric values.

We provide an example of a suitable metric_fn that computes ROUGE scores for a summarization model below. Note that this example skips some post-processing for readability and simplicity, and should probably not be used as-is!

from datasets import load_metric

rouge_metric = load_metric("rouge")

def rouge_fn(predictions, labels):
    decoded_predictions = tokenizer.batch_decode(predictions, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = rouge_metric.compute(predictions=decoded_predictions, references=decoded_labels)
    return {key: value.mid.fmeasure * 100 for key, value in result.items()}

The above function will return a dict containing values which will be logged like any other Keras metric:

{'rouge1': 37.4199, 'rouge2': 13.9768, 'rougeL': 34.361, 'rougeLsum': 35.0781}
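As a lighter-weight illustration of the metric_fn contract described above (NumPy arrays in, a dict of metric names to values out), the following sketch computes accuracy from classification logits using only NumPy. The name accuracy_fn, the array shapes, and the sample data are hypothetical. The sketch mirrors the two-argument signature of rouge_fn on this page; some versions of the library may instead pass a single (predictions, labels) tuple, so check what your installed version expects.

```python
import numpy as np


def accuracy_fn(predictions, labels):
    """Return a dict mapping a metric name to its value."""
    # Assumption: `predictions` holds logits of shape (num_examples, num_classes)
    # and `labels` holds integer class ids of shape (num_examples,).
    predicted_classes = np.argmax(predictions, axis=-1)
    accuracy = float(np.mean(predicted_classes == labels))
    return {"accuracy": accuracy * 100}


# Tiny made-up batch: three of the four examples are classified correctly.
logits = np.array([[2.0, 0.1], [0.3, 1.5], [1.2, 0.4], [0.2, 0.9]])
labels = np.array([0, 1, 1, 1])
scores = accuracy_fn(logits, labels)
```

A function like this would be passed as metric_fn=accuracy_fn, together with an eval_dataset, when constructing KerasMetricCallback, and the resulting callback would be added to the callbacks list of model.fit.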

class transformers.PushToHubCallback

( output_dir: typing.Union[str, pathlib.Path] save_strategy: typing.Union[str, transformers.trainer_utils.IntervalStrategy] = 'epoch' save_steps: typing.Optional[int] = None tokenizer: typing.Optional[transformers.tokenization_utils_base.PreTrainedTokenizerBase] = None hub_model_id: typing.Optional[str] = None hub_token: typing.Optional[str] = None checkpoint: bool = False **model_card_args )

Parameters

Callback that saves the model and pushes it to the Hub at regular intervals. By default, it pushes once per epoch, but this can be changed with the save_strategy argument. Pushed models can be accessed like any other model on the Hub, for example with the from_pretrained method.

from transformers.keras_callbacks import PushToHubCallback

push_to_hub_callback = PushToHubCallback(
    output_dir="./model_save",
    tokenizer=tokenizer,
    hub_model_id="gpt5-7xlarge",
)

model.fit(train_dataset, callbacks=[push_to_hub_callback])
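To illustrate changing the push frequency with save_strategy, here is a minimal sketch that assembles keyword arguments for a step-based schedule. It uses only parameter names that appear in the signature above. The helper step_push_kwargs, the repository name "my-user/my-model", and the value 500 are hypothetical, and the positive-integer check is a sanity check of this sketch rather than a statement about the library's own validation. It also assumes "steps" is an accepted save_strategy value.

```python
def step_push_kwargs(output_dir, save_steps, hub_model_id=None):
    """Build keyword arguments for a callback that pushes every `save_steps` steps."""
    # Assumption of this sketch: a step-based schedule needs a positive step count.
    if not isinstance(save_steps, int) or save_steps <= 0:
        raise ValueError("save_steps must be a positive integer when save_strategy='steps'")
    return {
        "output_dir": output_dir,
        "save_strategy": "steps",  # instead of the default 'epoch'
        "save_steps": save_steps,
        "hub_model_id": hub_model_id,
    }


kwargs = step_push_kwargs("./model_save", save_steps=500, hub_model_id="my-user/my-model")

# With TensorFlow and transformers installed, and after logging in to the Hub,
# the arguments could be used like this (not executed here):
#   from transformers.keras_callbacks import PushToHubCallback
#   callback = PushToHubCallback(tokenizer=tokenizer, **kwargs)
#   model.fit(train_dataset, callbacks=[callback])
```

Keeping the arguments in a plain dict makes the schedule easy to inspect or log before any network access happens.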