tf.train.CheckpointManager | TensorFlow v2.0.0

tf.train.CheckpointManager

Deletes old checkpoints.

Compat aliases for migration

See Migration guide for more details.

tf.compat.v1.train.CheckpointManager

tf.train.CheckpointManager(
    checkpoint, directory, max_to_keep, keep_checkpoint_every_n_hours=None,
    checkpoint_name='ckpt'
)

Example usage:

import tensorflow as tf

# `optimizer` and `model` are assumed to be defined elsewhere.
checkpoint = tf.train.Checkpoint(optimizer=optimizer, model=model)
manager = tf.train.CheckpointManager(
    checkpoint, directory="/tmp/model", max_to_keep=5)
# Resume from the most recent checkpoint in the directory, if there is one.
status = checkpoint.restore(manager.latest_checkpoint)
while True:
  # train
  manager.save()

CheckpointManager preserves its own state across instantiations (see the __init__ documentation for details). Only one should be active in a particular directory at a time.
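As an illustration of state being preserved across instantiations, the following minimal sketch creates one manager, saves twice, discards it, and then creates a second manager on the same directory. The temporary directory and the `counter` variable are hypothetical names chosen for the example; only `tf.train.Checkpoint` and `tf.train.CheckpointManager` come from the API described here.

```python
import tempfile

import tensorflow as tf

directory = tempfile.mkdtemp()  # hypothetical scratch directory
checkpoint = tf.train.Checkpoint(counter=tf.Variable(0))  # hypothetical state

first = tf.train.CheckpointManager(checkpoint, directory, max_to_keep=3)
first.save()
first.save()
saved = list(first.checkpoints)
del first  # only one manager should be active per directory

# A new manager on the same directory reads the "checkpoint" state file.
second = tf.train.CheckpointManager(checkpoint, directory, max_to_keep=3)
print(second.checkpoints == saved)
print(second.latest_checkpoint == saved[-1])
```

The second manager should report the same managed checkpoints as the first, because the list is recovered from the state file written to the directory.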

Args
checkpoint The tf.train.Checkpoint instance to save and manage checkpoints for.
directory The path to a directory in which to write checkpoints. A special file named "checkpoint" is also written to this directory (in a human-readable text format) which contains the state of the CheckpointManager.
max_to_keep An integer, the number of checkpoints to keep. Unless preserved by keep_checkpoint_every_n_hours, checkpoints will be deleted from the active set, oldest first, until only max_to_keep checkpoints remain. If None, no checkpoints are deleted and everything stays in the active set. Note that max_to_keep=None will keep all checkpoint paths in memory and in the checkpoint state protocol buffer on disk.
keep_checkpoint_every_n_hours Upon removal from the active set, a checkpoint will be preserved if it has been at least keep_checkpoint_every_n_hours since the last preserved checkpoint. The default setting of None does not preserve any checkpoints in this way.
checkpoint_name Custom name for the checkpoint file.
Raises
ValueError If max_to_keep is not None and is not a positive integer.
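The interaction between max_to_keep and keep_checkpoint_every_n_hours can be easier to follow with a small simulation. The sketch below is a simplified pure-Python model of the policy as described in the arguments above, not TensorFlow's implementation. The function `simulate_retention`, its parameters, and the `ckpt-N` naming are hypothetical, and it assumes the "last preserved" time starts at the moment the manager is created.

```python
from collections import OrderedDict


def simulate_retention(save_times, max_to_keep, keep_every_n_hours=None,
                       created_at=0.0):
    """Simplified model of the retention policy; all times are in hours."""
    active = OrderedDict()      # name -> save time, oldest first
    preserved, deleted = [], []
    last_preserved = created_at  # assumption: starts at manager creation
    for number, saved_at in enumerate(save_times, start=1):
        active["ckpt-%d" % number] = saved_at
        if max_to_keep is None:
            continue  # nothing ever leaves the active set
        while len(active) > max_to_keep:
            name, old_time = active.popitem(last=False)  # oldest first
            if (keep_every_n_hours is not None
                    and old_time - last_preserved >= keep_every_n_hours):
                preserved.append(name)  # kept on disk, no longer active
                last_preserved = old_time
            else:
                deleted.append(name)
    return list(active), preserved, deleted


# Six saves, one per hour; keep two active, preserve one every three hours.
active, preserved, deleted = simulate_retention(
    [1, 2, 3, 4, 5, 6], max_to_keep=2, keep_every_n_hours=3)
print(active)     # ['ckpt-5', 'ckpt-6']
print(preserved)  # ['ckpt-3']
print(deleted)    # ['ckpt-1', 'ckpt-2', 'ckpt-4']
```

In this model a preserved checkpoint leaves the active set without being deleted, which matches the note that such checkpoints do not appear in the checkpoints attribute.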
Attributes
checkpoints A list of managed checkpoints. Note that checkpoints saved due to keep_checkpoint_every_n_hours will not show up in this list (to avoid ever-growing filename lists).
latest_checkpoint The prefix of the most recent checkpoint in directory. Equivalent to tf.train.latest_checkpoint(directory) where directory is the constructor argument to CheckpointManager. Suitable for passing to tf.train.Checkpoint.restore to resume training.
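A short sketch of how the two attributes might be inspected after several saves. The temporary directory and the `weights` variable are hypothetical, and the loop stands in for real training steps.

```python
import os
import tempfile

import tensorflow as tf

directory = tempfile.mkdtemp()  # hypothetical scratch directory
checkpoint = tf.train.Checkpoint(weights=tf.Variable([1.0, 2.0]))  # hypothetical state
manager = tf.train.CheckpointManager(checkpoint, directory, max_to_keep=2)

for _ in range(5):
    manager.save()  # numbered with checkpoint.save_counter by default

# Only the newest max_to_keep checkpoints remain in the active set.
kept = [os.path.basename(path) for path in manager.checkpoints]
print(kept)

# latest_checkpoint should match the module-level helper for the same directory.
same_latest = manager.latest_checkpoint == tf.train.latest_checkpoint(directory)
print(same_latest)
```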

Methods

save

save(
    checkpoint_number=None
)

Creates a new checkpoint and manages it.

Args
checkpoint_number An optional integer, or an integer-dtype Variable or Tensor, used to number the checkpoint. If None (default), checkpoints are numbered using checkpoint.save_counter. Even if checkpoint_number is provided, save_counter is still incremented. A user-provided checkpoint_number is not incremented even if it is a Variable.
Returns
The path to the new checkpoint. It is also recorded in the checkpoints and latest_checkpoint properties.
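The numbering behaviour described for checkpoint_number can be sketched as follows, assuming eager execution. The `step` variable and the temporary directory are hypothetical; the example passes an integer-dtype Variable as the checkpoint number and then inspects the returned path, save_counter, and the variable itself.

```python
import os
import tempfile

import tensorflow as tf

directory = tempfile.mkdtemp()  # hypothetical scratch directory
step = tf.Variable(0, dtype=tf.int64)  # hypothetical training-step counter
checkpoint = tf.train.Checkpoint(step=step)
manager = tf.train.CheckpointManager(checkpoint, directory, max_to_keep=3)

step.assign(100)
path = manager.save(checkpoint_number=step)

name = os.path.basename(path)                     # prefix uses the supplied number
counter = int(checkpoint.save_counter.numpy())    # still incremented by save()
step_value = int(step.numpy())                    # not incremented by save()
print(name, counter, step_value)
```

Passing a step variable this way keeps checkpoint names aligned with training progress, while save_counter continues to count how many saves have happened.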