Timer — PyTorch Lightning 2.5.1.post0 documentation

class lightning.pytorch.callbacks.Timer(duration=None, interval=Interval.step, verbose=True)[source]

Bases: Callback

The Timer callback tracks the time spent in the training, validation, and test loops and interrupts the Trainer if the given time limit for the training loop is reached.

Parameters:

duration (Union[str, timedelta, dict[str, int], None]) – A string in the format DD:HH:MM:SS (days, hours, minutes, seconds), a datetime.timedelta, or a dict containing key-value pairs compatible with timedelta.

interval (str) – Determines whether the time check happens at the end of each epoch or after each optimization step. Can be "step" or "epoch".

verbose (bool) – Set this to False for quiet mode.

Raises:

MisconfigurationException – If interval is not one of the supported choices.

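For example, a timer that checks the limit only at epoch boundaries instead of after every optimization step (a minimal sketch; "step" and "epoch" are the two supported interval values):

timer = Timer(duration="00:12:00:00", interval="epoch")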
Example:

from lightning.pytorch import Trainer
from lightning.pytorch.callbacks import Timer

# stop training after 12 hours
timer = Timer(duration="00:12:00:00")

# or provide a datetime.timedelta
from datetime import timedelta
timer = Timer(duration=timedelta(weeks=1))

# or provide a dictionary
timer = Timer(duration=dict(weeks=4, days=2))

# force training to stop after given time limit
trainer = Trainer(callbacks=[timer])

# query training/validation/test time (in seconds)
timer.time_elapsed("train")
timer.start_time("validate")
timer.end_time("test")
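A hedged end-to-end sketch of the full workflow; model is a placeholder for any LightningModule, and the 30-second budget is illustrative:

from lightning.pytorch import Trainer
from lightning.pytorch.callbacks import Timer

timer = Timer(duration="00:00:00:30")  # illustrative 30-second budget
trainer = Trainer(callbacks=[timer], max_epochs=100)
trainer.fit(model)  # `model` is a placeholder for any LightningModule
print(f"train loop took {timer.time_elapsed('train'):.1f}s")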

end_time(stage=RunningStage.TRAINING)[source]

Return the end time of a particular stage (in seconds).

Return type:

Optional[float]

load_state_dict(state_dict)[source]

Called when loading a checkpoint. Implement to reload the callback state from the given state_dict.

Parameters:

state_dict (dict[str, Any]) – the callback state returned by state_dict.

Return type:

None
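A minimal round-trip sketch using state_dict() (documented below); the contents of the dictionary are an implementation detail:

timer = Timer(duration="00:12:00:00")
state = timer.state_dict()  # plain dict; exact keys are an implementation detail

restored = Timer(duration="00:12:00:00")
restored.load_state_dict(state)  # resumes the previously elapsed time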

on_fit_start(trainer, *args, **kwargs)[source]

Called when fit begins.

Return type:

None

on_test_end(trainer, pl_module)[source]

Called when the test ends.

Return type:

None

on_test_start(trainer, pl_module)[source]

Called when the test begins.

Return type:

None

on_train_batch_end(trainer, *args, **kwargs)[source]

Called when the train batch ends.

Return type:

None

Note

The value outputs["loss"] here will be the normalized value w.r.t accumulate_grad_batches of the loss returned from training_step.
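A minimal sketch of reading this normalized value from a callback; PrintLossCallback is illustrative and not part of the library:

import lightning as L

class PrintLossCallback(L.Callback):
    def on_train_batch_end(self, trainer, pl_module, outputs, batch, batch_idx):
        # with accumulate_grad_batches=4, outputs["loss"] is the loss
        # returned from training_step divided by 4
        print(outputs["loss"])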

on_train_end(trainer, pl_module)[source]

Called when the train ends.

Return type:

None

on_train_epoch_end(trainer, *args, **kwargs)[source]

Called when the train epoch ends.

To access all batch outputs at the end of the epoch, you can cache step outputs as an attribute of the lightning.pytorch.core.LightningModule and access them in this hook:

class MyLightningModule(L.LightningModule):
    def __init__(self):
        super().__init__()
        self.training_step_outputs = []

    def training_step(self):
        loss = ...
        self.training_step_outputs.append(loss)
        return loss


class MyCallback(L.Callback):
    def on_train_epoch_end(self, trainer, pl_module):
        # do something with all training_step outputs, for example:
        epoch_mean = torch.stack(pl_module.training_step_outputs).mean()
        pl_module.log("training_epoch_mean", epoch_mean)
        # free up the memory
        pl_module.training_step_outputs.clear()

Return type:

None

on_train_start(trainer, pl_module)[source]

Called when the train begins.

Return type:

None

on_validation_end(trainer, pl_module)[source]

Called when the validation loop ends.

Return type:

None

on_validation_start(trainer, pl_module)[source]

Called when the validation loop begins.

Return type:

None

start_time(stage=RunningStage.TRAINING)[source]

Return the start time of a particular stage (in seconds).

Return type:

Optional[float]

state_dict()[source]

Called when saving a checkpoint. Implement to generate the callback's state_dict.

Return type:

dict[str, Any]

Returns:

A dictionary containing callback state.
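The returned state is a plain dictionary, so it can also be persisted on its own; a minimal sketch (the file name is illustrative, and Lightning already saves callback state inside Trainer checkpoints automatically):

import torch
from lightning.pytorch.callbacks import Timer

timer = Timer(duration="00:12:00:00")
torch.save(timer.state_dict(), "timer_state.pt")  # illustrative path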

time_elapsed(stage=RunningStage.TRAINING)[source]

Return the time elapsed for a particular stage (in seconds).

Return type:

float

time_remaining(stage=RunningStage.TRAINING)[source]

Return the time remaining for a particular stage (in seconds).

Return type:

Optional[float]
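A minimal sketch of reporting progress against the time budget, given a timer attached to a running Trainer as in the example above; time_remaining() returns None when no duration was configured:

elapsed = timer.time_elapsed("train")      # seconds spent in the train loop
remaining = timer.time_remaining("train")  # None if no duration was set
if remaining is not None:
    print(f"{elapsed:.0f}s elapsed, {remaining:.0f}s remaining")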