ray.tune.Tuner.restore — Ray 2.45.0 (original) (raw)

classmethod Tuner.restore(path: str, trainable: str | Callable | Type[Trainable] | BaseTrainer, resume_unfinished: bool = True, resume_errored: bool = False, restart_errored: bool = False, param_space: Dict[str, Any] | None = None, storage_filesystem: pyarrow.fs.FileSystem | None = None, _resume_config: ResumeConfig | None = None) → Tuner[source]#

Restores Tuner after a previously failed run.

All trials from the existing run will be added to the result table. The argument flags control how existing but unfinished or errored trials are resumed.

Finished trials are always added to the overview table. They will not be resumed.

Unfinished trials can be controlled with the resume_unfinished flag. If True (default), they will be continued. If False, they will be added as terminated trials (even if they were only created and never trained).

Errored trials can be controlled with the resume_errored andrestart_errored flags. The former will resume errored trials from their latest checkpoints. The latter will restart errored trials from scratch and prevent loading their last checkpoints.

Note

Restoring an experiment from a path that’s pointing to a _different_location than the original experiment path is supported. However, Ray Tune assumes that the full experiment directory is available (including checkpoints) so that it’s possible to resume trials from their latest state.

For example, if the original experiment path was run locally, then the results are uploaded to cloud storage, Ray Tune expects the full contents to be available in cloud storage if attempting to resume via Tuner.restore("s3://..."). The restored run will continue writing results to the same cloud storage location.

Parameters: