Runner — mmengine 0.10.7 documentation

build_train_loop(loop)[source]¶

Build training loop.

Examples of loop:

`EpochBasedTrainLoop` will be used

loop = dict(by_epoch=True, max_epochs=3)

`IterBasedTrainLoop` will be used

loop = dict(by_epoch=False, max_iters=3)

custom training loop

loop = dict(type='CustomTrainLoop', max_epochs=3)
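
The selection rule implied by the three examples can be sketched in plain Python. This is only an illustration and not mmengine's implementation: select_train_loop is a hypothetical helper, and the rule that an explicit type takes precedence over by_epoch is an assumption.

```python
def select_train_loop(loop: dict) -> str:
    """Return the name of the loop class a config dict would select.

    Hypothetical helper for illustration; it mirrors the three documented
    examples and does not build a real loop object.
    """
    if 'type' in loop:
        # A custom loop is named explicitly (assumed to take precedence).
        return loop['type']
    if loop['by_epoch']:
        return 'EpochBasedTrainLoop'
    return 'IterBasedTrainLoop'


print(select_train_loop(dict(by_epoch=True, max_epochs=3)))
print(select_train_loop(dict(by_epoch=False, max_iters=3)))
print(select_train_loop(dict(type='CustomTrainLoop', max_epochs=3)))
```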

Parameters:

loop (BaseLoop or dict) – A training loop or a dict to build training loop. If loop is a training loop object, just returns itself.

Returns:

Training loop object built from loop.

Return type:

BaseLoop

build_val_loop(loop)[source]¶

Build validation loop.

Examples of loop:

`ValLoop` will be used

loop = dict()

custom validation loop

loop = dict(type='CustomValLoop')

Parameters:

loop (BaseLoop or dict) – A validation loop or a dict to build validation loop. If loop is a validation loop object, just returns itself.

Returns:

Validation loop object built from loop.

Return type:

BaseLoop

build_visualizer(visualizer=None)[source]¶

Build a globally accessible Visualizer.

Parameters:

visualizer (Visualizer or dict, optional) – A Visualizer object or a dict to build Visualizer object. If visualizer is a Visualizer object, just returns itself. If not specified, default config will be used to build Visualizer object. Defaults to None.

Returns:

A Visualizer object built from visualizer.

Return type:

Visualizer

call_hook(fn_name, **kwargs)[source]¶

Call all hooks.

Parameters:

fn_name (str) – The function name in each hook to be called, such as “before_train_epoch”.
**kwargs – Keyword arguments passed to hook.

Return type:

None
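
The dispatch described above can be sketched with a toy runner. Both classes below are hypothetical stand-ins, not mmengine classes, and passing the runner as the first argument to each hook method is an assumption.

```python
class RecordingHook:
    """Hypothetical hook that records the calls it receives."""

    def __init__(self):
        self.calls = []

    def before_train_epoch(self, runner, **kwargs):
        self.calls.append(('before_train_epoch', kwargs))


class MiniRunner:
    """Toy stand-in for a runner; not the mmengine Runner."""

    def __init__(self, hooks):
        self.hooks = list(hooks)

    def call_hook(self, fn_name, **kwargs):
        # Invoke the method named `fn_name` on every hook that defines it.
        for hook in self.hooks:
            method = getattr(hook, fn_name, None)
            if callable(method):
                method(self, **kwargs)


hook = RecordingHook()
runner = MiniRunner([hook])
runner.call_hook('before_train_epoch')
runner.call_hook('after_train_epoch')  # skipped: RecordingHook does not define it
print(hook.calls)
```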

property deterministic¶

Whether cuDNN selects deterministic algorithms.

Type:

bool

property distributed¶

Whether current environment is distributed.

Type:

bool

dump_config()[source]¶

Dump config to work_dir.

Return type:

None

property epoch¶

Current epoch.

Type:

int

property experiment_name¶

Name of experiment.

Type:

str

classmethod from_cfg(cfg)[source]¶

Build a runner from config.

Parameters:

cfg (ConfigType) – A config used for building the runner. For the supported keys of cfg, see __init__().

Returns:

A runner built from cfg.

Return type:

Runner

property hooks¶

A list of registered hooks.

Type:

List[Hook]

property iter¶

Current iteration.

Type:

int

property launcher¶

The way to launch multiple processes.

Type:

str

load_checkpoint(filename, map_location='cpu', strict=False, revise_keys=[('^module.', '')])[source]¶

Load checkpoint from given filename.

Parameters:

filename (str) – Accepts a local filepath, a URL, torchvision://xxx, or open-mmlab://xxx.
map_location (str or callable) – A string or a callable specifying how to remap storage locations. Defaults to 'cpu'.
strict (bool) – Whether to allow different params for the model and checkpoint. Defaults to False.
revise_keys (list) – A list of customized keywords to modify the state_dict in the checkpoint. Each item is a (pattern, replacement) pair for a regular expression substitution. Defaults to stripping the prefix 'module.' with [(r'^module.', '')].
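
How a (pattern, replacement) list rewrites checkpoint keys can be sketched with the standard re module. The helper name revise_state_dict_keys and the integer values standing in for tensors are illustrative assumptions; this is not the library's loading code.

```python
import re
from collections import OrderedDict


def revise_state_dict_keys(state_dict, revise_keys=((r'^module.', ''),)):
    """Apply each (pattern, replacement) pair to every key with re.sub."""
    for pattern, replacement in revise_keys:
        state_dict = OrderedDict(
            (re.sub(pattern, replacement, key), value)
            for key, value in state_dict.items())
    return state_dict


# Keys as they might look when the saved model was wrapped in a parallel module.
checkpoint = OrderedDict([('module.conv.weight', 1), ('module.fc.bias', 2)])
print(list(revise_state_dict_keys(checkpoint)))
```

Keys that do not match any pattern pass through unchanged, so the default is harmless for checkpoints saved without the prefix.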

load_or_resume()[source]¶

Load or resume checkpoint.

Return type:

None

property max_epochs¶

Total epochs to train model.

Type:

int

property max_iters¶

Total iterations to train model.

Type:

int

property model_name¶

Name of the model, usually the module class name.

Type:

str

property rank¶

Rank of current process.

Type:

int

register_custom_hooks(hooks)[source]¶

Parameters:

hooks (list[Hook | dict]) – List of hooks or configs to be registered.

Return type:

None

register_default_hooks(hooks=None)[source]¶

hooks will be registered into runner to execute some default actions like updating model parameters or saving checkpoints.

Default hooks and their priorities:

Hooks	Priority
RuntimeInfoHook	VERY_HIGH (10)
IterTimerHook	NORMAL (50)
DistSamplerSeedHook	NORMAL (50)
LoggerHook	BELOW_NORMAL (60)
ParamSchedulerHook	LOW (70)
CheckpointHook	VERY_LOW (90)

If hooks is None, the hooks above are registered by default:

default_hooks = dict(
    runtime_info=dict(type='RuntimeInfoHook'),
    timer=dict(type='IterTimerHook'),
    sampler_seed=dict(type='DistSamplerSeedHook'),
    logger=dict(type='LoggerHook'),
    param_scheduler=dict(type='ParamSchedulerHook'),
    checkpoint=dict(type='CheckpointHook', interval=1),
)

If hooks is not None, it is merged into default_hooks. Any item whose value is None is popped from default_hooks. For example, with hooks = dict(timer=None), the final registered default hooks are RuntimeInfoHook, DistSamplerSeedHook, LoggerHook, ParamSchedulerHook and CheckpointHook.
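
The merge rule can be sketched in a few lines. merge_default_hooks is a hypothetical helper, and the choice that a non-None user entry replaces the default entry wholesale is an assumption; the input dict(timer=None) is picked because it yields the five hooks listed above.

```python
def merge_default_hooks(hooks=None):
    """Merge user hooks into the documented defaults; None removes an entry."""
    default_hooks = dict(
        runtime_info=dict(type='RuntimeInfoHook'),
        timer=dict(type='IterTimerHook'),
        sampler_seed=dict(type='DistSamplerSeedHook'),
        logger=dict(type='LoggerHook'),
        param_scheduler=dict(type='ParamSchedulerHook'),
        checkpoint=dict(type='CheckpointHook', interval=1),
    )
    if hooks is not None:
        for name, hook in hooks.items():
            if hook is None:
                # A None value pops the corresponding default hook.
                default_hooks.pop(name, None)
            else:
                # Assumption: a user entry replaces the default wholesale.
                default_hooks[name] = hook
    return default_hooks


merged = merge_default_hooks(dict(timer=None))
print([cfg['type'] for cfg in merged.values()])
```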

Parameters:

hooks (dict[str, Hook or dict], optional) – Default hooks or configs to be registered.

Return type:

None

register_hook(hook, priority=None)[source]¶

The hook will be inserted into a priority queue, with the specified priority (See Priority for details of priorities). For hooks with the same priority, they will be triggered in the same order as they are registered.

The priority of the hook is decided in the following order:

The priority argument. If priority is given, it is the priority of the hook.
If the hook argument is a dict that contains priority, the priority is the value of hook['priority'].
If the hook argument is a dict without priority, or hook is an instance of Hook, the priority is hook.priority.
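
The ordering rules can be sketched with a small list-based queue. Everything here is illustrative: register and resolve_priority are hypothetical helpers, the numeric levels are copied from the default-hooks table, and falling back to NORMAL for a dict without priority stands in for the built hook's own priority attribute.

```python
PRIORITIES = {'VERY_HIGH': 10, 'NORMAL': 50, 'BELOW_NORMAL': 60,
              'LOW': 70, 'VERY_LOW': 90}


def resolve_priority(hook, priority=None):
    """Pick the priority using the three documented rules, in order."""
    if priority is None:
        if isinstance(hook, dict) and 'priority' in hook:
            priority = hook['priority']
        elif isinstance(hook, dict):
            # Assumption: stands in for the built hook's `priority` attribute.
            priority = 'NORMAL'
        else:
            priority = hook.priority
    return PRIORITIES[priority] if isinstance(priority, str) else int(priority)


def register(queue, name, hook, priority=None):
    """Insert so lower values run first and ties keep registration order."""
    value = resolve_priority(hook, priority)
    index = len(queue)
    # Walk left past entries with a strictly larger value only,
    # so equal-priority hooks stay in registration order.
    while index > 0 and queue[index - 1][0] > value:
        index -= 1
    queue.insert(index, (value, name))


queue = []
register(queue, 'checkpoint', dict(type='CheckpointHook'), 'VERY_LOW')
register(queue, 'timer', dict(type='IterTimerHook'))
register(queue, 'sampler_seed', dict(type='DistSamplerSeedHook'))
register(queue, 'runtime_info', dict(type='RuntimeInfoHook', priority=10))
print([name for _, name in queue])
```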

Parameters:

hook (Hook or dict) – The hook to be registered.
priority (int or str or Priority, optional) – Hook priority. Lower value means higher priority.

Return type:

None

register_hooks(default_hooks=None, custom_hooks=None)[source]¶

Parameters:

default_hooks (dict[str, dict] or dict[str, Hook], optional) – Hooks to execute default actions like updating model parameters and saving checkpoints. Defaults to None.
custom_hooks (list[dict] or list[Hook], optional) – Hooks to execute custom actions like visualizing images processed by pipeline. Defaults to None.

Return type:

None

resume(filename, resume_optimizer=True, resume_param_scheduler=True, map_location='default')[source]¶

Resume model from checkpoint.

Parameters:

filename (str) – Accepts a local filepath, a URL, torchvision://xxx, or open-mmlab://xxx.
resume_optimizer (bool) – Whether to resume optimizer state. Defaults to True.
resume_param_scheduler (bool) – Whether to resume param scheduler state. Defaults to True.
map_location (str or callable) – A string or a callable specifying how to remap storage locations. Defaults to 'default'.

Return type:

None

save_checkpoint(out_dir, filename, file_client_args=None, save_optimizer=True, save_param_scheduler=True, meta=None, by_epoch=True, backend_args=None)[source]¶

Save checkpoints.

CheckpointHook invokes this method to save checkpoints periodically.

Parameters:

out_dir (str) – The directory in which checkpoints are saved.
filename (str) – The checkpoint filename.
file_client_args (dict, optional) – Arguments to instantiate a FileClient. See mmengine.fileio.FileClient for details. Defaults to None. It will be deprecated in the future; use backend_args instead.
save_optimizer (bool) – Whether to save the optimizer to the checkpoint. Defaults to True.
save_param_scheduler (bool) – Whether to save the param_scheduler to the checkpoint. Defaults to True.
meta (dict, optional) – The meta information to be saved in the checkpoint. Defaults to None.
by_epoch (bool) – Decides the epoch or iteration number saved in the checkpoint. Defaults to True.
backend_args (dict, optional) – Arguments to instantiate the backend corresponding to the URI prefix. Defaults to None. New in v0.2.0.

scale_lr(optim_wrapper, auto_scale_lr=None)[source]¶

Automatically scale the learning rate in training according to the ratio of the real batch size to base_batch_size in auto_scale_lr.

It scales the learning rate linearly according to the paper.

Note

scale_lr must be called after building optimizer wrappers and before building parameter schedulers.

Parameters:

optim_wrapper (OptimWrapper) – An OptimWrapper object whose parameter groups' learning rates need to be scaled.
auto_scale_lr (dict, optional) – Config to scale the learning rate automatically. It includes base_batch_size and enable. base_batch_size is the batch size that the optimizer lr is based on. enable is the switch to turn the feature on and off.

Return type:

None

property seed¶

A number to set random modules.

Type:

int

set_randomness(seed, diff_rank_seed=False, deterministic=False)[source]¶

Set random seed to guarantee reproducible results.

Parameters:

seed (int) – A number to set random modules.
diff_rank_seed (bool) – Whether to set different seeds according to global rank. Defaults to False.
deterministic (bool) – Whether to set the deterministic option for the cuDNN backend, i.e., set torch.backends.cudnn.deterministic to True and torch.backends.cudnn.benchmark to False. Defaults to False. See https://pytorch.org/docs/stable/notes/randomness.html for more details.
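
The effect of diff_rank_seed can be sketched with the standard random module alone. The helper names are hypothetical, and deriving the per-rank seed as seed + rank is an assumption made for illustration; the cuDNN flags are not touched here.

```python
import random


def seed_for_rank(seed, rank, diff_rank_seed=False):
    """Return the seed a process would use.

    Assumption: with diff_rank_seed=True the rank is added to the base seed.
    """
    return seed + rank if diff_rank_seed else seed


def draw(seed, rank, diff_rank_seed):
    """Draw three numbers from a generator seeded for the given rank."""
    rng = random.Random(seed_for_rank(seed, rank, diff_rank_seed))
    return [rng.randint(0, 99) for _ in range(3)]


# Same seed on every rank: the streams are identical.
print(draw(0, rank=0, diff_rank_seed=False) == draw(0, rank=1, diff_rank_seed=False))
# Rank-dependent seeds: each rank gets its own stream.
print(draw(0, rank=0, diff_rank_seed=True), draw(0, rank=1, diff_rank_seed=True))
```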

Return type:

None

setup_env(env_cfg)[source]¶

Set up the environment.

An example of env_cfg:

env_cfg = dict(
    cudnn_benchmark=True,
    mp_cfg=dict(
        mp_start_method='fork',
        opencv_num_threads=0
    ),
    dist_cfg=dict(backend='nccl', timeout=1800),
    resource_limit=4096
)

Parameters:

env_cfg (dict) – Config for setting environment.

Return type:

None

test()[source]¶

Launch test.

Returns:

A dict of metrics on testing set.

Return type:

dict

property test_dataloader¶

The data loader for testing.

property test_evaluator¶

An evaluator for testing.

Type:

Evaluator

property test_loop¶

A loop to run testing.

Type:

BaseLoop

property timestamp¶

Timestamp when creating experiment.

Type:

str

train()[source]¶

Launch training.

Returns:

The model after training.

Return type:

nn.Module

property train_dataloader¶

The data loader for training.

property train_loop¶

A loop to run training.

Type:

BaseLoop

val()[source]¶

Launch validation.

Returns:

A dict of metrics on validation set.

Return type:

dict

property val_begin¶

The epoch/iteration to start running validation during training.

Type:

int

property val_dataloader¶

The data loader for validation.

property val_evaluator¶

An evaluator for validation.

Type:

Evaluator

property val_interval¶

Interval to run validation during training.

Type:

int

property val_loop¶

A loop to run validation.

Type:

BaseLoop

property work_dir¶

The working directory to save checkpoints and logs.

Type:

str

property world_size¶

Number of processes participating in the job.

Type:

int