Runner — mmengine 0.10.7 documentation

class mmengine.runner.Runner(model, work_dir, train_dataloader=None, val_dataloader=None, test_dataloader=None, train_cfg=None, val_cfg=None, test_cfg=None, auto_scale_lr=None, optim_wrapper=None, param_scheduler=None, val_evaluator=None, test_evaluator=None, default_hooks=None, custom_hooks=None, data_preprocessor=None, load_from=None, resume=False, launcher='none', env_cfg={'dist_cfg': {'backend': 'nccl'}}, log_processor=None, log_level='INFO', visualizer=None, default_scope='mmengine', randomness={'seed': None}, experiment_name=None, cfg=None)[source]

A training helper for PyTorch.

A Runner object can be built from a config by runner = Runner.from_cfg(cfg), where cfg usually contains training, validation, and test-related configurations used to build the corresponding components. The same config is usually used to launch training, testing, and validation tasks, yet only some of these components are necessary at the same time, e.g., testing a model does not need training or validation-related components.

To avoid repeatedly modifying the config, the construction of Runner adopts lazy initialization and only initializes components when they are about to be used. Therefore, the model is always initialized at the beginning, while the training, validation, and testing-related components are only initialized when calling runner.train(), runner.val(), and runner.test(), respectively.
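For instance (a minimal sketch, assuming cfg is a complete config such as the one in the Examples below):

from mmengine.runner import Runner

runner = Runner.from_cfg(cfg)  # only the model is built at this point
runner.train()                 # training components are built on first use
metrics = runner.test()        # test dataloader and evaluator are built here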

Parameters:

Note

Since PyTorch 2.0.0, you can enable torch.compile by passing cfg.compile = True. If you want to control the compile options, you can pass a dict instead, e.g. cfg.compile = dict(backend='eager'). Refer to the PyTorch API documentation for more valid options.
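For example, either form below can be set on the config before building the runner:

cfg.compile = True                    # enable torch.compile with defaults
cfg.compile = dict(backend='eager')   # or control options with a dict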

Examples

from mmengine.runner import Runner
cfg = dict(
    model=dict(type='ToyModel'),
    work_dir='path/of/work_dir',
    train_dataloader=dict(
        dataset=dict(type='ToyDataset'),
        sampler=dict(type='DefaultSampler', shuffle=True),
        batch_size=1,
        num_workers=0),
    val_dataloader=dict(
        dataset=dict(type='ToyDataset'),
        sampler=dict(type='DefaultSampler', shuffle=False),
        batch_size=1,
        num_workers=0),
    test_dataloader=dict(
        dataset=dict(type='ToyDataset'),
        sampler=dict(type='DefaultSampler', shuffle=False),
        batch_size=1,
        num_workers=0),
    auto_scale_lr=dict(base_batch_size=16, enable=False),
    optim_wrapper=dict(
        type='OptimWrapper',
        optimizer=dict(type='SGD', lr=0.01)),
    param_scheduler=dict(type='MultiStepLR', milestones=[1, 2]),
    val_evaluator=dict(type='ToyEvaluator'),
    test_evaluator=dict(type='ToyEvaluator'),
    train_cfg=dict(by_epoch=True, max_epochs=3, val_interval=1),
    val_cfg=dict(),
    test_cfg=dict(),
    custom_hooks=[],
    default_hooks=dict(
        timer=dict(type='IterTimerHook'),
        checkpoint=dict(type='CheckpointHook', interval=1),
        logger=dict(type='LoggerHook'),
        optimizer=dict(type='OptimizerHook', grad_clip=False),
        param_scheduler=dict(type='ParamSchedulerHook')),
    launcher='none',
    env_cfg=dict(dist_cfg=dict(backend='nccl')),
    log_processor=dict(window_size=20),
    visualizer=dict(
        type='Visualizer',
        vis_backends=[dict(type='LocalVisBackend', save_dir='temp_dir')]))
runner = Runner.from_cfg(cfg)
runner.train()
runner.test()

static build_dataloader(dataloader, seed=None, diff_rank_seed=False)[source]

Build dataloader.

The method builds three components: the dataset, the sampler, and the DataLoader itself.

An example of dataloader:

dataloader = dict(
    dataset=dict(type='ToyDataset'),
    sampler=dict(type='DefaultSampler', shuffle=True),
    batch_size=1,
    num_workers=9
)
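A usage sketch (ToyDataset and DefaultSampler are assumed to be registered components); since build_dataloader is a static method, it can be called without a Runner instance:

from mmengine.runner import Runner

dataloader_cfg = dict(
    dataset=dict(type='ToyDataset'),
    sampler=dict(type='DefaultSampler', shuffle=True),
    batch_size=1,
    num_workers=0)
# a fixed seed makes shuffling reproducible across runs
train_loader = Runner.build_dataloader(dataloader_cfg, seed=2023)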

Parameters:

Returns:

DataLoader built from dataloader.

Return type:

DataLoader

build_evaluator(evaluator)[source]

Build evaluator.

Examples of evaluator:

# evaluator could be a built Evaluator instance
evaluator = Evaluator(metrics=[ToyMetric()])

# evaluator can also be a list of dicts
evaluator = [
    dict(type='ToyMetric1'),
    dict(type='ToyMetric2')
]

# evaluator can also be a list of built metrics
evaluator = [ToyMetric1(), ToyMetric2()]

# evaluator can also be a dict with the key metrics
evaluator = dict(metrics=ToyMetric())

# metrics can also be a list
evaluator = dict(metrics=[ToyMetric()])
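A usage sketch (assuming ToyMetric is a registered metric and runner is an already-built Runner):

evaluator = runner.build_evaluator(dict(metrics=[dict(type='ToyMetric')]))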

Parameters:

evaluator (Evaluator or dict or list) – An Evaluator object, or a config dict or list of config dicts used to build an Evaluator.

Returns:

Evaluator built from evaluator.

Return type:

Evaluator

build_log_processor(log_processor)[source]

Build log processor.

Examples of log_processor:

# LogProcessor will be used
log_processor = dict()

# custom log_processor
log_processor = dict(type='CustomLogProcessor')
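A usage sketch (window_size=20 mirrors the Runner example above; logged values are then smoothed over the last 20 iterations):

log_processor = runner.build_log_processor(dict(window_size=20))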

Parameters:

Returns:

Log processor object built from log_processor.

Return type:

LogProcessor

build_logger(log_level='INFO', log_file=None, **kwargs)[source]

Build a globally accessible MMLogger.

Parameters:

Returns:

An MMLogger object built from log_level and log_file.

Return type:

MMLogger

build_message_hub(message_hub=None)[source]

Build a globally accessible MessageHub.

Parameters:

message_hub (dict, optional) – A dict to build MessageHub object. If not specified, default config will be used to build MessageHub object. Defaults to None.

Returns:

A MessageHub object built from message_hub.

Return type:

MessageHub

build_model(model)[source]

Build model.

If model is a dict, it will be used to build an nn.Module object; if model is already an nn.Module object, it will be returned directly.

An example of model:

model = dict(type='ResNet')

Parameters:

model (nn.Module or dict) – An nn.Module object or a dict used to build an nn.Module object. If model is an nn.Module object, it is returned directly.

Note

The returned model must implement train_step and test_step if runner.train or runner.test will be called. If runner.val will be called or val_cfg is configured, the model must also implement val_step.
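In practice, models are typically derived from mmengine.model.BaseModel, which already provides train_step, val_step, and test_step on top of a user-defined forward. A minimal sketch (ToyModel is illustrative):

import torch.nn as nn
from mmengine.model import BaseModel

class ToyModel(BaseModel):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(2, 1)

    def forward(self, inputs, data_samples=None, mode='tensor'):
        outputs = self.linear(inputs)
        if mode == 'loss':
            # train_step expects a dict of losses in 'loss' mode
            return dict(loss=outputs.abs().mean())
        return outputs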

Returns:

Model built from model.

Return type:

nn.Module

build_optim_wrapper(optim_wrapper)[source]

Build optimizer wrapper.

If optim_wrapper is a config dict for a single optimizer, the keys must contain optimizer, and type is optional. It will build an OptimWrapper by default.

If optim_wrapper is a config dict for multiple optimizers, i.e., it has multiple keys and each key corresponds to one optimizer wrapper, the constructor must be specified, since DefaultOptimizerConstructor cannot handle the construction of multiple optimizers.

If optim_wrapper is a dict of pre-built optimizer wrappers, i.e., each value of optim_wrapper is an OptimWrapper instance, build_optim_wrapper will directly build an OptimWrapperDict instance from optim_wrapper.

Parameters:

optim_wrapper (OptimWrapper or dict) – An OptimWrapper object or a dict used to build OptimWrapper objects. If optim_wrapper is an OptimWrapper instance, it is returned directly.

Return type:

OptimWrapper | OptimWrapperDict

Note

For single-optimizer training, if optim_wrapper is a config dict, type is optional (defaults to OptimWrapper) and the dict must contain optimizer to build the corresponding optimizer.

Examples

build an optimizer

optim_wrapper_cfg = dict(
    type='OptimWrapper',
    optimizer=dict(type='SGD', lr=0.01))

optim_wrapper_cfg = dict(optimizer=dict(type='SGD', lr=0.01))

is also valid.

optim_wrapper = runner.build_optim_wrapper(optim_wrapper_cfg)
optim_wrapper
Type: OptimWrapper
accumulative_counts: 1
optimizer:
SGD (
Parameter Group 0
    dampening: 0
    lr: 0.01
    momentum: 0
    nesterov: False
    weight_decay: 0
)

build optimizer without type

optim_wrapper_cfg = dict(optimizer=dict(type='SGD', lr=0.01))
optim_wrapper = runner.build_optim_wrapper(optim_wrapper_cfg)
optim_wrapper
Type: OptimWrapper
accumulative_counts: 1
optimizer:
SGD (
Parameter Group 0
    dampening: 0
    lr: 0.01
    maximize: False
    momentum: 0
    nesterov: False
    weight_decay: 0
)

build multiple optimizers

optim_wrapper_cfg = dict(
    generator=dict(type='OptimWrapper', optimizer=dict(
        type='SGD', lr=0.01)),
    discriminator=dict(type='OptimWrapper', optimizer=dict(
        type='Adam', lr=0.001)),
    # need to customize a multiple optimizer constructor
    constructor='CustomMultiOptimizerConstructor')
optim_wrapper = runner.build_optim_wrapper(optim_wrapper_cfg)
optim_wrapper
name: generator
Type: OptimWrapper
accumulative_counts: 1
optimizer:
SGD (
Parameter Group 0
    dampening: 0
    lr: 0.01
    momentum: 0
    nesterov: False
    weight_decay: 0
)
name: discriminator
Type: OptimWrapper
accumulative_counts: 1
optimizer:
Adam (
Parameter Group 0
    amsgrad: False
    betas: (0.9, 0.999)
    eps: 1e-08
    lr: 0.001
    weight_decay: 0
)

Important

If you need to build multiple optimizers, you should implement a MultiOptimWrapperConstructor which passes the parameters to the corresponding optimizers and composes the OptimWrapperDict. More details about how to customize an OptimizerConstructor can be found at optimizer-docs.

Returns:

Optimizer wrapper built from optim_wrapper.

Return type:

OptimWrapper or OptimWrapperDict

build_param_scheduler(scheduler)[source]

Build parameter schedulers.

build_param_scheduler should be called after build_optim_wrapper because the building logic changes according to the number of optimizers built by the runner: with a single optimizer wrapper, a list of parameter schedulers is built and returned; with an OptimWrapperDict holding multiple optimizers, scheduler should be a dict whose keys match the optimizer names, and a dict of scheduler lists is returned.

Parameters:

scheduler (_ParamScheduler or dict or list) – A Param Scheduler object or a dict or list of dict to build parameter schedulers.

Return type:

List[_ParamScheduler] | Dict[str, List[_ParamScheduler]]

Examples

build one scheduler

optim_cfg = dict(optimizer=dict(type='SGD', lr=0.01))
runner.optim_wrapper = runner.build_optim_wrapper(optim_cfg)
scheduler_cfg = dict(type='MultiStepLR', milestones=[1, 2])
schedulers = runner.build_param_scheduler(scheduler_cfg)
schedulers
[<mmengine.optim.scheduler.lr_scheduler.MultiStepLR at 0x7f70f6966290>]

build multiple schedulers

scheduler_cfg = [
    dict(type='MultiStepLR', milestones=[1, 2]),
    dict(type='StepLR', step_size=1)
]
schedulers = runner.build_param_scheduler(scheduler_cfg)
schedulers
[<mmengine.optim.scheduler.lr_scheduler.MultiStepLR at 0x7f70f60dd3d0>,
 <mmengine.optim.scheduler.lr_scheduler.StepLR at 0x7f70f6eb6150>]

The examples above only cover the case of one optimizer with one or more schedulers. To learn how to set parameter schedulers when using multiple optimizers, see the sketch below and the examples in optimizer-docs.
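A sketch of the multiple-optimizer case (the keys generator and discriminator are assumed and must match the keys of the OptimWrapperDict built earlier):

scheduler_cfg = dict(
    generator=dict(type='MultiStepLR', milestones=[1, 2]),
    discriminator=dict(type='StepLR', step_size=1))
schedulers = runner.build_param_scheduler(scheduler_cfg)
# schedulers is a dict: {'generator': [...], 'discriminator': [...]}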

Returns:

A list of parameter schedulers, or a dictionary containing lists of parameter schedulers, built from scheduler.

Return type:

list[_ParamScheduler] or dict[str, list[_ParamScheduler]]

build_test_loop(loop)[source]

Build test loop.

Examples of loop:

# TestLoop will be used
loop = dict()

# custom test loop
loop = dict(type='CustomTestLoop')

Parameters:

loop (BaseLoop or dict) – A test loop or a dict to build test loop. If loop is a test loop object, just returns itself.

Returns:

Test loop object built from loop.

Return type:

BaseLoop

build_train_loop(loop)[source]

Build training loop.

Examples of loop:

# EpochBasedTrainLoop will be used
loop = dict(by_epoch=True, max_epochs=3)

# IterBasedTrainLoop will be used
loop = dict(by_epoch=False, max_iters=3)

# custom training loop
loop = dict(type='CustomTrainLoop', max_epochs=3)

Parameters:

loop (BaseLoop or dict) – A training loop or a dict to build training loop. If loop is a training loop object, just returns itself.

Returns:

Training loop object built from loop.

Return type:

BaseLoop

build_val_loop(loop)[source]

Build validation loop.

Examples of loop:

# ValLoop will be used
loop = dict()

# custom validation loop
loop = dict(type='CustomValLoop')

Parameters:

loop (BaseLoop or dict) – A validation loop or a dict to build validation loop. If loop is a validation loop object, just returns itself.

Returns:

Validation loop object built from loop.

Return type:

BaseLoop

build_visualizer(visualizer=None)[source]

Build a globally accessible Visualizer.

Parameters:

visualizer (Visualizer or dict, optional) – A Visualizer object or a dict to build Visualizer object. If visualizer is a Visualizer object, just returns itself. If not specified, default config will be used to build Visualizer object. Defaults to None.

Returns:

A Visualizer object built from visualizer.

Return type:

Visualizer

call_hook(fn_name, **kwargs)[source]

Call the method named fn_name of all registered hooks.

Parameters:

Return type:

None

property deterministic

Whether cudnn selects deterministic algorithms.

Type:

bool

property distributed

Whether current environment is distributed.

Type:

bool

dump_config()[source]

Dump config to work_dir.

Return type:

None

property epoch

Current epoch.

Type:

int

property experiment_name

Name of experiment.

Type:

str

classmethod from_cfg(cfg)[source]

Build a runner from config.

Parameters:

cfg (ConfigType) – A config used to build the runner. For the valid keys of cfg, see __init__().

Returns:

A runner built from cfg.

Return type:

Runner

property hooks

A list of registered hooks.

Type:

List[Hook]

property iter

Current iteration.

Type:

int

property launcher

The way to launch multiple processes.

Type:

str

load_checkpoint(filename, map_location='cpu', strict=False, revise_keys=[('^module.', '')])[source]

Load checkpoint from given filename.

Parameters:

load_or_resume()[source]

Load or resume checkpoint.

Return type:

None

property max_epochs

Total epochs to train model.

Type:

int

property max_iters

Total iterations to train model.

Type:

int

property model_name

Name of the model, usually the module class name.

Type:

str

property rank

Rank of current process.

Type:

int

register_custom_hooks(hooks)[source]

Register custom hooks into hook list.

Parameters:

hooks (list[Hook | dict]) – List of hooks or configs to be registered.

Return type:

None

register_default_hooks(hooks=None)[source]

Register default hooks into hook list.

hooks will be registered into runner to execute some default actions like updating model parameters or saving checkpoints.

Default hooks and their priorities:

Hooks Priority
RuntimeInfoHook VERY_HIGH (10)
IterTimerHook NORMAL (50)
DistSamplerSeedHook NORMAL (50)
LoggerHook BELOW_NORMAL (60)
ParamSchedulerHook LOW (70)
CheckpointHook VERY_LOW (90)

If hooks is None, the above hooks will be registered by default:

default_hooks = dict(
    runtime_info=dict(type='RuntimeInfoHook'),
    timer=dict(type='IterTimerHook'),
    sampler_seed=dict(type='DistSamplerSeedHook'),
    logger=dict(type='LoggerHook'),
    param_scheduler=dict(type='ParamSchedulerHook'),
    checkpoint=dict(type='CheckpointHook', interval=1),
)

If not None, hooks will be merged into default_hooks. If a key in default_hooks maps to None, the corresponding hook will be popped from default_hooks. For example, passing hooks = dict(timer=None) pops IterTimerHook:

hooks = dict(timer=None)

The final registered default hooks will then be RuntimeInfoHook, DistSamplerSeedHook, LoggerHook, ParamSchedulerHook and CheckpointHook.
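A sketch of overriding the defaults (interval=5 is an arbitrary illustrative value):

runner.register_default_hooks(dict(
    timer=None,  # pops IterTimerHook from the defaults
    checkpoint=dict(type='CheckpointHook', interval=5)))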

Parameters:

hooks (dict[str, Hook or dict], optional) – Default hooks or configs to be registered.

Return type:

None

register_hook(hook, priority=None)[source]

Register a hook into the hook list.

The hook will be inserted into a priority queue with the specified priority (see Priority for details). Hooks with the same priority are triggered in the order in which they were registered.

The priority of the hook is taken from the priority argument if given; otherwise, from the priority key of the hook config or the hook instance's own priority attribute.
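A sketch (MyHook is a hypothetical Hook subclass; passing it as a dict config assumes it has been registered in the HOOKS registry):

from mmengine.hooks import Hook

class MyHook(Hook):
    pass

runner.register_hook(MyHook())                          # uses hook.priority
runner.register_hook(MyHook(), priority='VERY_LOW')     # named priority
runner.register_hook(dict(type='MyHook'), priority=40)  # numeric priority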

Parameters:

Return type:

None

register_hooks(default_hooks=None, custom_hooks=None)[source]

Register default hooks and custom hooks into hook list.

Parameters:

Return type:

None

resume(filename, resume_optimizer=True, resume_param_scheduler=True, map_location='default')[source]

Resume model from checkpoint.

Parameters:

Return type:

None

save_checkpoint(out_dir, filename, file_client_args=None, save_optimizer=True, save_param_scheduler=True, meta=None, by_epoch=True, backend_args=None)[source]

Save checkpoints.

CheckpointHook invokes this method to save checkpoints periodically.

Parameters:

scale_lr(optim_wrapper, auto_scale_lr=None)[source]

Automatically scale the learning rate during training according to the ratio between base_batch_size in auto_scale_lr and the real batch size.

It scales the learning rate linearly according to the paper.

Note

scale_lr must be called after building optimizer wrappers and before building parameter schedulers.
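A worked sketch of the linear scaling rule that scale_lr applies (numbers are illustrative):

base_batch_size = 16      # from auto_scale_lr=dict(base_batch_size=16, ...)
real_batch_size = 8 * 32  # e.g. 8 GPUs with 32 samples per GPU
lr = 0.01
scaled_lr = lr * real_batch_size / base_batch_size  # 0.16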

Parameters:

Return type:

None

property seed

The seed used to initialize random modules.

Type:

int

set_randomness(seed, diff_rank_seed=False, deterministic=False)[source]

Set random seed to guarantee reproducible results.

Parameters:

Return type:

None

setup_env(env_cfg)[source]

Setup environment.

An example of env_cfg:

env_cfg = dict(
    cudnn_benchmark=True,
    mp_cfg=dict(
        mp_start_method='fork',
        opencv_num_threads=0
    ),
    dist_cfg=dict(backend='nccl', timeout=1800),
    resource_limit=4096
)

Parameters:

env_cfg (dict) – Config for setting environment.

Return type:

None

test()[source]

Launch test.

Returns:

A dict of metrics on testing set.

Return type:

dict

property test_dataloader

The data loader for testing.

property test_evaluator

An evaluator for testing.

Type:

Evaluator

property test_loop

A loop to run testing.

Type:

BaseLoop

property timestamp

Timestamp when creating experiment.

Type:

str

train()[source]

Launch training.

Returns:

The model after training.

Return type:

nn.Module

property train_dataloader

The data loader for training.

property train_loop

A loop to run training.

Type:

BaseLoop

val()[source]

Launch validation.

Returns:

A dict of metrics on validation set.

Return type:

dict

property val_begin

The epoch/iteration to start running validation during training.

Type:

int

property val_dataloader

The data loader for validation.

property val_evaluator

An evaluator for validation.

Type:

Evaluator

property val_interval

Interval to run validation during training.

Type:

int

property val_loop

A loop to run validation.

Type:

BaseLoop

property work_dir

The working directory to save checkpoints and logs.

Type:

str

property world_size

Number of processes participating in the job.

Type:

int

wrap_model(model_wrapper_cfg, model)[source]

Wrap the model with MMDistributedDataParallel or another custom distributed data-parallel module wrapper.

An example of model_wrapper_cfg:

model_wrapper_cfg = dict(
    broadcast_buffers=False,
    find_unused_parameters=False
)

Parameters:

Returns:

nn.Module or a subclass of DistributedDataParallel.

Return type:

nn.Module or DistributedDataParallel