pruning - Model Optimizer 0.27.1

High-level API to automatically prune and optimize your model with various algorithms.

Functions

prune: Prune a given model by searching for the best architecture within the design space.

prune(model, mode, constraints, dummy_input, config=None)

Prune a given model by searching for the best architecture within the design space.

Parameters:

constraints = {"flops": 4.5e6}

Specify a percentage-based constraint (e.g., search for a model with <= 60% of the original model's params)

constraints = {"params": "60%"}
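To make the percentage form concrete, the sketch below turns a value such as "60%" into an absolute upper bound, given the same metric measured on the original model. `resolve_upper_bound` is a hypothetical helper written for this illustration. It is not part of Model Optimizer and makes no claim about how the library parses constraints internally.

```python
def resolve_upper_bound(value, original):
    """Return an absolute upper bound for a constraint value.

    value:    a number (absolute bound) or a string like "60%" (relative bound).
    original: the same metric measured on the unpruned model.
    NOTE: hypothetical helper for illustration, not a Model Optimizer API.
    """
    if isinstance(value, str):
        text = value.strip()
        if not text.endswith("%"):
            raise ValueError(f"expected a percentage string, got {value!r}")
        # Multiply before dividing to keep round percentages exact.
        return float(text[:-1]) * original / 100.0
    return float(value)


# Assumed example: an original model with 2,000,000 parameters.
constraints = {"params": "60%"}
bound = resolve_upper_bound(constraints["params"], original=2_000_000)
```

Under these assumptions, a "60%" constraint on a 2,000,000-parameter model corresponds to a budget of 1,200,000 parameters. An absolute value such as 4.5e6 passes through unchanged.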

Specify export_config with pruned hyperparameters

export_config is supported, and required, when the model is converted via the mcore_gpt_minitron mode.

constraints = {
    "export_config": {
        "ffn_hidden_size": 128,
        "num_attention_heads": 16,
        "num_query_groups": 4,
    }
}
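Before passing an export_config like the one above, a quick sanity check on the values can catch typos early. The sketch below assumes the usual grouped-query attention requirement that attention heads divide evenly among query groups. `check_export_config` is a hypothetical helper, not a Model Optimizer function, and the library may apply different or additional checks.

```python
def check_export_config(export_config):
    """Sanity-check pruned hyperparameters; return attention heads per query group.

    NOTE: hypothetical helper for illustration, not a Model Optimizer API.
    Assumes heads must divide evenly among query groups (grouped-query attention).
    """
    for key, value in export_config.items():
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            raise ValueError(f"{key} must be a positive integer, got {value!r}")

    heads = export_config["num_attention_heads"]
    groups = export_config["num_query_groups"]
    if heads % groups != 0:
        raise ValueError(
            f"num_attention_heads ({heads}) is not divisible by "
            f"num_query_groups ({groups})"
        )
    return heads // groups


constraints = {
    "export_config": {
        "ffn_hidden_size": 128,
        "num_attention_heads": 16,
        "num_query_groups": 4,
    }
}
heads_per_group = check_export_config(constraints["export_config"])
```

With the values shown, each of the 4 query groups serves 4 attention heads.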

Return type:

tuple[Module, dict[str, Any]]

Returns: A tuple (subnet, state_dict), where subnet is the searched subnet (nn.Module), which can be used for subsequent tasks like fine-tuning, and state_dict contains the history and detailed stats of the search procedure.

Note

The given model is modified (exported) in-place to match the best subnet found by the search algorithm. The returned subnet is thus a reference to the same model instance as the input model.
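The practical consequence of the in-place behavior is that the unpruned model is gone after the call unless a copy was taken first. The sketch below shows the aliasing with plain Python objects. `TinyModel` and `prune_in_place` are hypothetical stand-ins that only mimic the "modify and return the same instance" contract described in the note. They do not call Model Optimizer.

```python
import copy


class TinyModel:
    """Hypothetical stand-in for an nn.Module with one prunable dimension."""

    def __init__(self, width):
        self.width = width


def prune_in_place(model, max_width):
    """Stand-in for an in-place prune: mutates `model` and returns it.

    NOTE: illustration only; this is not Model Optimizer's prune().
    """
    model.width = min(model.width, max_width)
    return model, {"history": [model.width]}


model = TinyModel(width=128)
original = copy.deepcopy(model)  # snapshot taken before pruning

subnet, stats = prune_in_place(model, max_width=64)

same_object = subnet is model  # the returned subnet aliases the input model
```

Here `subnet` and `model` refer to the same object, so both show the pruned width, while `original` keeps the pre-pruning state. Taking a deep copy before pruning is one way to keep the original around for comparison, at the cost of holding two models in memory.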