mcore_gpt_minitron — Model Optimizer 0.27.1
Module implementing the top-level mcore_gpt_minitron pruning handler for NVIDIA Megatron-Core / NeMo models.

The Minitron pruning algorithm uses activation magnitudes to estimate the importance of neurons and attention heads in the model. More details on the Minitron pruning algorithm can be found here: https://arxiv.org/pdf/2407.14679

The actual implementation is at modelopt.torch.nas.plugins.megatron.
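To make the activation-magnitude idea concrete, here is a minimal, self-contained sketch (not the library's actual implementation): neurons are scored by their mean absolute activation over a calibration batch, and only the top-k rows of the corresponding weight matrix are kept. All names and shapes here are illustrative assumptions.

```python
import numpy as np

def neuron_importance(activations: np.ndarray) -> np.ndarray:
    """Score each neuron by its mean absolute activation over the batch.

    activations: array of shape (batch, num_neurons).
    """
    return np.abs(activations).mean(axis=0)

def prune_neurons(weight: np.ndarray, activations: np.ndarray, keep: int) -> np.ndarray:
    """Keep the `keep` highest-importance output neurons of a linear layer.

    weight: (num_neurons, in_features); row i is the weight of neuron i.
    """
    scores = neuron_importance(activations)
    # Indices of the top-k scores, restored to their original row order.
    kept = np.sort(np.argsort(scores)[::-1][:keep])
    return weight[kept]

# Toy example: a layer with 4 neurons pruned down to 2.
rng = np.random.default_rng(0)
acts = rng.normal(size=(8, 4)) * np.array([0.1, 2.0, 0.5, 3.0])
w = rng.normal(size=(4, 16))
pruned = prune_neurons(w, acts, keep=2)
print(pruned.shape)  # (2, 16)
```

In the real algorithm the scores are aggregated over many calibration batches and cover attention heads, MLP neurons, and embedding channels, but the ranking-and-slicing structure is the same.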
Classes

| Name | Description |
| --- | --- |
| MCoreGPTMinitronSearcher | Searcher for Minitron pruning algorithm. |
Functions

| Name | Description |
| --- | --- |
| get_supported_model_config_map | Get supported models (inside function to avoid circular imports). |
class MCoreGPTMinitronSearcher
Bases: BaseSearcher
Searcher for Minitron pruning algorithm.
before_search()
Optional pre-processing steps before the search.
Return type:
None
property default_search_config: dict[str, Any]
Get the default config for the searcher.
property default_state_dict: dict[str, Any]
Return default state dict.
run_search()
Run actual search.
Return type:
None
sanitize_search_config(config)
Sanitize the search config dict.
Parameters:
config (dict[str, Any] | None) –
Return type:
dict[str, Any]
get_supported_model_config_map()
Get supported models (inside function to avoid circular imports).
Return type:
dict[type, str]
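The "inside function to avoid circular imports" note refers to the standard deferred-import pattern: modules that would form an import cycle at load time are imported only when the function is first called. A generic sketch of the pattern (the stdlib type used as a map key is a stand-in for real model classes):

```python
def get_supported_model_config_map() -> dict[type, str]:
    """Build the model-class -> config-name map lazily.

    Importing inside the function body defers resolution until call time,
    so this module can be imported even if the imported module in turn
    imports this one.
    """
    # Deferred import: in the real code this would be the Megatron/NeMo
    # model modules; collections is used here only as a runnable stand-in.
    import collections

    return {collections.OrderedDict: "example_config"}

mapping = get_supported_model_config_map()
print(mapping)
```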