speculative_decoding — Model Optimizer 0.31.0

User-facing API for converting a model into a modelopt.torch.speculative.MedusaModel.

Functions

convert: Main conversion function to turn a base model into a speculative decoding model.

convert(model, mode)

Main conversion function to turn a base model into a speculative decoding model.

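Parameters:

model: The base model to convert.
mode: The speculative decoding mode(s) to apply during conversion.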

Returns:

An instance of MedusaModel or EagleModel, or a subclass of either.

Return type:

Module
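
As a rough illustration of how convert(model, mode) might be called, the sketch below wraps it in a small helper. The import path modelopt.torch.speculative is assumed from the module name used on this page, and the helper name to_speculative is hypothetical. The mode value "medusa" in the usage comment is also an assumption, not something this page states.

```python
def to_speculative(model, mode):
    """Hypothetical helper: forward a base model and a mode to convert().

    The import happens inside the function so this sketch can be loaded
    even where Model Optimizer is not installed.
    """
    # Assumed import path, taken from the module name on this page.
    import modelopt.torch.speculative as mtsp

    # convert(model, mode) returns the speculative decoding model (a Module).
    return mtsp.convert(model, mode)


# Usage sketch (commented out; requires Model Optimizer and a supported base model):
# base_model = ...                                       # a torch.nn.Module you already have
# spec_model = to_speculative(base_model, "medusa")      # "medusa" is an assumed mode name
```

The helper adds nothing beyond the call itself. Its only purpose is to show the argument order and to make the return value explicit.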