Transfer aggregation of streaming events off the Model class by aymeric-roucher · Pull Request #1449 · huggingface/smolagents
This PR changes the logic of streaming messages:
- Previously, the logic was to accumulate streaming deltas in the Model classes and yield objects that contain all the generated text from the start of streaming until now.
- This PR makes the Model class directly return atomic streaming deltas, handling the aggregation only within the agent.
The reason for this is that front-ends like copilotkit generally expect individual streaming deltas.
Additionally, it removes the HfApiModel class, which was deprecated and due for deletion in 1.17, and fuses Message into ChatMessage.
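The new contract can be sketched as follows. This is a minimal illustration, not the actual smolagents API: the `ChatMessageStreamDelta` shape and the function names here are hypothetical stand-ins for "model yields atomic deltas, agent aggregates them".

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class ChatMessageStreamDelta:
    """One atomic chunk of generated text (hypothetical shape)."""
    content: str


def stream_model(chunks: list[str]) -> Iterator[ChatMessageStreamDelta]:
    # The model now yields each delta as-is, without accumulating
    # "all text generated so far".
    for chunk in chunks:
        yield ChatMessageStreamDelta(content=chunk)


def run_agent(deltas: Iterator[ChatMessageStreamDelta]) -> str:
    # Aggregation happens only in the agent: each delta can be forwarded
    # verbatim to a front-end (e.g. copilotkit), while the agent keeps
    # the running text for the final message.
    full_text = ""
    for delta in deltas:
        full_text += delta.content  # front-ends receive delta.content directly
    return full_text


print(run_agent(stream_model(["Hel", "lo ", "world"])))  # → Hello world
```

This matches what delta-oriented front-ends expect: they receive each chunk individually instead of ever-growing cumulative strings.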
| "LiteLLMModel", |
|---|
| "LiteLLMRouterModel", |
| "OpenAIServerModel", |
| "OpenAIModel", |
Adding a copy of the class with "Server" removed in the name for easier access
| logger = getLogger(__name__) |
|---|
| class Message(TypedDict): |
@albertvillanova since Message and ChatMessage were mostly interchangeable, I fused them.
One potential difficulty to consider is that the class is no longer a TypedDict, so it cannot be treated as a dict.
But this didn't really create any implementation problem so far: we just handle ChatMessage objects internally, and can handle the dict conversion in Model subclasses just before sending messages to inference.
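The pattern described above can be sketched like this. This is a simplified illustration under assumptions: the real ChatMessage carries more fields (tool calls, raw payloads, etc.), and the method name `dict` here is just a plausible stand-in for a dict-conversion helper.

```python
from dataclasses import dataclass, asdict


@dataclass
class ChatMessage:
    # Simplified sketch of a ChatMessage; the real class has more fields.
    role: str
    content: str

    def dict(self) -> dict:
        # Conversion to a plain dict happens only at the edge, in Model
        # subclasses, just before sending messages to inference.
        return asdict(self)


msg = ChatMessage(role="user", content="Hi")
print(msg.dict())  # → {'role': 'user', 'content': 'Hi'}
```

Internally everything stays a typed object; only the inference boundary sees plain dicts.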
OK... I thought the purpose of to_messages was to convert the steps into a format directly consumable by the model (so plain dicts instead of instance objects).
Thanks, good refactoring!
| The `HfApiModel` wraps huggingface_hub's [InferenceClient](https://huggingface.co/docs/huggingface\_hub/main/en/guides/inference) for the execution of the LLM. It supports all [Inference Providers](https://huggingface.co/docs/inference-providers/index) available on the Hub: Cerebras, Cohere, Fal, Fireworks, HF-Inference, Hyperbolic, Nebius, Novita, Replicate, SambaNova, Together, and more. |
| The `InferenceClientModel` wraps huggingface_hub's [InferenceClient](https://huggingface.co/docs/huggingface\_hub/main/en/guides/inference) for the execution of the LLM. It supports all [Inference Providers](https://huggingface.co/docs/inference-providers/index) available on the Hub: Cerebras, Cohere, Fal, Fireworks, HF-Inference, Hyperbolic, Nebius, Novita, Replicate, SambaNova, Together, and more. |
Comment on lines -1422 to -1428
| class HfApiModel(InferenceClientModel): |
|---|
| def __new__(cls, *args, **kwargs): |
| warnings.warn( |
| "HfApiModel was renamed to InferenceClientModel in version 1.14.0 and will be removed in 1.17.0.", |
| FutureWarning, |
| ) |
| return super().__new__(cls) |
Comment on lines +1619 to +1621
| class OpenAIModel(OpenAIServerModel): |
|---|
| def __new__(cls, *args, **kwargs): |
| return super().__new__(cls) |
Is this just an alias or are you planning to deprecate OpenAIServerModel?
If this is just an alias and both are identically valid, then I would suggest:
OpenAIModel = OpenAIServerModel
If you are planning to deprecate OpenAIServerModel, then you should inherit inversely:
    class OpenAIServerModel(OpenAIModel):
        def __new__(cls, *args, **kwargs):
            warnings.warn(
                "OpenAIServerModel was renamed to OpenAIModel in version 1.19.0 and will be removed in 1.22.0. "
                "Please use OpenAIModel instead.",
                FutureWarning,
                stacklevel=2,
            )
            return super().__new__(cls)
It's an alias, so I'll just copy it!
OK... I thought the purpose of to_messages was to convert the steps into a format directly consumable by the model (so plain dicts instead of instance objects).
Thank you for your comments! Thinking again about the distinction between ChatMessage and Message: Message was just a less complete, dict-converted version of ChatMessage, hence the fusion of the two. to_messages is a way to convert memory steps to chat messages; these messages are not particularly expected to already be dictionaries.
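The division of labor described in this reply can be sketched as follows. Note the names `MemoryStep` and `model_output` are hypothetical simplifications, not the actual smolagents classes: the point is only that to_messages yields ChatMessage objects, and dict conversion is deferred to the model layer.

```python
from dataclasses import dataclass


@dataclass
class ChatMessage:
    role: str
    content: str


@dataclass
class MemoryStep:
    # Hypothetical minimal memory step holding one model output.
    model_output: str

    def to_messages(self) -> list[ChatMessage]:
        # to_messages returns ChatMessage objects, not plain dicts;
        # conversion to dicts happens later, in the Model subclass.
        return [ChatMessage(role="assistant", content=self.model_output)]


msgs = MemoryStep("42").to_messages()
print(msgs[0].role, msgs[0].content)  # → assistant 42
```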