Disable parallel tool calls final answer by aymeric-roucher · Pull Request #1539 · huggingface/smolagents

Until now, ToolCallingAgent has allowed other tools to be called in parallel with the final_answer tool. Some weaker LLMs therefore call, say, a web search tool and final_answer in the same step, wrongly expecting the calls to run sequentially so that the final answer is informed by the search results. Because the calls actually run in parallel, any tool call other than final_answer has no impact on the agent's return value and is effectively useless, and the LLM ends up filling the final_answer() arguments with a hallucination.

Example: the LLM returns the following action, in which the final answer is hallucinated rather than based on the get_weather output:

Let's get results

{"name": "get_weather", "arguments": {"city": "Prague"}} {"name": "final_answer", "arguments": {"answer": "The weather is sunny in Prague"}}

To avoid this failure case, this PR forbids calling other tools in parallel with the final_answer tool.
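The rule can be illustrated with a minimal sketch. This is not the code from the PR and not the smolagents API: `check_tool_calls` and `ParallelFinalAnswerError` are hypothetical names, and tool calls are assumed to be plain dicts shaped like the example above.

```python
from typing import Any

FINAL_ANSWER_TOOL = "final_answer"


class ParallelFinalAnswerError(ValueError):
    """Hypothetical error: final_answer was requested alongside other tool calls."""


def check_tool_calls(tool_calls: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return the calls unchanged, or raise if final_answer is not alone in its step."""
    names = [call["name"] for call in tool_calls]
    if FINAL_ANSWER_TOOL in names and len(names) > 1:
        raise ParallelFinalAnswerError(
            f"{FINAL_ANSWER_TOOL} must be the only tool call in its step, got: {names}"
        )
    return tool_calls


if __name__ == "__main__":
    # The failing action from the example above: a tool call plus final_answer.
    action = [
        {"name": "get_weather", "arguments": {"city": "Prague"}},
        {"name": "final_answer", "arguments": {"answer": "The weather is sunny in Prague"}},
    ]
    try:
        check_tool_calls(action)
    except ParallelFinalAnswerError as err:
        print(f"Rejected: {err}")
```

In this sketch the step is rejected outright rather than silently dropping the extra calls, so the error can be shown to the LLM and it can retry with get_weather alone before answering. Whether the PR reports the error this way is an assumption.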