Agents

An Agent in Llama Stack is a powerful abstraction that allows you to build complex AI applications.

The Llama Stack agent framework is built on a modular architecture that enables flexible, powerful AI applications. This document explains the key components and how they work together.

Core Concepts

1. Agent Configuration

Agents are configured using the AgentConfig class, which includes the model to use, a system instructions prompt, and the set of tools the agent may call:

```python
from llama_stack_client import Agent, LlamaStackClient

# Connect to a running Llama Stack server (adjust base_url for your deployment)
llama_stack_client = LlamaStackClient(base_url="http://localhost:8321")

# Create the agent
agent = Agent(
    llama_stack_client,
    model="meta-llama/Llama-3-70b-chat",
    instructions="You are a helpful assistant that can use tools to answer questions.",
    tools=["builtin::code_interpreter", "builtin::rag/knowledge_search"],
)
```
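Beyond builtin tool identifiers, the tools list can also mix in custom client-side tools. A hedged sketch, assuming your llama-stack-client version supports passing plain Python functions as tools (the function name, docstring, and type hints describe the tool to the model); get_weather is a hypothetical helper:

```python
def get_weather(city: str) -> str:
    """Return a short weather summary for the given city."""
    # Hypothetical stand-in for a real weather lookup
    return f"It is sunny in {city}."

# Mix a custom client-side tool with a builtin tool identifier
agent_with_custom_tool = Agent(
    llama_stack_client,
    model="meta-llama/Llama-3-70b-chat",
    instructions="You are a helpful assistant that can use tools to answer questions.",
    tools=[get_weather, "builtin::code_interpreter"],
)
```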

2. Sessions

Agents maintain state through sessions, which represent a conversation thread:

```python
# Create a session
session_id = agent.create_session(session_name="My conversation")
```
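Because each session carries its own history, one agent can serve several independent conversations at once. A minimal sketch (the session names are illustrative):

```python
# Each session is an independent conversation thread with its own state;
# turns in one session do not see messages from the other
support_session_id = agent.create_session(session_name="Support thread")
research_session_id = agent.create_session(session_name="Research thread")
```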

3. Turns

Each interaction with an agent is called a “turn” and consists of the user’s input messages, the steps the agent takes to respond, and a final output message:

```python
from llama_stack_client import AgentEventLogger

# Create a turn with streaming response
turn_response = agent.create_turn(
    session_id=session_id,
    messages=[{"role": "user", "content": "Tell me about Llama models"}],
)
for log in AgentEventLogger().log(turn_response):
    log.print()
```
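AgentEventLogger is a printing convenience; if you need the events themselves, you can consume the stream directly. A hedged sketch, assuming the chunk → event → payload structure of the llama-stack streaming API (the event_type value is an assumption to verify against your client version):

```python
# Consume the raw streaming chunks instead of using AgentEventLogger
turn_response = agent.create_turn(
    session_id=session_id,
    messages=[{"role": "user", "content": "Tell me about Llama models"}],
)
for chunk in turn_response:
    payload = chunk.event.payload
    # The "turn_complete" payload carries the finished Turn object
    # (event_type value assumed; check your llama-stack-client version)
    if payload.event_type == "turn_complete":
        print(payload.turn.output_message.content)
```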

Non-Streaming

```python
from rich.pretty import pprint

# Non-streaming API
response = agent.create_turn(
    session_id=session_id,
    messages=[{"role": "user", "content": "Tell me about Llama models"}],
    stream=False,
)

print("Inputs:")
pprint(response.input_messages)
print("Output:")
pprint(response.output_message.content)
print("Steps:")
pprint(response.steps)
```

4. Steps

Each turn consists of multiple steps that represent the agent’s thought process: inference steps (the model generating a response), tool execution steps (the agent invoking a tool and receiving its result), and shield call steps (safety checks). A sketch of inspecting them follows.
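You can examine the steps of a completed turn using the response from the non-streaming example above. A minimal sketch (the step_type values shown are assumptions to verify against your client version):

```python
# Walk the steps recorded for the non-streaming turn above
for step in response.steps:
    # step_type distinguishes the kind of work done in this step,
    # e.g. "inference", "tool_execution", or "shield_call" (assumed values)
    print(step.step_type)
```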

Agent Execution Loop

Refer to the Agent Execution Loop for more details on what happens within an agent turn.