OpenAI Assistants with persistent vector store | Astra DB Serverless | DataStax Docs

The Astra Assistants API is a drop-in replacement for the OpenAI Assistants API. It uses a Serverless (vector) database for persistence and supports the following features:

Full compatibility with the OpenAI Assistants API v2, including messages, assistants, threads, runs, vector_stores, and files.
Third-party embedding and completion models, with support for hundreds of LLMs, including Anthropic, Gemini, Mistral, Groq, Llama, and Cohere, powered by LiteLLM.
Ollama support for local models.
Open source with options for managed service or self hosting.
Function calling and file search.
Data privacy and protection.

The database stores and queries embeddings for retrieval-augmented generation (RAG). For large language model (LLM) tasks, such as embedding generation and chat completion, the service calls OpenAI or other LLM providers.

You interact with the service through the OpenAI SDKs. Your proprietary data is stored, and the Assistants API examples run, on your own Astra DB Serverless database, which you can manage, access, and secure.

Set environment variables

Create a .env file for this tutorial.
In your .env file, set an environment variable for your Astra application token:
.env

#!/bin/bash  
export ASTRA_DB_APPLICATION_TOKEN=

Set an environment variable named OPENAI_API_KEY for your OpenAI API key.
Set any additional environment variables required to authorize specific models:
.env

# Amazon Bedrock models: https://docs.aws.amazon.com/bedrock/latest/userguide/setting-up.html  
export AWS_REGION_NAME=  
export AWS_ACCESS_KEY_ID=  
export AWS_SECRET_ACCESS_KEY=  
# Anthropic Claude models - https://console.anthropic.com/settings/keys  
export ANTHROPIC_API_KEY=""  
# AI21 models  
export AI21_API_KEY=""  
# Aleph Alpha models  
export ALEPHALPHA_API_KEY=""  
# Anyscale models  
export ANYSCALE_API_KEY=""  
# Azure models  
export AZURE_API_KEY=""  
export AZURE_API_BASE=""  
export AZURE_API_VERSION=""  
export AZURE_AD_TOKEN=""  
export AZURE_API_TYPE=""  
# Baseten models  
export BASETEN_API_KEY=""  
# Cloudflare Workers models  
export CLOUDFLARE_API_KEY=""  
export CLOUDFLARE_ACCOUNT_ID=""  
# Cohere - Go to https://dashboard.cohere.com/api-keys to create an API key.  
export COHERE_API_KEY=  
# DeepInfra models  
export DEEPINFRA_API_KEY=""  
# DeepSeek models  
export DEEPSEEK_API_KEY=""  
# Fireworks AI models  
export FIREWORKS_AI_API_KEY=""  
# Gemini models - https://makersuite.google.com/app/apikey  
export GEMINI_API_KEY=""  
# Groq models - https://console.groq.com/keys  
export GROQ_API_KEY=""  
# Hugging Face models  
export HUGGINGFACE_API_KEY=""  
export HUGGINGFACE_API_BASE=""  
# Mistral models  
export MISTRAL_API_KEY=""  
# NLP Cloud models  
export NLP_CLOUD_API_KEY=""  
# OpenAI  
# The following might be required for certain models  
export OPENAI_ORGANIZATION=""  
export OPENAI_API_BASE=""  
# OpenRouter models  
export OPENROUTER_API_KEY=""  
export OR_SITE_URL=""  
export OR_APP_NAME=""  
# PaLM models  
export PALM_API_KEY=""  
# Perplexity - Go to https://www.perplexity.ai/settings/api to generate a secret key.  
export PERPLEXITYAI_API_KEY=  
# Replicate models  
export REPLICATE_API_KEY=""  
# TogetherAI models  
export TOGETHERAI_API_KEY=""  
# Vertex AI - required environment variables depend on your project configuration and the model you want to use.  
# Some variables only apply when accessing private models or models hosted by third-party providers.  
# Core variables for https://console.cloud.google.com/vertex-ai:  
export GOOGLE_JSON_PATH=  
export GOOGLE_PROJECT_ID=  
# If using a third-party SDK that doesn't recognize GOOGLE_PROJECT_ID:  
export VERTEXAI_PROJECT=""  
# If required by a third-party SDK or you need to specify a region-specific Vertex endpoint:  
export VERTEXAI_LOCATION=""  
# If required, not auto-detected from your environment, or not using GOOGLE_JSON_PATH:  
export GOOGLE_APPLICATION_CREDENTIALS=""  
# Voyage models  
export VOYAGE_API_KEY=""  
# WatsonX models  
export WATSONX_URL=""  
export WATSONX_APIKEY=""  
export WATSONX_TOKEN=""  
export WATSONX_PROJECT_ID=""  
export WATSONX_DEPLOYMENT_SPACE_ID=""  
# XInference models  
export XINFERENCE_API_BASE=""  
export XINFERENCE_API_KEY=""
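Before you run the examples, it can help to confirm that the keys for the models you plan to use are actually set. The following is a minimal sketch, not part of the Astra Assistants API: the helper name `missing_keys` and the prefix-to-variable mapping are illustrative assumptions, based on the variable names listed above, the standard `OPENAI_API_KEY` variable, and the `provider/model` naming used in this guide.

```python
import os

# Illustrative mapping from a model-name prefix to a variable listed above.
# Extend it for the providers you use; it is not an official lookup table.
PROVIDER_KEYS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "cohere": "COHERE_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "openai": "OPENAI_API_KEY",
    "perplexity": "PERPLEXITYAI_API_KEY",
}


def missing_keys(models, env=None):
    """Return the sorted names of expected variables that are unset or empty."""
    env = os.environ if env is None else env
    missing = set()
    for model in models:
        # Assumption: a name without a prefix, such as "gpt-4o", is an OpenAI model.
        provider = model.split("/", 1)[0] if "/" in model else "openai"
        key = PROVIDER_KEYS.get(provider)
        # Providers that are not in the mapping are skipped rather than reported.
        if key and not env.get(key):
            missing.add(key)
    return sorted(missing)
```

Calling `missing_keys(["gpt-4o", "anthropic/claude-3-5-sonnet"])` after loading your .env file returns an empty list when both keys are set.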

Install dependencies

Install poetry:

curl -sSL https://install.python-poetry.org | python3 -

Install the dependencies:

poetry add astra-assistants openai python-dotenv

Build the Assistants API-powered application

Import and patch your client:

from dotenv import load_dotenv
from openai import OpenAI
from astra_assistants import patch

load_dotenv()  # Load the variables from your .env file
client = patch(OpenAI())

Using your token, the system creates an Astra DB Serverless database named assistant_api_db. The first request can take a few minutes to create your database.

Create your assistant:

assistant = client.beta.assistants.create(  
  instructions="You are a personal math tutor. When asked a math question, write and run code to answer the question.",  
  model="gpt-4o",  
)

By default, the service uses Astra DB Serverless as the vector store and OpenAI for embeddings and chat completion.
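Once the assistant exists, a typical next step is to send it a question on a thread and read the reply. The following is a minimal sketch under stated assumptions: the helper name `ask` is hypothetical, and the sketch assumes that the patched client exposes the same `threads`, `messages`, and `runs.create_and_poll` methods as a recent version of the OpenAI Python SDK.

```python
def ask(client, assistant_id, question):
    """Send one question on a new thread and return the assistant's reply text."""
    thread = client.beta.threads.create()
    client.beta.threads.messages.create(
        thread_id=thread.id,
        role="user",
        content=question,
    )
    # create_and_poll blocks until the run reaches a terminal state.
    run = client.beta.threads.runs.create_and_poll(
        thread_id=thread.id,
        assistant_id=assistant_id,
    )
    if run.status != "completed":
        raise RuntimeError(f"Run ended with status {run.status!r}")
    # Messages are listed newest first by default, so the reply is the first item.
    messages = client.beta.threads.messages.list(thread_id=thread.id)
    return messages.data[0].content[0].text.value
```

For example, `ask(client, assistant.id, "What is 12 * 7?")` returns the text of the assistant's answer, or raises an error if the run fails or is cancelled.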

Enable third-party model support

To use a third-party model instead of the OpenAI defaults, pass the provider's API key using the api-key and embedding-model headers, as shown in the following examples. You can set the required API keys as environment variables.

Language models

Anthropic Claude

model="anthropic/claude-3-5-sonnet"

assistant = client.beta.assistants.create(
    name="Math Tutor",
    instructions="You are a personal math tutor. Answer questions briefly, in a sentence or less.",
    model=model,
)

Cohere Command

model="cohere/command-r-plus"

assistant = client.beta.assistants.create(
    name="Math Tutor",
    instructions="You are a personal math tutor. Answer questions briefly, in a sentence or less.",
    model=model,
)

Google Gemini

model="gemini/gemini-1.5-flash"

assistant = client.beta.assistants.create(
    name="Math Tutor",
    instructions="You are a personal math tutor. Answer questions briefly, in a sentence or less.",
    model=model,
)

OpenAI GPT-4o

model="gpt-4o"

assistant = client.beta.assistants.create(
    name="Math Tutor",
    instructions="You are a personal math tutor. Answer questions briefly, in a sentence or less.",
    model=model,
)

OpenAI GPT-4o-mini

model="openai/gpt-4o-mini"

assistant = client.beta.assistants.create(
    name="Math Tutor",
    instructions="You are a personal math tutor. Answer questions briefly, in a sentence or less.",
    model=model,
)

Perplexity Mixtral 8x7B

model="perplexity/mixtral-8x7b-instruct"

assistant = client.beta.assistants.create(
    name="Math Tutor",
    instructions="You are a personal math tutor. Answer questions briefly, in a sentence or less.",
    model=model,
)

Perplexity llama2-70B

model="perplexity/pplx-70b-online"

assistant = client.beta.assistants.create(
    name="Math Tutor",
    instructions="You are a personal math tutor. Answer questions briefly, in a sentence or less.",
    model=model,
)
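The examples above differ only in the model string, so you can compare providers by creating the same assistant once per model. The following is a minimal sketch: the helper name `create_tutors` and the `MODELS` tuple are illustrative, and the sketch assumes that the API key for each listed provider is already set in your environment.

```python
INSTRUCTIONS = (
    "You are a personal math tutor. "
    "Answer questions briefly, in a sentence or less."
)

# Model names taken from the examples above; trim the tuple to the providers you use.
MODELS = (
    "anthropic/claude-3-5-sonnet",
    "cohere/command-r-plus",
    "gemini/gemini-1.5-flash",
    "gpt-4o",
)


def create_tutors(client, models=MODELS):
    """Create one Math Tutor assistant per model and return {model: assistant_id}."""
    tutors = {}
    for model in models:
        assistant = client.beta.assistants.create(
            name="Math Tutor",
            instructions=INSTRUCTIONS,
            model=model,
        )
        tutors[model] = assistant.id
    return tutors
```

Calling `create_tutors(client)` with the patched client returns a dictionary that you can use to send the same question to each model's assistant.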

Embedding models

To use a specific embedding model, including a third-party one, pass the embedding_model parameter to client.files.create:

# Open the file in a context manager so that it is closed after the upload.
with open(
    "./test/language_models_are_unsupervised_multitask_learners.pdf",
    "rb",
) as pdf:
    file = client.files.create(
        file=pdf,
        purpose="assistants",
        embedding_model="text-embedding-3-large",
    )
