Hyperbolic | liteLLM

Overview

Property	Details
Description	Hyperbolic provides access to the latest models at a fraction of legacy cloud costs, with OpenAI-compatible APIs for LLMs, image generation, and more.
Provider Route on LiteLLM	hyperbolic/
Link to Provider Doc	Hyperbolic Documentation (https://docs.hyperbolic.xyz)
Base URL	https://api.hyperbolic.xyz/v1
Supported Operations	/chat/completions

LiteLLM supports all Hyperbolic models. Prefix the model name with hyperbolic/ when sending completion requests.
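As a minimal sketch of the prefix rule, the helper below turns a Hyperbolic model ID into the string passed to LiteLLM. The function name to_litellm_model is hypothetical and not part of LiteLLM.

```python
PROVIDER_PREFIX = "hyperbolic/"


def to_litellm_model(model_id: str) -> str:
    """Return the LiteLLM model string for a Hyperbolic model ID.

    Hypothetical helper: it only adds the provider prefix when it is missing,
    so calling it twice does not double the prefix.
    """
    if model_id.startswith(PROVIDER_PREFIX):
        return model_id
    return PROVIDER_PREFIX + model_id


print(to_litellm_model("deepseek-ai/DeepSeek-V3"))
```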

Available Models

Language Models

Model	Description	Context Window	Pricing per 1M tokens
hyperbolic/deepseek-ai/DeepSeek-V3	DeepSeek V3 - Fast and efficient	131,072 tokens	$0.25
hyperbolic/deepseek-ai/DeepSeek-V3-0324	DeepSeek V3 March 2025 version	131,072 tokens	$0.25
hyperbolic/deepseek-ai/DeepSeek-R1	DeepSeek R1 - Reasoning model	131,072 tokens	$2.00
hyperbolic/deepseek-ai/DeepSeek-R1-0528	DeepSeek R1 May 2025 version	131,072 tokens	$0.25
hyperbolic/Qwen/Qwen2.5-72B-Instruct	Qwen 2.5 72B Instruct	131,072 tokens	$0.40
hyperbolic/Qwen/Qwen2.5-Coder-32B-Instruct	Qwen 2.5 Coder 32B for code generation	131,072 tokens	$0.20
hyperbolic/Qwen/Qwen3-235B-A22B	Qwen 3 235B A22B variant	131,072 tokens	$2.00
hyperbolic/Qwen/QwQ-32B	Qwen QwQ 32B	131,072 tokens	$0.20
hyperbolic/meta-llama/Llama-3.3-70B-Instruct	Llama 3.3 70B Instruct	131,072 tokens	$0.80
hyperbolic/meta-llama/Meta-Llama-3.1-405B-Instruct	Llama 3.1 405B Instruct	131,072 tokens	$5.00
hyperbolic/moonshotai/Kimi-K2-Instruct	Kimi K2 Instruct	131,072 tokens	$2.00
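
The per-1M-token prices in the table can be turned into a rough cost estimate. The sketch below copies three rows from the table and assumes a single rate applies to all tokens, input and output alike; the table does not say otherwise, so treat that as an assumption. The function name estimate_cost_usd is hypothetical.

```python
# Prices in USD per 1M tokens, copied from the table above (subset only).
PRICE_PER_1M_TOKENS_USD = {
    "hyperbolic/deepseek-ai/DeepSeek-V3": 0.25,
    "hyperbolic/Qwen/Qwen2.5-72B-Instruct": 0.40,
    "hyperbolic/meta-llama/Meta-Llama-3.1-405B-Instruct": 5.00,
}


def estimate_cost_usd(model: str, total_tokens: int) -> float:
    """Estimate the cost of a request from its total token count.

    Assumption: one flat rate covers prompt and completion tokens.
    Raises KeyError for models missing from the local price map.
    """
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    return PRICE_PER_1M_TOKENS_USD[model] * total_tokens / 1_000_000


cost = estimate_cost_usd("hyperbolic/Qwen/Qwen2.5-72B-Instruct", 500_000)
print(f"Estimated cost: ${cost:.4f}")
```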

Required Variables

Environment Variables

os.environ["HYPERBOLIC_API_KEY"] = ""  # your Hyperbolic API key

Get your API key from the Hyperbolic dashboard.
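
An empty or missing key typically surfaces only as an authentication error at request time. The sketch below checks the variable up front; require_hyperbolic_key is a hypothetical helper, not a LiteLLM function.

```python
import os


def require_hyperbolic_key() -> str:
    """Return HYPERBOLIC_API_KEY, or fail early with a clear message.

    Hypothetical helper: LiteLLM reads the variable itself, this only
    validates that it is present and non-empty before any request is made.
    """
    key = os.environ.get("HYPERBOLIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "HYPERBOLIC_API_KEY is not set; export it before calling LiteLLM"
        )
    return key
```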

Usage - LiteLLM Python SDK

Non-streaming

Hyperbolic Non-streaming Completion

import os
import litellm
from litellm import completion

os.environ["HYPERBOLIC_API_KEY"] = ""  # your Hyperbolic API key

messages = [{"content": "What is the capital of France?", "role": "user"}]

# Hyperbolic call
response = completion(
    model="hyperbolic/Qwen/Qwen2.5-72B-Instruct", 
    messages=messages
)

print(response)

Streaming

Hyperbolic Streaming Completion

import os
import litellm
from litellm import completion

os.environ["HYPERBOLIC_API_KEY"] = ""  # your Hyperbolic API key

messages = [{"content": "Write a short poem about AI", "role": "user"}]

# Hyperbolic call with streaming
response = completion(
    model="hyperbolic/deepseek-ai/DeepSeek-V3", 
    messages=messages,
    stream=True
)

for chunk in response:
    print(chunk)
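
Printing raw chunks is useful for debugging, but most applications want the assembled text. The sketch below assumes each chunk follows the OpenAI-style layout used elsewhere on this page (chunk.choices[0].delta.content, which may be None). The function name collect_stream_text is hypothetical.

```python
from typing import Any, Iterable


def collect_stream_text(chunks: Iterable[Any]) -> str:
    """Join the text deltas of a streamed completion into one string.

    Assumes OpenAI-style chunks; chunks without choices or without
    text content (for example the final chunk) are skipped.
    """
    parts = []
    for chunk in chunks:
        if not chunk.choices:
            continue
        content = getattr(chunk.choices[0].delta, "content", None)
        if content:
            parts.append(content)
    return "".join(parts)
```

With the streaming call above, this would be used as collect_stream_text(response) in place of the print loop.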

Function Calling

Hyperbolic Function Calling

import os
import litellm
from litellm import completion

os.environ["HYPERBOLIC_API_KEY"] = ""  # your Hyperbolic API key

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather in a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"]
                    }
                },
                "required": ["location"]
            }
        }
    }
]

response = completion(
    model="hyperbolic/deepseek-ai/DeepSeek-V3",
    messages=[{"role": "user", "content": "What's the weather like in New York?"}],
    tools=tools,
    tool_choice="auto"
)

print(response)
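
The response only describes which tool the model wants to call; running it is up to the caller. The sketch below assumes the OpenAI-style layout for tool calls (message.tool_calls, each with id, function.name, and a JSON string in function.arguments). The local get_weather implementation and the run_tool_calls helper are hypothetical placeholders.

```python
import json
from typing import Any, Callable, Dict, List


def get_weather(location: str, unit: str = "celsius") -> str:
    """Placeholder implementation; a real one would query a weather service."""
    return f"No live data available for {location} ({unit})"


# Maps tool names declared in `tools` to local Python callables.
LOCAL_TOOLS: Dict[str, Callable[..., str]] = {"get_weather": get_weather}


def run_tool_calls(message: Any) -> List[dict]:
    """Execute each requested tool call and build the tool-role reply messages."""
    results = []
    for call in getattr(message, "tool_calls", None) or []:
        func = LOCAL_TOOLS.get(call.function.name)
        if func is None:
            continue  # the model asked for a tool we do not implement
        args = json.loads(call.function.arguments or "{}")
        results.append(
            {"role": "tool", "tool_call_id": call.id, "content": func(**args)}
        )
    return results
```

With the call above, run_tool_calls(response.choices[0].message) would return the messages to append before asking the model for its final answer.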

Usage - LiteLLM Proxy

Add the following to your LiteLLM Proxy configuration file:

config.yaml

model_list:
  - model_name: deepseek-fast
    litellm_params:
      model: hyperbolic/deepseek-ai/DeepSeek-V3
      api_key: os.environ/HYPERBOLIC_API_KEY

  - model_name: qwen-coder
    litellm_params:
      model: hyperbolic/Qwen/Qwen2.5-Coder-32B-Instruct
      api_key: os.environ/HYPERBOLIC_API_KEY

  - model_name: deepseek-reasoning
    litellm_params:
      model: hyperbolic/deepseek-ai/DeepSeek-R1
      api_key: os.environ/HYPERBOLIC_API_KEY

Start your LiteLLM Proxy server:

Start LiteLLM Proxy

litellm --config config.yaml

# RUNNING on http://0.0.0.0:4000

OpenAI SDK

Hyperbolic via Proxy - Non-streaming

from openai import OpenAI

# Initialize client with your proxy URL
client = OpenAI(
    base_url="http://localhost:4000",  # Your proxy URL
    api_key="your-proxy-api-key"       # Your proxy API key
)

# Non-streaming response
response = client.chat.completions.create(
    model="deepseek-fast",
    messages=[{"role": "user", "content": "Explain quantum computing in simple terms"}]
)

print(response.choices[0].message.content)

Hyperbolic via Proxy - Streaming

from openai import OpenAI

# Initialize client with your proxy URL
client = OpenAI(
    base_url="http://localhost:4000",  # Your proxy URL
    api_key="your-proxy-api-key"       # Your proxy API key
)

# Streaming response
response = client.chat.completions.create(
    model="qwen-coder",
    messages=[{"role": "user", "content": "Write a Python function to sort a list"}],
    stream=True
)

for chunk in response:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="")
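
The proxy can also be called without any SDK. The sketch below builds a plain HTTP request with the Python standard library, assuming the proxy serves the OpenAI-compatible /chat/completions route at the URL used above and accepts the key as a Bearer token. build_proxy_request is a hypothetical helper, and the send step is left commented out because it needs a running proxy.

```python
import json
import urllib.request


def build_proxy_request(
    model: str,
    prompt: str,
    base_url: str = "http://localhost:4000",  # your proxy URL
    api_key: str = "your-proxy-api-key",      # your proxy API key
) -> urllib.request.Request:
    """Build a POST request for the proxy's chat completions route."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_proxy_request("deepseek-fast", "Explain quantum computing in simple terms")

# Sending requires a running proxy:
# with urllib.request.urlopen(request) as resp:
#     body = json.load(resp)
#     print(body["choices"][0]["message"]["content"])
```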

For more detailed information on using the LiteLLM Proxy, see the LiteLLM Proxy documentation.

Supported OpenAI Parameters

Hyperbolic supports the following OpenAI-compatible parameters:

Parameter	Type	Description
messages	array	Required. Array of message objects with 'role' and 'content'
model	string	Required. Model ID (e.g., deepseek-ai/DeepSeek-V3, Qwen/Qwen2.5-72B-Instruct)
stream	boolean	Optional. Enable streaming responses
temperature	float	Optional. Sampling temperature (0.0 to 2.0)
top_p	float	Optional. Nucleus sampling parameter
max_tokens	integer	Optional. Maximum tokens to generate
frequency_penalty	float	Optional. Penalize frequent tokens
presence_penalty	float	Optional. Penalize tokens based on presence
stop	string/array	Optional. Stop sequences
n	integer	Optional. Number of completions to generate
tools	array	Optional. List of available tools/functions
tool_choice	string/object	Optional. Control tool/function calling
response_format	object	Optional. Response format specification
seed	integer	Optional. Random seed for reproducibility
user	string	Optional. User identifier
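
As an illustration of the table, the sketch below assembles keyword arguments for a completion call and rejects anything not listed above. The helper build_completion_kwargs is hypothetical; the only range it checks is the 0.0 to 2.0 temperature range stated in the table.

```python
from typing import Any, Dict, List

# Optional parameters listed in the table above.
SUPPORTED_OPTIONAL_PARAMS = {
    "stream", "temperature", "top_p", "max_tokens", "frequency_penalty",
    "presence_penalty", "stop", "n", "tools", "tool_choice",
    "response_format", "seed", "user",
}


def build_completion_kwargs(
    model: str, messages: List[dict], **optional: Any
) -> Dict[str, Any]:
    """Validate optional parameters and return kwargs for a completion call."""
    unsupported = set(optional) - SUPPORTED_OPTIONAL_PARAMS
    if unsupported:
        raise ValueError(f"Unsupported parameters: {sorted(unsupported)}")
    temperature = optional.get("temperature")
    if temperature is not None and not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0.0 and 2.0")
    return {"model": model, "messages": messages, **optional}


kwargs = build_completion_kwargs(
    "hyperbolic/deepseek-ai/DeepSeek-V3",
    [{"role": "user", "content": "Hello"}],
    temperature=0.2,
    max_tokens=256,
    seed=42,
)
print(sorted(kwargs))
```

The resulting dictionary can be passed on as completion(**kwargs).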

Advanced Usage

Custom API Base

If you're using a custom Hyperbolic deployment:

Custom API Base

import litellm

response = litellm.completion(
    model="hyperbolic/deepseek-ai/DeepSeek-V3",
    messages=[{"role": "user", "content": "Hello"}],
    api_base="https://your-custom-hyperbolic-endpoint.com/v1",
    api_key="your-api-key"
)

Rate Limits

Hyperbolic applies rate limits according to account tier:

Basic: 60 requests per minute (RPM)
Pro: 600 RPM
Enterprise: Custom limits
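
To stay under a per-minute limit on the client side, requests can be paced with a sliding window. The sketch below is one possible approach, not something LiteLLM or Hyperbolic provides; the class name MinuteRateLimiter is hypothetical, and the default of 60 requests per 60 seconds mirrors the Basic tier above.

```python
import time
from collections import deque
from typing import Callable, Deque


class MinuteRateLimiter:
    """Sliding-window limiter: at most max_requests per window_seconds."""

    def __init__(
        self,
        max_requests: int = 60,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock
        self._sent: Deque[float] = deque()  # timestamps of recent requests

    def wait_time(self) -> float:
        """Seconds to wait before the next request is allowed (0.0 if none)."""
        now = self._clock()
        # Drop timestamps that have left the window.
        while self._sent and now - self._sent[0] >= self._window:
            self._sent.popleft()
        if len(self._sent) < self._max:
            return 0.0
        return self._window - (now - self._sent[0])

    def record(self) -> None:
        """Record that a request was sent now."""
        self._sent.append(self._clock())
```

A caller would sleep for limiter.wait_time() seconds, send the request, and then call limiter.record().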

Pricing

Hyperbolic offers competitive pay-as-you-go pricing with no hidden fees or long-term commitments. See the model table above for specific pricing per million tokens.

Precision Options

BF16: Best precision and performance, suitable for tasks where accuracy is critical
FP8: Optimized for efficiency and speed, ideal for high-throughput applications at lower cost
