Contact Us | Together AI

What would you like to do:

Contact Sales: For inquiries about our products and solutions, connect with the Sales Team.

Help Center: Check out our constantly expanding knowledge base and get expert help from our Support Team.

© 2025 Together.ai, San Francisco, CA 94114