Contact Us | Together AI

What would you like to do:

Contact Sales: For inquiries about our products and solutions, connect with the Sales Team.

Help Center: Check out our constantly expanding knowledge base and get expert help from our Support Team.

© 2025 San Francisco, CA 94114

Together.ai