Contact Us | Together AI

What would you like to do:

Contact Sales: For inquiries about our products and solutions, connect with the Sales Team.

Help Center: Check out our constantly expanding knowledge base and get expert help from our Support Team.
Subscribe to newsletter
© 2025 Together AI, San Francisco, CA 94114