Novita AI – Model Libraries & GPU Cloud - Deploy, Scale & Innovate

Large Language Models

Browse our supported open-source models and deploy them on dedicated endpoints.

New

GLM 5.2
$1.4/Mt Input
$0.26/Mt Cache Read
$4.4/Mt Output
1048576 Context
131072 Max Output
LLM, Serverless

New

MoonshotAI

Kimi K2.7 Code
$0.95/Mt Input
$0.19/Mt Cache Read
$4/Mt Output
262144 Context
262144 Max Output
LLM, Serverless

New

MiniMax-M3
$0.3/Mt Input
$0.06/Mt Cache Read
$1.2/Mt Output
1000000 Context
131072 Max Output
LLM, Serverless

New

Deepseek V4 Pro
$1.6/Mt Input
$0.135/Mt Cache Read
$3.2/Mt Output
1048576 Context
393216 Max Output
LLM, Serverless

New

Deepseek V4 Flash
$0.14/Mt Input
$0.028/Mt Cache Read
$0.28/Mt Output
1048576 Context
393216 Max Output
LLM, Serverless

Hot

Deepseek V3.2
$0.269/Mt Input
$0.1345/Mt Cache Read
$0.4/Mt Output
163840 Context
65536 Max Output
LLM, Serverless

New

Step-3.7-Flash
$0.2/Mt Input
$0.04/Mt Cache Read
$1.15/Mt Output
262144 Context
256000 Max Output
LLM, Serverless

New

Nemotron 3 Nano 30B A3B
$0.05/Mt Input
$0.2/Mt Output
262144 Context
32768 Max Output
LLM, Serverless

New

Wenxin

CoBuddy
$0.28/Mt Input
$0.07/Mt Cache Read
$1.13/Mt Output
131072 Context
65536 Max Output
LLM, Serverless

New

XiaomiMiMo/MiMo-V2.5
$0.168/Mt Input
$0.0034/Mt Cache Read
$0.336/Mt Output
1048576 Context
131072 Max Output
LLM, Serverless

New, LIMITED TIME 50% OFF

Qwen3.7-Max
$1.25/Mt Input
$0.25/Mt Cache Read
$3.75/Mt Output
1000000 Context
65536 Max Output
LLM, Serverless

New

XiaomiMiMo/MiMo-V2.5-Pro
$0.522/Mt Input
$0.0043/Mt Cache Read
$1.044/Mt Output
1048576 Context
131072 Max Output
LLM, Serverless

New

Qwen3.6-27B
$0.6/Mt Input
$3.6/Mt Output
262144 Context
65536 Max Output
LLM, Serverless

New

MoonshotAI

Kimi K2.6
$0.8/Mt Input
$0.16/Mt Cache Read
$3.4/Mt Output
262144 Context
262144 Max Output
LLM, Serverless

New

GLM-5.1
$1.38/Mt Input
$0.26/Mt Cache Read
$4.4/Mt Output
204800 Context
131072 Max Output
LLM, Serverless

Gemma 4 26B A4B
$0.13/Mt Input
$0.4/Mt Output
262144 Context
131072 Max Output
LLM, Serverless

Gemma 4 31B
$0.14/Mt Input
$0.4/Mt Output
262144 Context
131072 Max Output
LLM, Serverless

New

MiniMax M2.7
$0.3/Mt Input
$0.06/Mt Cache Read
$1.2/Mt Output
204800 Context
131072 Max Output
LLM, Serverless

MiniMax M2.5-highspeed
$0.6/Mt Input
$0.03/Mt Cache Read
$2.4/Mt Output
204800 Context
131100 Max Output
LLM, Serverless

Qwen3.5-27B
$0.3/Mt Input
$2.4/Mt Output
262144 Context
65536 Max Output
LLM, Serverless

Qwen3.5-122B-A10B
$0.4/Mt Input
$3.2/Mt Output
262144 Context
65536 Max Output
LLM, Serverless

Qwen3.5-35B-A3B
$0.25/Mt Input
$2/Mt Output
262144 Context
65536 Max Output
LLM, Serverless

Qwen3.5-397B-A17B
$0.6/Mt Input
$3.6/Mt Output
262144 Context
65536 Max Output
LLM, Serverless

MiniMax M2.5
$0.3/Mt Input
$0.03/Mt Cache Read
$1.2/Mt Output
204800 Context
131100 Max Output
LLM, Serverless

GLM-5
$1/Mt Input
$0.2/Mt Cache Read
$3.2/Mt Output
202800 Context
131072 Max Output
LLM, Serverless

Qwen3 Coder Next
$0.2/Mt Input
$1.5/Mt Output
262144 Context
65536 Max Output
LLM, Serverless

DeepSeek-OCR 2
$0.03/Mt Input
$0.03/Mt Output
8192 Context
8192 Max Output
LLM, Serverless

MoonshotAI

Kimi K2.5
$0.6/Mt Input
$0.1/Mt Cache Read
$3/Mt Output
262144 Context
262144 Max Output
LLM, Serverless

GLM-4.7-Flash
$0.07/Mt Input
$0.01/Mt Cache Read
$0.4/Mt Output
200000 Context
128000 Max Output
LLM, Serverless

Minimax M2.1
$0.3/Mt Input
$0.03/Mt Cache Read
$1.2/Mt Output
204800 Context
131072 Max Output
LLM, Serverless

GLM-4.7
$0.6/Mt Input
$0.11/Mt Cache Read
$2.2/Mt Output
204800 Context
131072 Max Output
LLM, Serverless

AutoGLM-Phone-9B-Multilingual
$0.035/Mt Input
$0.138/Mt Output
65536 Context
65536 Max Output
LLM, Serverless

MoonshotAI

Kimi K2 Thinking
$0.6/Mt Input
$0.15/Mt Cache Read
$2.5/Mt Output
262144 Context
262144 Max Output
LLM, Serverless

MiniMax-M2
$0.3/Mt Input
$0.03/Mt Cache Read
$1.2/Mt Output
204800 Context
131072 Max Output
LLM, Serverless

PaddleOCR-VL
$0.02/Mt Input
$0.02/Mt Output
16384 Context
16384 Max Output
LLM

Deepseek V3.2 Exp
$0.27/Mt Input
$0.41/Mt Output
163840 Context
65536 Max Output
LLM, Serverless

Qwen3 VL 235B A22B Thinking
$0.98/Mt Input
$3.95/Mt Output
131072 Context
32768 Max Output
LLM, Serverless

GLM 4.6V
$0.3/Mt Input
$0.055/Mt Cache Read
$0.9/Mt Output
131072 Context
32768 Max Output
LLM, Serverless

GLM 4.6
$0.55/Mt Input
$0.11/Mt Cache Read
$2.2/Mt Output
204800 Context
131072 Max Output
LLM, Serverless

New

Qwen3.6-35B-A3B
$0.248/Mt Input
$1.485/Mt Output
262144 Context
65536 Max Output
LLM, Serverless

Kat Coder Pro
$0.3/Mt Input
$0.06/Mt Cache Read
$1.2/Mt Output
256000 Context
128000 Max Output
LLM, Serverless

Qwen3 Next 80B A3B Instruct
$0.15/Mt Input
$1.5/Mt Output
131072 Context
32768 Max Output
LLM, Serverless

Qwen3 Next 80B A3B Thinking
$0.15/Mt Input
$1.5/Mt Output
131072 Context
32768 Max Output
LLM, Serverless

DeepSeek-OCR
$0.03/Mt Input
$0.03/Mt Output
8192 Context
8192 Max Output
LLM

Deepseek V3.1 Terminus
$0.27/Mt Input
$0.135/Mt Cache Read
$1/Mt Output
131072 Context
32768 Max Output
LLM, Serverless

Qwen3 VL 235B A22B Instruct
$0.3/Mt Input
$1.5/Mt Output
131072 Context
32768 Max Output
LLM, Serverless

Qwen3 Max
$2.11/Mt Input
$8.45/Mt Output
262144 Context
65536 Max Output
LLM, Serverless

New, Limited Time Free

Nex-N2-Pro
$0/Mt Input
$0/Mt Output
262144 Context
262144 Max Output
LLM, Serverless

DeepSeek V3.1
$0.27/Mt Input
$0.135/Mt Cache Read
$1/Mt Output
131072 Context
32768 Max Output
LLM, Serverless

MoonshotAI

Kimi K2 0905
$0.6/Mt Input
$2.5/Mt Output
262144 Context
262144 Max Output
LLM, Serverless

Qwen3 Coder 480B A35B Instruct
$0.38/Mt Input
$1.55/Mt Output
262144 Context
65536 Max Output
LLM, Serverless

Qwen3 Coder 30b A3B Instruct
$0.07/Mt Input
$0.27/Mt Output
160000 Context
32768 Max Output
LLM, Serverless

OpenAI

OpenAI GPT OSS 120B
$0.05/Mt Input
$0.25/Mt Output
131072 Context
32768 Max Output
LLM, Serverless

MoonshotAI

Kimi K2 Instruct
$0.57/Mt Input
$2.3/Mt Output
131072 Context
32768 Max Output
LLM, Serverless

Hot

DeepSeek V3 0324
$0.27/Mt Input
$0.135/Mt Cache Read
$1.12/Mt Output
163840 Context
65536 Max Output
LLM, Serverless

GLM-4.5
$0.6/Mt Input
$0.11/Mt Cache Read
$2.2/Mt Output
131072 Context
98304 Max Output
LLM, Serverless

Qwen3 235B A22b Thinking 2507
$0.3/Mt Input
$3/Mt Output
131072 Context
32768 Max Output
LLM, Serverless

Llama 3.1 8B Instruct
$0.02/Mt Input
$0.05/Mt Output
16384 Context
16384 Max Output
LLM, Serverless

Gemma3 12B
$0.05/Mt Input
$0.1/Mt Output
131072 Context
8192 Max Output
LLM

GLM 4.5V
$0.6/Mt Input
$0.11/Mt Cache Read
$1.8/Mt Output
65536 Context
16384 Max Output
LLM, Serverless

OpenAI

OpenAI: GPT OSS 20B
$0.04/Mt Input
$0.15/Mt Output
131072 Context
32768 Max Output
LLM, Serverless

Qwen3 235B A22B Instruct 2507
$0.09/Mt Input
$0.58/Mt Output
131072 Context
16384 Max Output
LLM, Serverless

DeepSeek R1 Distill Qwen 14B
$0.15/Mt Input
$0.15/Mt Output
32768 Context
16384 Max Output
LLM

Llama 3.3 70B Instruct
$0.135/Mt Input
$0.4/Mt Output
131072 Context
120000 Max Output
LLM, Serverless

Qwen 2.5 72B Instruct
$0.38/Mt Input
$0.4/Mt Output
32000 Context
8192 Max Output
LLM, Serverless

Mistral

Mistral Nemo
$0.04/Mt Input
$0.17/Mt Output
60288 Context
16000 Max Output
LLM, Serverless

MiniMax M1
$0.55/Mt Input
$2.2/Mt Output
1000000 Context
40000 Max Output
LLM, Serverless

DeepSeek R1 0528
$0.7/Mt Input
$0.35/Mt Cache Read
$2.5/Mt Output
163840 Context
32768 Max Output
LLM, Serverless

DeepSeek R1 Distill Qwen 32B
$0.3/Mt Input
$0.3/Mt Output
64000 Context
32000 Max Output
LLM

Llama 3 8B Instruct
$0.04/Mt Input
$0.04/Mt Output
8192 Context
8192 Max Output
LLM, Serverless

Azure

Wizardlm 2 8x22B
$0.62/Mt Input
$0.62/Mt Output
65535 Context
8000 Max Output
LLM, Serverless

Dedicated

DeepSeek R1 0528 Qwen3 8B
$0.06/Mt Input
$0.09/Mt Output
128000 Context
32000 Max Output
LLM

DeepSeek R1 Distill LLama 70B
$0.8/Mt Input
$0.8/Mt Output
8192 Context
8192 Max Output
LLM, Serverless

Llama3 70B Instruct
$0.51/Mt Input
$0.74/Mt Output
8192 Context
8000 Max Output
LLM, Serverless

Qwen3 235B A22B
$0.2/Mt Input
$0.8/Mt Output
40960 Context
20000 Max Output
LLM, Serverless

Llama 4 Maverick Instruct
$0.27/Mt Input
$0.85/Mt Output
1048576 Context
8192 Max Output
LLM, Serverless

Dedicated

Llama 4 Scout Instruct
$0.18/Mt Input
$0.59/Mt Output
131072 Context
131072 Max Output
LLM, Serverless

Hermes 2 Pro Llama 3 8B
$0.14/Mt Input
$0.14/Mt Output
8192 Context
8192 Max Output
LLM

L3 70B Euryale V2.1
$1.48/Mt Input
$1.48/Mt Output
8192 Context
8192 Max Output
LLM

Sao10k L3 8B Lunaris
$0.05/Mt Input
$0.05/Mt Output
8192 Context
8192 Max Output
LLM, Serverless

Baichuan

BaiChuan M2 32B
$0.07/Mt Input
$0.07/Mt Output
131072 Context
131072 Max Output
LLM

Wenxin

ERNIE 4.5 VL 424B A47B
$0.42/Mt Input
$1.25/Mt Output
123000 Context
16000 Max Output
LLM, Serverless

Deepseek Prover V2 671B
$0.7/Mt Input
$2.5/Mt Output
160000 Context
160000 Max Output
LLM, Serverless

Qwen3 32B
$0.1/Mt Input
$0.45/Mt Output
40960 Context
20000 Max Output
LLM

Gemma 3 27B
$0.119/Mt Input
$0.2/Mt Output
98304 Context
16384 Max Output
LLM, Serverless

DeepSeek V3 (Turbo)
$0.4/Mt Input
$1.3/Mt Output
64000 Context
16000 Max Output
LLM, Serverless

DeepSeek R1 (Turbo)
$0.7/Mt Input
$2.5/Mt Output
64000 Context
16000 Max Output
LLM, Serverless

L3 8B Stheno V3.2
$0.05/Mt Input
$0.05/Mt Output
8192 Context
32000 Max Output
LLM, Serverless

Mythomax L2 13B
$0.09/Mt Input
$0.09/Mt Output
4096 Context
3200 Max Output
LLM

Ring-2.6-1T
$0.3/Mt Input
$0.06/Mt Cache Read
$2.5/Mt Output
262144 Context
65536 Max Output
LLM, Serverless

Ling-2.6-flash
$0.1/Mt Input
$0.02/Mt Cache Read
$0.3/Mt Output
262144 Context
32768 Max Output
LLM, Serverless

Ling-2.6-1T
$0.3/Mt Input
$0.06/Mt Cache Read
$2.5/Mt Output
262144 Context
32768 Max Output
LLM, Serverless

qwen/qwen3-vl-8b-instruct
$0.08/Mt Input
$0.5/Mt Output
131072 Context
32768 Max Output
LLM, Serverless

zai-org/glm-4.5-air
$0.13/Mt Input
$0.025/Mt Cache Read
$0.85/Mt Output
131072 Context
98304 Max Output
LLM, Serverless

qwen/qwen3-vl-30b-a3b-instruct
$0.2/Mt Input
$0.7/Mt Output
131072 Context
32768 Max Output
LLM, Serverless

qwen/qwen3-vl-30b-a3b-thinking
$0.2/Mt Input
$1/Mt Output
131072 Context
32768 Max Output
LLM, Serverless

Qwen3 Omni 30B A3B Thinking
$0.25/Mt Input
$0.97/Mt Output
65536 Context
16384 Max Output
LLM, Serverless

Qwen3 Omni 30B A3B Instruct
$0.25/Mt Input
$0.97/Mt Output
65536 Context
16384 Max Output
LLM, Serverless

Qwen MT Plus
$0.25/Mt Input
$0.75/Mt Output
16384 Context
8192 Max Output
LLM, Serverless

Wenxin

ERNIE 4.5 VL 28B A3B
$0.14/Mt Input
$0.56/Mt Output
30000 Context
8000 Max Output
LLM, Serverless

Wenxin

ERNIE 4.5 21B A3B
$0.07/Mt Input
$0.28/Mt Output
120000 Context
8000 Max Output
LLM, Serverless

Dedicated

Qwen3 8B
$0.035/Mt Input
$0.138/Mt Output
128000 Context
20000 Max Output
LLM

Llama 3.2 3B Instruct
$0.03/Mt Input
$0.05/Mt Output
32768 Context
32000 Max Output
LLM

L31 70B Euryale V2.2
$1.48/Mt Input
$1.48/Mt Output
8192 Context
8192 Max Output
LLM, Serverless