Choosing Your Model - Factory Documentation

Balance accuracy, speed, and cost by picking the right model and reasoning level for each task. Every model we offer meets proprietary quality and cost-efficiency requirements. Model quality evolves quickly, and we tune the CLI defaults as the ecosystem shifts. Use this guide as a snapshot of how the major options compare today, and expect to revisit it as we publish updates. This guide was last updated on Wednesday, June 3rd 2026.


1 · Current stack rank (June 2026)

| Rank | Model | Why we reach for it |
| --- | --- | --- |
| 1 | Claude Fable 5*† | First publicly released Mythos model; top pick for multi-step autonomous engineering work. 4× multiplier. |
| 2 | Claude Opus 4.8 | Newest Anthropic flagship with Max reasoning; top pick for the hardest work. 2× multiplier. |
| 3 | Claude Opus 4.8 Fast | Opus 4.8 tuned for faster response times; 4× multiplier. |
| 4 | Claude Opus 4.7 | Anthropic flagship with Max reasoning; top pick for the hardest work. 2× multiplier. |
| 5 | Claude Opus 4.7 Fast | Opus 4.7 tuned for faster response times; 12× multiplier. |
| 6 | Claude Opus 4.6 | Previous Anthropic flagship with Max reasoning; still excellent depth and safety for complex work. |
| 7 | Claude Opus 4.6 Fast | Opus 4.6 tuned for faster response times; 12× multiplier. |
| 8 | Claude Opus 4.5 | Proven quality-and-safety balance; strong default for TUI and exec. |
| 9 | Claude Sonnet 4.6 | Max reasoning at the Sonnet price point (1.2×); strong daily driver for planning and implementation. |
| 10 | GPT-5.4 | Latest OpenAI model with 922K context, 128K output, verbosity support, and Extra High reasoning; excellent for large-context tasks. |
| 11 | Claude Sonnet 4.5 | Strong daily driver with balanced cost/quality; great general-purpose choice when you don’t need Opus-level depth. |
| 12 | GPT-5.3-Codex | Newest OpenAI coding model with Extra High reasoning and verbosity support; strong for implementation-heavy tasks. |
| 13 | GPT-5.2-Codex | Proven OpenAI coding model with Extra High reasoning; solid for implementation-heavy tasks. |
| 14 | GPT-5.2 | OpenAI model with verbosity support and reasoning up to Extra High. |
| 15 | Claude Haiku 4.5 | Fast, cost-efficient for routine tasks and high-volume automation. |
| 16 | Gemini 3.1 Pro | Newer Gemini Pro generation with strong structured outputs and mixed reasoning controls for research-heavy tasks. |
| 17 | Gemini 3 Flash | Fast, cheap (0.2× multiplier) with full reasoning support; great for high-volume tasks where speed matters. |
| 18 | Droid Core (NVIDIA Nemotron 3 Ultra) | Open-source, 0.4× multiplier, fast and cost-efficient model for high-frequency automations; no image support. |
| 19 | Droid Core (MiniMax M3) | Open-source, 0.12× multiplier with multimodal image support and strong agentic capabilities; dependable workhorse for everyday tasks with extended context. |
| 20 | Droid Core (MiniMax M2.7) | Open-source, 0.12× multiplier with reasoning support (Low/Medium/High) and image support; cheapest model available. |
| 21 | Droid Core (GLM-5.1) | Open-source, 0.55× multiplier, newer GLM option for bulk automation and air-gapped environments; no image support. |
| 22 | Droid Core (GLM-5) | Open-source, 0.4× multiplier, stable choice for bulk automation and air-gapped environments; no image support. |
| 23 | Droid Core (Kimi K2.7 Code) | Coding-specialized Kimi at 0.4× with optional High reasoning; strong agentic and tool use for cost-sensitive dev work. |
| 24 | Droid Core (Kimi K2.6) | Open-source, 0.4× multiplier with image support and optional High reasoning; good for cost-sensitive work when you still want a thinking toggle. |
| 25 | Droid Core (Kimi K2.5) | Open-source, 0.25× multiplier with image support; older Kimi option for cost-sensitive work. |

* Anthropic requires that all Mythos-class models comply with 30-day data retention for trust and safety; see more information here.
† As of June 12, 2026, Claude Fable 5 is not available; see Anthropic’s update on Mythos access here.
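To compare the multipliers in the table, a small worked example can help. The sketch below assumes that a multiplier simply scales raw token usage into weighted usage; this page does not spell out the billing formula, so treat that as an assumption. The names `MULTIPLIERS`, `weighted_usage`, and `relative_cost` are hypothetical and not part of any Factory API.

```python
# Multipliers copied from the stack-rank table above. Models whose row does
# not state a multiplier are left out rather than guessed.
MULTIPLIERS = {
    "Claude Opus 4.8": 2.0,
    "Claude Opus 4.7 Fast": 12.0,
    "Claude Sonnet 4.6": 1.2,
    "Gemini 3 Flash": 0.2,
    "Droid Core (MiniMax M2.7)": 0.12,
}


def weighted_usage(model: str, raw_tokens: int) -> float:
    """ASSUMPTION: weighted usage = raw tokens x the model's multiplier."""
    if model not in MULTIPLIERS:
        raise KeyError(f"no multiplier recorded for {model!r}")
    return raw_tokens * MULTIPLIERS[model]


def relative_cost(model_a: str, model_b: str) -> float:
    """How many times more model_a costs than model_b for the same tokens."""
    return MULTIPLIERS[model_a] / MULTIPLIERS[model_b]


if __name__ == "__main__":
    for name in MULTIPLIERS:
        print(f"{name}: {weighted_usage(name, 1000):.0f} weighted tokens per 1,000 raw")
```

Under that assumption, 1,000 raw tokens on Claude Opus 4.8 would count as 2,000 weighted tokens, and Claude Opus 4.7 Fast would cost six times as much as Claude Opus 4.8 for the same work.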


2 · Match the model to the job

| Scenario | Recommended model |
| --- | --- |
| Deep planning, architecture reviews, ambiguous product specs | Start with Opus 4.7 for best depth and safety (1× promotional multiplier through April 30), or fall back to Opus 4.6 / Opus 4.6 Fast for faster turnaround. Use Sonnet 4.6 or Sonnet 4.5 when you want balanced cost/quality, or GPT-5.4 for large-context reasoning. |
| Full-feature development, large refactors | Opus 4.7 or Opus 4.6 for depth and safety. GPT-5.4, GPT-5.3-Codex, or GPT-5.2-Codex when you need speed plus Extra High reasoning; Sonnet 4.6 or Sonnet 4.5 for balanced loops. |
| Repeatable edits, summarization, boilerplate generation | Haiku 4.5 or Droid Core (including MiniMax M2.7 at 0.12×) for speed and cost. GPT-5.2 when you need higher quality or structured outputs. |
| CI/CD or automation loops | Favor Haiku 4.5 or Droid Core for predictable, low-cost throughput. Use GPT-5.3-Codex or GPT-5.4 when automation needs stronger reasoning. |
| High-volume automation, frequent quick turns | Haiku 4.5 for speedy feedback. Droid Core (especially MiniMax M2.7 at 0.12× with reasoning) when cost is critical or you need air-gapped deployment. |

Tip: You can swap models mid-session with /model or by toggling in the settings panel (Shift+Tab → Settings).
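If you want the scenario guidance above in a form a script can consult, one option is a plain lookup table. This is only a sketch: the scenario keys (`planning`, `feature_dev`, and so on) and the `recommend` helper are hypothetical names chosen for illustration, and the candidate order follows the order in which each row mentions the models.

```python
from typing import Iterable, Optional

# Candidates per scenario, in the order the guidance above mentions them.
# The scenario keys are hypothetical shorthand for the table rows.
RECOMMENDATIONS = {
    "planning": ["Opus 4.7", "Opus 4.6", "Opus 4.6 Fast",
                 "Sonnet 4.6", "Sonnet 4.5", "GPT-5.4"],
    "feature_dev": ["Opus 4.7", "Opus 4.6", "GPT-5.4", "GPT-5.3-Codex",
                    "GPT-5.2-Codex", "Sonnet 4.6", "Sonnet 4.5"],
    "boilerplate": ["Haiku 4.5", "Droid Core", "GPT-5.2"],
    "ci_cd": ["Haiku 4.5", "Droid Core", "GPT-5.3-Codex", "GPT-5.4"],
    "high_volume": ["Haiku 4.5", "Droid Core"],
}


def recommend(scenario: str, available: Optional[Iterable[str]] = None) -> Optional[str]:
    """Return the first listed candidate that is available, or None."""
    candidates = RECOMMENDATIONS[scenario]
    if available is None:
        return candidates[0]
    allowed = set(available)
    for model in candidates:
        if model in allowed:
            return model
    return None


if __name__ == "__main__":
    print(recommend("planning"))
    print(recommend("ci_cd", available=["GPT-5.4", "Droid Core"]))
```

Passing `available` lets the same table respect whatever subset of models your organization has enabled; when nothing matches, the helper returns `None` instead of guessing.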


3 · Switching models mid-session


4 · Reasoning effort settings

Reasoning effort increases latency and cost—start low for simple work and escalate as needed. Max is available on Claude Opus 4.7, the Opus 4.6 family (Opus 4.6 and Opus 4.6 Fast), and Sonnet 4.6. Extra High is available on GPT-5.4, GPT-5.2, GPT-5.2-Codex, and GPT-5.3-Codex.
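The "start low and escalate" advice can be sketched as a small effort ladder. The model sets below come straight from the paragraph above; the base ladder of Low, Medium, and High is an assumption made for illustration (some models expose different toggles), and `effort_ladder` and `escalate` are hypothetical helper names.

```python
# Models named above as supporting the Max and Extra High levels.
MAX_MODELS = frozenset({
    "Claude Opus 4.7", "Claude Opus 4.6", "Claude Opus 4.6 Fast",
    "Claude Sonnet 4.6",
})
EXTRA_HIGH_MODELS = frozenset({
    "GPT-5.4", "GPT-5.2", "GPT-5.2-Codex", "GPT-5.3-Codex",
})

# ASSUMPTION: every model shares this base ladder. Check the model picker
# for the levels a given model really exposes.
BASE_LADDER = ("Low", "Medium", "High")


def effort_ladder(model: str) -> list:
    """Return the assumed effort levels for a model, lowest first."""
    ladder = list(BASE_LADDER)
    if model in MAX_MODELS:
        ladder.append("Max")
    elif model in EXTRA_HIGH_MODELS:
        ladder.append("Extra High")
    return ladder


def escalate(model: str, current: str) -> str:
    """Step up one level, staying put once the top of the ladder is reached."""
    ladder = effort_ladder(model)
    index = ladder.index(current)  # raises ValueError for an unknown level
    return ladder[min(index + 1, len(ladder) - 1)]


if __name__ == "__main__":
    level = "Low"
    for _ in range(4):
        print(level)
        level = escalate("GPT-5.4", level)
```

Stepping one level at a time keeps latency and cost increases gradual, which matches the advice to escalate only as needed.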


5 · Bring Your Own Keys (BYOK)

Factory ships with managed Anthropic and OpenAI access. If you prefer to run against your own accounts, BYOK is opt-in—see Bring Your Own Keys for setup steps, supported providers, and billing notes.

Open-source models

Droid Core (GLM-5), Droid Core (GLM-5.1), Droid Core (Kimi K2.6), Droid Core (Kimi K2.5), and Droid Core (MiniMax M2.7) are open-source alternatives available in the CLI. They’re useful for:

Note: GLM-5 and GLM-5.1 do not support image attachments. Kimi K2.5, Kimi K2.6, and MiniMax M2.7 do support images. Kimi K2.6 adds an Off/High reasoning toggle, while MiniMax M2.7 (the cheapest model available, with 0.12× multiplier) supports Low/Medium/High reasoning. For image-based workflows, use Claude, GPT, Kimi, or MiniMax M2.7. To use open-source models, you’ll need to configure them via BYOK with a local inference server (like Ollama) or a hosted provider. See BYOK documentation for setup instructions.
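The capability notes above can be encoded as a small filter for picking an open-source model. This is a sketch only: it records just what the note states, so models whose reasoning levels are not mentioned are left out of `REASONING_LEVELS` rather than guessed. The names `SUPPORTS_IMAGES`, `REASONING_LEVELS`, and `candidates` are hypothetical.

```python
# Image support as stated in the note above.
SUPPORTS_IMAGES = {
    "GLM-5": False,
    "GLM-5.1": False,
    "Kimi K2.5": True,
    "Kimi K2.6": True,
    "MiniMax M2.7": True,
}

# Reasoning controls the note mentions explicitly. Absence from this table
# means "not stated in the note", not "unsupported".
REASONING_LEVELS = {
    "Kimi K2.6": ("Off", "High"),
    "MiniMax M2.7": ("Low", "Medium", "High"),
}


def candidates(need_images: bool = False, need_reasoning: bool = False) -> list:
    """List open-source models that satisfy the requested capabilities."""
    return [
        model
        for model, has_images in SUPPORTS_IMAGES.items()
        if (has_images or not need_images)
        and (model in REASONING_LEVELS or not need_reasoning)
    ]


if __name__ == "__main__":
    print(candidates(need_images=True))
    print(candidates(need_images=True, need_reasoning=True))
```

For example, asking for both image support and a documented reasoning toggle narrows the list to Kimi K2.6 and MiniMax M2.7.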


6 · Keep notes on what works