NVIDIA Vera CPU

Built for the World’s AI Factories

NVIDIA Vera is built for the CPU work behind agentic AI and reinforcement learning (RL), including code execution, tool use, sandboxing, analytics, data pipelines, and orchestration beyond the model. As both a host CPU for accelerated systems and a standalone CPU for AI factory workloads, Vera keeps GPUs fed, agents responsive, and training loops moving. With fast, energy-efficient NVIDIA Olympus cores and high-bandwidth LPDDR5X memory, Vera delivers up to 80 percent faster sandbox environment performance than traditional CPU infrastructure, helping AI factories generate more tokens per dollar.

NVIDIA Unveils Vera, the CPU for Agents

NVIDIA launched high-performance, energy-efficient NVIDIA Vera CPUs to drive diverse workloads across industries, including agentic AI, reinforcement learning, and data processing.

Vera Arrives: NVIDIA’s First CPU Built for Agents Lands at Top AI Labs

Ian Buck, NVIDIA VP of hyperscale and high-performance computing (HPC), hand-delivered the first NVIDIA Vera CPU systems to Anthropic, OpenAI, Oracle Cloud Infrastructure, and SpaceXAI, marking the moment agentic CPUs moved from announcement to production.

NVIDIA Vera CPU Rack

The NVIDIA Vera CPU Rack powers reinforcement learning and agentic AI at AI factory scale. Built on NVIDIA MGX™, it integrates up to 256 Vera CPUs to run over 22.5K concurrent environments.
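
The 22.5K figure is consistent with simple arithmetic on numbers quoted elsewhere on this page. The sketch below is a back-of-the-envelope check, not an NVIDIA sizing method: it assumes one sandbox environment per physical core, which the page does not state, and takes the 88-core and 176-thread counts from the Olympus core description further down. All names are illustrative.

```python
# Back-of-the-envelope rack capacity, using figures quoted on this page.
CPUS_PER_RACK = 256    # "up to 256 Vera CPUs"
CORES_PER_CPU = 88     # Olympus cores per Vera CPU
THREADS_PER_CPU = 176  # threads per Vera CPU with Spatial Multithreading

# Assumption (not stated by NVIDIA): one sandbox environment per physical core.
ENVS_PER_CORE = 1

total_cores = CPUS_PER_RACK * CORES_PER_CPU
total_threads = CPUS_PER_RACK * THREADS_PER_CPU
total_envs = total_cores * ENVS_PER_CORE

print(f"cores per rack:        {total_cores:,}")
print(f"threads per rack:      {total_threads:,}")
print(f"environments per rack: {total_envs:,}")
```

Under that one-environment-per-core assumption, a fully populated rack has 22,528 cores, which matches "over 22.5K concurrent environments".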

Industry-Leading Agentic CPU Performance

1.8x Faster Agentic Sandbox Performance

Agentic AI is bottlenecked by traditional CPUs. Across an agent's reasoning loop, the CPU compiles generated code, runs Python tool chains, and analyzes software code. NVIDIA Vera accelerates all three workloads by up to 1.8x over leading x86 CPUs, turbocharging the agentic inner loop to maximize AI factory output.
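
How much a 1.8x CPU speedup shortens a whole agent loop depends on how much of that loop is CPU-bound. A minimal sketch using Amdahl's law follows; the `cpu_fraction` values are hypothetical inputs chosen for illustration, not measured data, and the function names are not from any NVIDIA tool.

```python
def faster_percent(speedup):
    """Convert a speedup factor (e.g. 1.8x) into 'percent faster' (e.g. 80)."""
    return (speedup - 1.0) * 100.0


def time_saved_percent(speedup):
    """Percent reduction in wall-clock time for the accelerated portion."""
    return (1.0 - 1.0 / speedup) * 100.0


def loop_speedup(cpu_fraction, cpu_speedup=1.8):
    """Amdahl's law: overall speedup when only the CPU-bound fraction gets faster."""
    if not 0.0 <= cpu_fraction <= 1.0:
        raise ValueError("cpu_fraction must be between 0 and 1")
    return 1.0 / ((1.0 - cpu_fraction) + cpu_fraction / cpu_speedup)


if __name__ == "__main__":
    print(f"1.8x is {faster_percent(1.8):.0f}% faster, "
          f"or {time_saved_percent(1.8):.1f}% less time on the CPU portion")
    # Hypothetical shares of loop time spent in CPU-side sandbox work.
    for cpu_fraction in (0.25, 0.50, 0.75):
        print(f"CPU share {cpu_fraction:.0%}: "
              f"overall loop speedup {loop_speedup(cpu_fraction):.2f}x")
```

If half of the loop were CPU-bound, for example, the whole loop would speed up by about 1.29x rather than 1.8x, so the benefit grows with the share of time the agent spends in sandboxed execution.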

High-Bandwidth, Energy-Efficient Memory

Traditional DDR5 forces a tradeoff between bandwidth, efficiency, and serviceability. NVIDIA Vera pairs LPDDR5X memory with SOCAMM: detachable, field-replaceable modules that deliver low-power (LP) efficiency with server-class flexibility and upgradable capacity. The result is 2x the bandwidth and 3x the bandwidth per core of leading x86 CPUs with DDR5, unlocking greater AI factory output at hyperscale and enterprise scale.

Relative performance based on measured data and subject to change. NVIDIA Vera CPU with LPDDR5X performance baselined to the latest-generation x86 CPU.

Relative performance based on measured data and subject to change. NVIDIA Vera CPU with LPDDR5X performance baselined to the latest-generation x86 CPU with DDR5 across key CPU memory performance metrics.

Designed for AI Factories

Host CPU

NVIDIA Vera delivers system-level efficiency as the host CPU for AI factories, including NVIDIA Vera Rubin NVL72 and HGX™ Vera Rubin NVL8 platforms. Vera feeds GPUs for large-scale AI while running the CPU work that keeps the factory operating, including ETL, key-value (KV) cache management, and orchestration. With high single-threaded performance, massive memory bandwidth, and a single compute die design that avoids cross-chiplet latency, Vera delivers predictable performance while keeping GPUs fully utilized across accelerated AI and HPC systems.

Standalone CPU

For agentic AI, reinforcement learning, data processing, and analytics, NVIDIA Vera delivers leading per-core performance and massive memory bandwidth to run thousands of parallel sandbox environments, tool calls, code executions, evaluation loops, and data workflows. Faster CPU execution means agents wait less, RL systems generate more feedback per training step, and AI factories produce more tokens per dollar. As a standalone CPU platform, Vera also supports hyperscale cloud, enterprise, and HPC workloads and extends to storage infrastructure with NVIDIA Vera BlueField®-4 STX. Available as a dense, liquid-cooled NVIDIA Vera CPU rack or in standard dual- and single-socket configurations, Vera fits any data center.

Explore the Technological Breakthroughs

Built for the demands of reinforcement learning and agentic AI, NVIDIA Vera combines custom-designed Olympus cores, high-bandwidth LPDDR5X memory, and low-latency NVIDIA Scalable Coherency Fabric (SCF). With NVIDIA NVLink™-C2C connectivity, confidential computing, and full Arm® compatibility, Vera extends across accelerated systems and modern data center environments. Its monolithic compute architecture keeps software environments responsive and data moving efficiently, helping to maximize throughput, energy efficiency, and GPU utilization across AI, analytics, and HPC workloads.

NVIDIA Custom Olympus Cores

NVIDIA Vera features 88 custom Olympus cores built for the control-heavy, latency-sensitive work behind agentic AI and reinforcement learning. High single-thread performance helps software environments, tool calls, and evaluation loops complete faster, while NVIDIA Spatial Multithreading creates 176 threads with partitioned core resources for predictable throughput at scale.

LPDDR5X Memory Subsystem

NVIDIA Vera delivers up to 1.2 terabytes per second (TB/s) of LPDDR5X memory bandwidth, providing 2x the bandwidth at half the power of traditional CPU memory. This keeps thousands of parallel software environments responsive while supporting faster RL iterations, efficient KV-cache management, and data-intensive agentic workflows. With up to 1.5 TB of memory, Vera provides the capacity and efficiency for AI factories, analytics, and HPC workloads.
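
To put the headline figures in per-core terms, the peak numbers can be divided by the 88 cores and 176 threads quoted for the Olympus cores. This is a rough sketch that assumes decimal units (1 TB = 1,000 GB) and an even split across cores; real workloads will not share bandwidth or capacity evenly, and the variable names are illustrative.

```python
# Peak figures quoted on this page.
MEM_BANDWIDTH_TBPS = 1.2  # "up to 1.2 TB/s" of LPDDR5X bandwidth
MEM_CAPACITY_TB = 1.5     # "up to 1.5 TB of memory"
CORES = 88
THREADS = 176

GB_PER_TB = 1000  # assumption: decimal units

# Assumption: resources divided evenly, which is an idealization.
bw_per_core_gbps = MEM_BANDWIDTH_TBPS * GB_PER_TB / CORES
bw_per_thread_gbps = MEM_BANDWIDTH_TBPS * GB_PER_TB / THREADS
cap_per_core_gb = MEM_CAPACITY_TB * GB_PER_TB / CORES

print(f"peak bandwidth per core:   {bw_per_core_gbps:.1f} GB/s")
print(f"peak bandwidth per thread: {bw_per_thread_gbps:.1f} GB/s")
print(f"capacity per core:         {cap_per_core_gb:.1f} GB")
```

At peak, that works out to roughly 13.6 GB/s of bandwidth and 17 GB of capacity per core.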

Second-Generation NVIDIA SCF

NVIDIA Vera uses second-generation NVIDIA SCF to connect all 88 cores, cache, memory, input and output (IO), and NVLink-C2C across a single compute die. With 3.4 TB/s of bisection bandwidth and a unified cache architecture, SCF gives cores fast, consistent access to data even when the CPU is fully utilized. By avoiding cross-chiplet communication, Vera maintains predictable latency and throughput for agentic workloads, analytics, and AI factory infrastructure at scale.

NVIDIA NVLink-C2C

NVIDIA NVLink-C2C delivers up to 1.8 TB/s of coherent bandwidth between Vera CPUs and NVIDIA GPUs. When paired with NVIDIA Rubin GPUs, Vera creates a unified memory architecture that helps CPUs and GPUs work together on complex AI and HPC workloads, large datasets, and KV-cache offload. NVLink-C2C reduces data-transfer bottlenecks, simplifies optimization, supports secure isolation for sensitive data and code, and enables high-speed connectivity in dual-socket Vera CPU systems.
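
For a sense of scale on KV-cache offload, the quoted peak bandwidth gives an ideal lower bound on transfer time: payload size divided by bandwidth. The sketch below treats the full 1.8 TB/s as available to a single transfer, which real transfers will not achieve, and the payload sizes are hypothetical examples rather than figures from NVIDIA.

```python
NVLINK_C2C_TBPS = 1.8  # "up to 1.8 TB/s" quoted on this page


def min_transfer_ms(payload_gb, bandwidth_tbps=NVLINK_C2C_TBPS):
    """Ideal lower bound in milliseconds: payload size / peak bandwidth.

    Ignores latency, protocol overhead, and contention, so actual
    transfers take longer than this.
    """
    if payload_gb < 0 or bandwidth_tbps <= 0:
        raise ValueError("payload must be >= 0 and bandwidth must be > 0")
    seconds = payload_gb / (bandwidth_tbps * 1000.0)  # 1 TB = 1,000 GB
    return seconds * 1000.0


if __name__ == "__main__":
    # Hypothetical KV-cache payload sizes in GB.
    for payload_gb in (1, 18, 64):
        print(f"{payload_gb:>3} GB: at least {min_transfer_ms(payload_gb):.2f} ms")
```

An 18 GB payload, for instance, cannot move in less than 10 ms at the quoted peak.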

NVIDIA Vera Rubin NVL72

NVIDIA Vera Rubin NVL72 unifies leading-edge technologies from NVIDIA: 72 Rubin GPUs, 36 Vera CPUs, ConnectX®-9 SuperNICs, and BlueField-4 DPUs. It scales up intelligence in a rack-scale platform with the NVLink 6 switch and scales out with NVIDIA Quantum-X800 InfiniBand and Spectrum-X™ Ethernet to power the AI industrial revolution.
